Navigating the Apache Geode Sessions at SpringOne Platform 2017

November 13, 2017 Jagdish Mirani

At last year’s SpringOne Platform conference, I wanted to soak it all in, and I arrived excited, caffeinated, and ready to learn. I didn’t want to miss a thing.

Soon thereafter, I was totally overwhelmed by the number of sessions. So I appreciated the guides and tracks that conference organizers put together to help those interested in a specific topic. In that spirit, we’ve grouped all the Geode sessions in one place, giving Geode enthusiasts a great starting point for building their conference schedules.

If one of your goals is to learn more about Apache Geode, then you’re in luck. We’ve got plenty of Geode content, delivered by practitioners and subject matter experts. We’ve deliberately scheduled the Geode sessions so they don’t overlap and are easily searchable via a Geode tag. We’ve also created a page on the site specifically dedicated to Geode. Hopefully we’ve made it easier for you to stay focused, and leave the conference more accomplished and filled with inspiration.

Please note that registration begins at 1PM, on Monday, Dec 4th. The Geode program will start at 1:30PM with opening comments, followed by the sessions below. The Monday registration is all you need for the entire conference - the main conference starts the next day, Dec 5th, but you will not need to register again.

What follows is my quick take on each Geode session, which will hopefully help you put together your schedule. So, in chronological order, here we go...

What: High Performance Cloud Native APIs Using Apache Geode

When: Monday, Dec 4th at 2:00 - 2:30PM PT

In the API economy, the concept of an API contract takes on heightened importance. Both the functional requirements (the behaviors and results, i.e. the inputs to and outputs from the API) and the non-functional requirements (performance and availability) are important. Apache Geode covers both sets of requirements, with a rich set of APIs for accessing data, as well as horizontal scalability and high availability to cover the non-functional (system) requirements.

This session covers more of the ‘how’ with respect to building apps using Geode’s APIs in a test driven development environment.

What: Enable SQL/JDBC Access to Apache Geode/GemFire Using Apache Calcite

When: Monday, Dec 4th at 2:30 - 3:00PM PT

The rise of NoSQL databases has taken up much of our mindshare in recent years, but this is occurring alongside a steady, albeit more modest, growth in relational databases. Transactional workloads operate against structured data, and this is a sweet spot for relational databases. The lingua franca for this style of processing is SQL and enterprises have spent 30-plus years working with SQL.

So, what if you want SQL access to data in Geode? This session brings forth another open source project, Apache Calcite, which allows you to layer a SQL veneer on top of NoSQL data stores like Geode. You’ve heard of mapping relational data to classes and objects in programming languages (ORMs like Hibernate). Think of this as the reverse, i.e. mapping data stored as objects to a relational view that can be accessed via SQL.
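To make the "reverse ORM" idea concrete, here is a minimal sketch in plain Java. It does not use Apache Calcite or Geode APIs; the Customer class, the column names, and the select helper are all hypothetical, chosen only to show what it means to flatten stored objects into rows and then filter and project them the way a SQL query would.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

// Hypothetical domain object, standing in for a value stored in a data grid.
class Customer {
    final long id;
    final String name;
    final String city;

    Customer(long id, String name, String city) {
        this.id = id;
        this.name = name;
        this.city = city;
    }
}

class RelationalViewSketch {
    // Flatten one object into a row: column name -> column value.
    static Map<String, Object> toRow(Customer c) {
        Map<String, Object> row = new LinkedHashMap<>();
        row.put("ID", c.id);
        row.put("NAME", c.name);
        row.put("CITY", c.city);
        return row;
    }

    // Roughly what "SELECT <column> FROM customers WHERE <filter>" does:
    // view each object as a row, keep matching rows, project one column.
    static List<Object> select(List<Customer> customers, String column,
                               Predicate<Map<String, Object>> where) {
        List<Object> result = new ArrayList<>();
        for (Customer c : customers) {
            Map<String, Object> row = toRow(c);
            if (where.test(row)) {
                result.add(row.get(column));
            }
        }
        return result;
    }
}
```

Calling RelationalViewSketch.select(customers, "NAME", row -> "London".equals(row.get("CITY"))) plays the role of SELECT NAME FROM customers WHERE CITY = 'London'. A real SQL layer would add parsing, planning, and pushing filters down to the store, none of which this sketch attempts.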

What: Apache Geode: How Pymma Uses it as an Efficient Alternative to Kafka-Storm-Spark

When: Monday, Dec 4th at 3:30 - 4:00PM PT

Less is more when it comes to the number and types of technologies needed for a solution. I’m a huge fan of simplicity, and for me this is the developer’s equivalent of Occam's razor (plurality should not be posited without necessity).

Developer productivity declines when a technical architecture calls for the installation and knowledge of several products, and the need to create and maintain multiple integration points while keeping track of product release compatibilities. The introduction of several different technologies also increases costs and the possibility of bugs and performance issues.

Pymma built an event-based system for monitoring OpenESB configurations with Geode as the single technology for capturing and buffering events, analyzing events, and storing the results of this analysis where monitoring tools can query them. Without Geode, this would have taken the cost and complexity of three products: Kafka, Storm, and Spark. Given the headaches with stitching together multiple products, it’s easy to see how Pymma’s ‘Japanese gardened’ solution has appeal.

What: Real-Time Analytics for Data-Driven Applications

When: Monday, Dec 4th at 4:00 - 4:30PM PT

At the simplest level, modern applications must manage the flow of data through three stages: (1) data is ingested and stored in preparation for processing; (2) the data is processed and analyzed; (3) the output of this analysis is made available to applications in near-real-time. As simple as this sounds, we’ve become accustomed to using multiple technologies for various styles of processing along this chain. For example, it’s commonplace to separate analytics from real-time operational access to data.
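As a rough illustration of those three stages living in one place, here is a small plain-Java sketch. The class and method names are hypothetical and it uses no Geode API; an in-memory queue and map stand in for the data store, and the "analysis" is just a running count per event type.

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.Map;
import java.util.Queue;

class PipelineSketch {
    // Stage 1: ingested events wait here until they are processed.
    private final Queue<String> ingested = new ArrayDeque<>();
    // Stage 3: analysis results that applications can read.
    private final Map<String, Integer> countsByType = new HashMap<>();

    // Stage 1: accept an event (identified here only by its type).
    void ingest(String eventType) {
        ingested.add(eventType);
    }

    // Stage 2: drain the buffer and update a running count per event type.
    void analyze() {
        String eventType;
        while ((eventType = ingested.poll()) != null) {
            countsByType.merge(eventType, 1, Integer::sum);
        }
    }

    // Stage 3: serve the latest result; unseen event types count as zero.
    int countFor(String eventType) {
        return countsByType.getOrDefault(eventType, 0);
    }
}
```

Because all three stages share the same process and memory, a result is readable as soon as analyze() returns. This sketch is single-threaded; a real system would also need concurrency control, distribution, and durability.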

Generally speaking, we know that this type of separation of workloads is often, but not always, necessary. In addition to complicating your IT landscape, this type of separation introduces data latency, as data travels across various technologies and integration points. Separation provides depth for each type of workload, but maybe what you need is speed across the information delivery chain.

Ampool used Apache Geode to build its Active Data Store (ADS) - a closed-loop system that ingests data, analyzes it, and presents the insights back to applications where appropriate decisions and actions can be taken. The ability to drive decisions in a closed loop, all from a single in-memory data store, is the key to how ADS delivers value.

What: Exploring Data-Driven, Cognitive Capabilities in Pivotal Cloud Foundry

When: Monday, Dec 4th at 4:30 - 5:00PM PT

“Self-driving” is not just a rallying cry for cars; it’s what we want from everything. We don’t want anything - software, products, machines - to lose value because there is a burden related to driving, operating, or administering it. Fortunately, in addition to adding depth to features, good automation makes features available with as little operational overhead as possible.

Analytics can be a boon in our move towards self-driving happiness. If a system can anticipate certain outcomes and make decisions accordingly based on analysis of operational data, then you’ve got some smarts behind the driver.

For Pivotal Cloud Foundry (PCF), the mother lode of operational data and metrics can be found in the Loggregator Firehose - the stream of logs from all apps combined with the metrics data from PCF components. Having access to this data opens up the possibility of applying proactive, cognitive-like capabilities to operations on PCF. The ability to analyze log data is also useful to developers, who look for patterns related to debugging, performance optimizations, etc. This session presents a reference architecture for accomplishing this.

What: Caching for Microservices - Introduction to Pivotal Cloud Cache

When: Tuesday, Dec 5th, at 3:20PM PT

The shift to microservices-based architectures is one of the drivers behind a growing interest in caching. Microservices are typically distributed across a network, and as such, they are subject to network delays. Enter caching - you can cache data locally so that it can be accessed fast when it’s needed.

Separate from any issues related to the network, data-intensive microservices can also become a performance bottleneck for the application because they are the clearinghouse for access to their data. These microservices can cache data to deliver fast performance. Caching also adds a layer of resilience to the system: if a backing store is unavailable, the microservice can still serve data from its cache.
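Here is a minimal sketch of that pattern in plain Java, often called cache-aside. It is not the Pivotal Cloud Cache or Geode API; the class name is hypothetical, a ConcurrentHashMap stands in for the cache, and the backing store is passed in as a function that may throw when the store is down.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

class CacheAsideSketch<K, V> {
    // Local cache; a stand-in for a real caching service.
    private final Map<K, V> cache = new ConcurrentHashMap<>();
    // Slow or remote lookup; may throw if the backing store is unavailable.
    private final Function<K, V> backingStore;

    CacheAsideSketch(Function<K, V> backingStore) {
        this.backingStore = backingStore;
    }

    // Serve from the cache when possible; otherwise load and remember the value.
    V get(K key) {
        V cached = cache.get(key);
        if (cached != null) {
            return cached; // fast path: the backing store is not touched
        }
        V loaded = backingStore.apply(key);
        if (loaded != null) {
            cache.put(key, loaded);
        }
        return loaded;
    }
}
```

Once a key has been read, later reads never reach the backing store, which is where both the speed and the resilience come from: cached keys keep working during an outage, while uncached keys still fail. The sketch leaves out expiry, eviction, and keeping the cache consistent with writes, which are the hard parts a caching service takes on.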

How data is handled in microservices architectures is a hot topic. In this session, Pulkit Chandra explains how Pivotal Cloud Cache, a caching service on PCF, tackles some of these challenges.

What: Apache Geode Test Automation and Continuous Integration & Deployment (CI-CD)

When: Tuesday, Dec 5th, at 5:00PM PT

Data stores can be a challenge for continuous delivery. With every new deployment of an application, the database has to be migrated before deployment. Existing data has to be adapted to the new structure, which can be very difficult if the data volume is large. If the data model is based on a tight schema, then schema changes need to be backwards compatible.

Changes to the data layer are also difficult to test, because you would need a data layer similar to the production data layer.

In this session, HCSC discusses how it successfully adopted DevOps-oriented practices against a distributed system, Apache Geode. The outcome is a no-downtime CI-CD pipeline that greatly improves the Apache Geode DevOps experience.

What: Scaling Spring Boot Applications in Real-Time

When: Wednesday, Dec 6th, at 2:00PM PT

Data access is often cited as one of the most challenging aspects of modern application architectures. Fortunately, the Spring framework has good support for building Spring Boot applications with the Spring Data abstraction. In true Spring fashion, the Spring Data abstraction significantly reduces the amount of boilerplate code required to implement data access layers for various persistence stores. With Spring Data, developers can avoid writing data queries by hand in the SQL dialect of the data store, which is not only cumbersome, but can also lock you into the data store you’ve chosen. Instead, you can declaratively create data access paths that do the same thing.

Each data store has its specific implementation of the Spring Data abstraction. This session will focus on Spring Data Geode. If you’re interested in the differences between Spring Data Geode and Spring Data GemFire, check out the recent post from John Blum, the lead Spring Data Geode developer at Pivotal and the presenter of this session. In the session, John will go through the full cycle from prototyping an app in your IDE to pushing the app to PCF.

What: Cloud-Native Data: What is it? Will it Solve the Data-DevOps Divide?

When: Wednesday, Dec 6th, at 3:20PM PT

The body of knowledge around best practices for cloud-native architectures has grown steadily. We now have methodologies and principles, like twelve-factor apps and others, that provide clear guidance on how to build applications that are suitable for modern cloud platforms. On the other hand, best practices for provisioning data for cloud-native applications have not been synthesized and articulated to the same degree. This is still an evolving and ongoing discussion.

This session brings together a panel of users, technologists, and industry observers to advance the discussion around cloud-native data. We may not arrive at a cloud-native data manifesto at the end of this discussion, but we’re sure to get some valuable insights from these different perspectives.

What: Spring Driven Industrial IoT Utilizing Edge, Fog, and Cloud Computing

When: Wednesday, Dec 6th, at 4:20PM PT

The Internet of Things covers a lot of ground. While the lay person’s perceptions around the value of IoT are being shaped by consumer-oriented devices like wearables and smart home gadgets, the value of IoT takes center stage with industrial IoT, including IoT-connected industrial robots and IoT logistics systems.

This session takes an industrial Spring IoT use case and shows how various deployment models can be used in combination. For instance, the cloud model is not an ideal fit for time-critical, real-time operations because of network latency or a lack of internet connectivity. If you want parts of your system to have quick reflexes, you should consider Fog or Edge computing, a decentralized architectural pattern that brings computing resources and application resources closer to the edge. This is the most logical and efficient spot in the continuum between the data source and the cloud for real-time operations and analytics. This is where Apache Geode fits in. Geode’s high performance and multi-site WAN capabilities make it a natural fit for distributed processing across a broad geography.

During this presentation, you will see a demonstration, including live manufacturing equipment, of an end-to-end feedback workflow of Industrial Spring IoT.

What: Simplifying Apache Geode with Spring Data

When: Wednesday, Dec 6th, at 4:20PM PT

The annotation-based configuration model in Spring provides a great alternative to XML-based configuration, eliminating the need to load XML-based configuration files to host a Spring web application. Spring makes extensive use of annotations and builds on Spring’s cache abstraction and Spring Data repositories, all of which help you build working Apache Geode client/server applications in minutes.

The presenter, John Blum, is the lead Spring Data developer for Geode at Pivotal, and he’s going to preview the product roadmap and what users can expect next.

What: RDBMS and Apache Geode Data Movement: Low Latency ETL Pipeline By Using Cloud-Native Event Driven Microservices

When: Thursday, Dec 7th, at 11:50AM PT

In this session, Paul Warren and Heather Riddle from HCSC present a legacy systems modernization use case that uses event-driven data streams (Spring Cloud Stream) and a messaging backbone (RabbitMQ/Kafka) to synchronize a legacy database with Pivotal GemFire. New applications are then built on top of GemFire. This gives HCSC a non-disruptive way of modernizing legacy infrastructure.

I first wrote about this use case when I heard about it at the Cloud Foundry Summit. There are so many interesting aspects to this use case. This time, the presentation will go even deeper to reveal more best practices and outcomes that can be expected.

As you can see, there is no shortage of great Geode content at SpringOne Platform. Register today and we’ll see you in San Francisco, December 4-7.


About the Author

Jagdish Mirani

Jagdish Mirani is an enterprise software executive with extensive experience in Product Management and Product Marketing. Currently he is the Principal Product Marketing Manager for Pivotal’s in-memory data grid product called GemFire. Prior to Pivotal, Jagdish spent 10 years at Oracle in their Data Warehousing and Business Intelligence groups. More recently, Jag was at AgilOne, a startup in the predictive marketing cloud space. Prior to AgilOne, Jag was at Business Objects (now part of SAP), Actuate (now part of OpenText), and NetSuite (now part of Oracle). Jag holds a B.S. in Electrical Engineering and Computer Science from Santa Clara University and an MBA from the U.C. Berkeley Haas School of Business.
