We are pleased to announce the general availability of Spring Cloud Data Flow 1.6. SCDF is a toolkit for building data integration and real-time data processing pipelines to support IoT and other data-intensive applications. You can download the Local, Cloud Foundry and Kubernetes releases from the Spring repository right now!
What’s new in 1.6?
Spring Cloud Data Flow 1.6 includes enhancements to the ecosystem (related projects), PCF Scheduler integration, Kubernetes 1.10 support, security for composed-tasks, continuous deployment improvements in Skipper, and richer user-experience improvements to the Dashboard. Let’s walk through each enhancement!
Ecosystem & Related Projects
While the framework is in a feature-complete state, we are continuing to enable task/batch-job orchestration use cases in Spring Cloud Data Flow. For example, to limit the number of concurrent task launches on every upstream event, we needed to query the task repository for the number of currently running tasks; the framework now exposes that count through a queryable API. A streaming pipeline in SCDF can use it to decide whether to launch more tasks, given the configured concurrent launch limit.
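The launch decision described above can be sketched in a few lines of Java. This is only an illustration under stated assumptions: `TaskLaunchGate`, its constructor arguments, and the `IntSupplier` standing in for the task repository query are hypothetical names, not part of the SCDF or Spring Cloud Task API.

```java
import java.util.function.IntSupplier;

/**
 * Hypothetical gate that decides whether one more task may be launched.
 * The supplier is an assumption: it stands in for whatever query returns
 * the number of currently running tasks from the task repository.
 */
class TaskLaunchGate {
    private final IntSupplier runningTaskCount;
    private final int maxConcurrentTasks;

    TaskLaunchGate(IntSupplier runningTaskCount, int maxConcurrentTasks) {
        if (runningTaskCount == null) {
            throw new IllegalArgumentException("runningTaskCount must not be null");
        }
        if (maxConcurrentTasks < 1) {
            throw new IllegalArgumentException("maxConcurrentTasks must be at least 1");
        }
        this.runningTaskCount = runningTaskCount;
        this.maxConcurrentTasks = maxConcurrentTasks;
    }

    /** Returns true while the running count is below the configured limit. */
    boolean canLaunch() {
        return runningTaskCount.getAsInt() < maxConcurrentTasks;
    }
}
```

A stream application could consult such a gate on each upstream event and skip or defer the launch when `canLaunch()` returns false. Note that a check-then-launch sequence like this is not atomic, so a real implementation would have to tolerate or guard against concurrent callers.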
The Fishtown release train (v2.1) is progressing towards a GA release. Notable improvements include enhancements to the Apache Kafka and Kafka Streams binders that add support for “state stores”, which can be configured as time-window, persistent session, or in-memory key-value stores.
A frequently discussed topic in the community is the option to pass multiple inputs (i.e., topics/topic-exchanges) to the message broker, so that the actual binding is done automatically by the broker itself. This is now supported in the RabbitMQ and Apache Kafka binder implementations.
We are also closely looking into the mechanics of consolidating the programming model and refining the binding components within Spring Cloud Stream and Spring Cloud Function. Stay tuned for more on this topic.
Release management is the core value proposition of Skipper, a tool that allows you to discover applications and manage their lifecycle on multiple cloud platforms. We have further optimized the internals of release tracking and lifecycle management based on recent community feedback.
To reliably handle highly concurrent deployment scenarios, we have introduced a new configurable option for the connection pool size, so that a meaningful value can be set for a given concurrency requirement.
A significant effort to bring all the out-of-the-box apps to Spring Boot 2.0 compatibility is now complete. A handful of new apps officially graduated to GA in this release train, too. See: gRPC-processor, image-recognition-processor, and object-detection-processor.
TensorFlow continues to evolve as the de facto toolkit for modern machine learning practices. Building upon the previous work done in this area, we are exploring new approaches to estimate real-time multi-person human poses. A deep-dive blog on this subject is coming soon.
Working on a @TensorFlow #openPose integration for @SpringCloud DataFlow. Beautiful mixture of deep learning, geometry, physics, linear algebra and pinch of graphs: https://t.co/v6Gps56aW0 (Christian Tzolov, @christzolov, June 19, 2018)
The demo is on the way ;) pic.twitter.com/FVddiU8Ing
Scheduling Batch Jobs in Pivotal Cloud Foundry
Though we notice and acknowledge the shift from batch to streaming architectures, we also continue to learn new requirements for batch processing. It is not going away anytime soon.
For instance, to address the scheduling requirements for batch use cases, Spring Cloud Scheduler and Spring Cloud Scheduler for Cloud Foundry have joined the Spring Cloud Data Flow ecosystem. The first iteration of this begins with the native PCF Scheduler integration in SCDF’s Cloud Foundry implementation.
The definition of a task/batch pipeline and the launching of the pipeline are two essential steps, and there is now a new addition to the workflow: a pipeline can be scheduled with a cron expression. The PCF Scheduler interacts with the staged task droplet in Cloud Foundry and launches the task when the cron expression matches the current time. See below a screencast from Glenn Renfro of the SCDF and Scheduler interaction from a developer perspective.
The PCF Scheduler integration is a brand new offering. The ability to schedule and unschedule tasks, and to track schedule history, is available in the Dashboard.
Dashboard users will notice richer UI/UX improvements across all facets. In particular, there’s a noticeable improvement in the fluidity of workflow interactions between the various tabs and in route management. The Jobs and Analytics tabs get a facelift to provide a consistent look and feel.
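To make the idea of a cron expression matching the current time concrete, here is a minimal Java sketch. It is not how PCF Scheduler is implemented, and the cron dialect it accepts is not specified in this post. `SimpleCronMatcher` is a hypothetical helper that assumes a five-field expression (minute, hour, day of month, month, day of week with Sunday as 0) and supports only `*`, `*/n`, and plain integers. As a further simplification, all five fields must match.

```java
import java.time.LocalDateTime;

/** Hypothetical, deliberately simplified cron matcher for illustration only. */
class SimpleCronMatcher {

    /**
     * Checks one cron field against a value. Supports "*", "*&#47;n" style steps
     * (written as * followed by /n) and plain integers; min is the lowest legal
     * value of the field, from which steps are counted.
     */
    static boolean fieldMatches(String field, int value, int min) {
        if (field.equals("*")) {
            return true;
        }
        if (field.startsWith("*/")) {
            int step = Integer.parseInt(field.substring(2));
            return step > 0 && (value - min) % step == 0;
        }
        return Integer.parseInt(field) == value;
    }

    /** Returns true when every field of the expression accepts the given time. */
    static boolean matches(String expression, LocalDateTime time) {
        String[] f = expression.trim().split("\\s+");
        if (f.length != 5) {
            throw new IllegalArgumentException("expected 5 fields: " + expression);
        }
        // java.time uses Monday=1 .. Sunday=7; map Sunday to 0 as assumed above.
        int dayOfWeek = time.getDayOfWeek().getValue() % 7;
        return fieldMatches(f[0], time.getMinute(), 0)
            && fieldMatches(f[1], time.getHour(), 0)
            && fieldMatches(f[2], time.getDayOfMonth(), 1)
            && fieldMatches(f[3], time.getMonthValue(), 1)
            && fieldMatches(f[4], dayOfWeek, 0);
    }
}
```

For example, under these assumptions `SimpleCronMatcher.matches("*/15 * * * *", LocalDateTime.of(2018, 6, 19, 12, 30))` evaluates to true, because minute 30 falls on a 15-minute step.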
The UI stack, including Spring Flo, is upgraded to adapt to the latest Angular 6 along with the upgrades to JointJS and other dependencies. We will continue to consume the dependent upstream UI releases as early and frequently as possible.
Composed Tasks and Security
Composed Tasks in SCDF continue to be a popular way to orchestrate a directed acyclic graph of tasks on cloud platforms. Based on feedback from the community, this release adds support for secured communication within a Composed Task graph.
App Repo Tool
We have noticed a growing trend of users provisioning SCDF in a no-internet infrastructure or an environment with firewalls and proxies. This is particularly noticeable in the finance domain. To address the challenges of SCDF reaching out to the internet to resolve Spring Boot apps from an external Maven Repository or Docker Registry, we have added a standalone App Repo Tool, which mimics the app-serving functionality. As an independent app repository, it can be provisioned to run alongside SCDF in a network-constrained environment. The apps can be negotiated, resolved and downloaded from the App Repo Tool at the time of stream or task deployment. The standalone tool can be used in Local, Cloud Foundry or Kubernetes.
Cloud Foundry and Kubernetes
The Spring team continues to focus on Cloud Foundry and Kubernetes as primary development and runtime environments (with FaaS quickly gaining momentum, too, but that’s the focus of a different blog post!). Version compatibility (e.g., PCF 2.2 and Kubernetes 1.10) and environment-specific feature parity (e.g., StatefulSets and API maturity between Kubernetes versions) are at the core of the product focus, and we are open to feedback and feature requests.
The Spring Cloud Data Flow for PCF tile is catching up with the open source releases, and it continues to add value through end-to-end operational and security automation. Likewise, the Helm chart is available to provision SCDF and the companion servers onto a Kubernetes cluster with a single command.
Join the Community!
About the Author
Sabby Anandan is a Product Manager in the Spring Team at Pivotal. He focuses on building products that address the challenges faced with iterative development and operationalization of data-intensive applications at scale. Before joining Pivotal, Sabby worked in engineering and management consulting positions. He holds a Bachelor's degree in Electrical and Electronics from the University of Madras and a Master's degree in Information Technology and Management from Carnegie Mellon University.