Developing Real-Time Data Pipelines with Apache Kafka

September 25, 2015
Recorded at SpringOne2GX 2015
Presenter: Joe Stein
Big Data Track

Developing Real-Time Data Pipelines with Apache Kafka (http://kafka.apache.org/) is an introduction for developers to why and how to use Apache Kafka. Apache Kafka is a publish-subscribe messaging system rethought as a distributed commit log. Kafka is designed to allow a single cluster to serve as the central data backbone. A single Kafka broker can handle hundreds of megabytes of reads and writes per second from thousands of clients, and a cluster can be expanded elastically and transparently without downtime.

Data streams are partitioned and spread over a cluster of machines. This allows a stream to be larger than any single machine could handle, and it allows clusters of coordinated consumers to share the work of reading it. Messages are persisted on disk and replicated within the cluster to prevent data loss. Each broker can handle terabytes of messages.

For the Spring user, Spring Integration Kafka and Spring XD provide integration with Apache Kafka.
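The partitioning idea in the abstract can be illustrated with a small in-memory sketch. This is not Kafka's client API or its actual partitioner: the names ToyTopic, partition_for, and assign_partitions are hypothetical, CRC32 stands in for whatever hash a real client uses, and the round-robin assignment is only one simple way a group of coordinated consumers might divide partitions among themselves.

```python
import zlib


def partition_for(key: bytes, num_partitions: int) -> int:
    """Map a message key to a partition using a stable hash.

    Assumption: CRC32 is used here only because it is in the standard
    library; it is not claimed to be the hash a real Kafka client uses.
    """
    return zlib.crc32(key) % num_partitions


class ToyTopic:
    """A topic modelled as a list of append-only logs, one per partition."""

    def __init__(self, num_partitions: int) -> None:
        self.partitions = [[] for _ in range(num_partitions)]

    def append(self, key: bytes, value: str) -> tuple:
        """Append a message and return (partition, offset)."""
        p = partition_for(key, len(self.partitions))
        log = self.partitions[p]
        log.append((key, value))
        # The offset is the message's position within its partition's log.
        return p, len(log) - 1


def assign_partitions(num_partitions: int, consumers: list) -> dict:
    """Spread partitions over consumers round-robin (illustrative only)."""
    assignment = {c: [] for c in consumers}
    for p in range(num_partitions):
        assignment[consumers[p % len(consumers)]].append(p)
    return assignment


if __name__ == "__main__":
    topic = ToyTopic(num_partitions=4)
    for i in range(3):
        print(topic.append(b"user-1", f"event-{i}"))
    print(assign_partitions(4, ["consumer-a", "consumer-b"]))
```

Because the partition is derived from the key, all messages with the same key land in the same log and keep their relative order, while different keys spread across partitions. Each partition is owned by exactly one consumer in the group, which is what lets several consumers read one stream in parallel without duplicating work.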