Deploy Pivotal’s Hadoop on Docker

March 25, 2015

Watch this video for a demonstration of Pivotal's Hadoop on Docker.

While Hadoop is becoming more and more mainstream, many development leaders want to speed up their development and deployment processes and reduce errors (i.e., DevOps) by using platforms like PaaS and lightweight runtime containers. One of the most interesting recent statistics in the DevOps arena is that companies with high-performing DevOps processes can ship code 30x more frequently and complete deployments 8,000 times faster. To this end, Docker is a new but rising lightweight virtualization solution (more precisely, a lightweight Linux isolation container). Docker lets you package and configure a runtime and deploy it on Linux machines: it is build-once-run-anywhere, isolated like a virtual machine, and runs faster and lighter than traditional VMs.

Today, I will show you how two components of the Pivotal Big Data Suite (our Hadoop distribution, Pivotal HD, and our SQL interface on Hadoop, HAWQ) can be quickly and easily set up to run on a developer laptop with Docker. With the Docker model, we can literally turn heavyweight app environments on and off like a light switch! The steps below typically take less than 30 minutes; a sketch of the corresponding commands follows at the end of this post:

1) Download and import the Docker images
2) Run the Docker containers
3) SSH into the environment and start Pivotal HD
4) Test Hadoop's HDFS and MapReduce
5) Start HAWQ (SQL on Hadoop)
6) Test HAWQ

To learn more about Pivotal Big Data Suite, visit http://pivotal.io/big-data/pivotal-big-data-suite
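For readers who want to follow along in a terminal, here is a minimal sketch of steps 1-4 and 6 as shell commands. The image, container, and table names (pivotalhd-hawq.tar, phd-master, test) are illustrative placeholders rather than the exact names used in the video, the path to the MapReduce examples jar may differ in your Pivotal HD image, and the Pivotal HD and HAWQ start commands are left as comments because they vary by release.

```bash
# 1) Import the downloaded Docker image into the local image cache
docker load -i pivotalhd-hawq.tar        # placeholder archive name

# 2) Run a container from the imported image
docker run -d --name phd-master --hostname phd-master pivotalhd-hawq

# 3) SSH (or docker exec) into the running container, then start the
#    Pivotal HD services as described in the video for your release
docker exec -it phd-master bash

# 4) Test HDFS and MapReduce from inside the container
hdfs dfs -mkdir -p /tmp/wordcount/input
hdfs dfs -put /etc/hosts /tmp/wordcount/input
hadoop jar /usr/lib/hadoop-mapreduce/hadoop-mapreduce-examples.jar \
    wordcount /tmp/wordcount/input /tmp/wordcount/output
hdfs dfs -cat /tmp/wordcount/output/part-r-00000

# 5) Start HAWQ using the start command for your HAWQ release (see the video)

# 6) Test HAWQ with a small SQL smoke test via psql
psql -c "CREATE TABLE test (i int);"
psql -c "INSERT INTO test SELECT generate_series(1, 100);"
psql -c "SELECT count(*) FROM test;"
```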
