Named after a toy elephant belonging to developer Doug Cutting’s son, Hadoop has proven over the past decade to be the little platform that could. From its humble beginnings as an open source search engine project created by Cutting and Mike Cafarella, Hadoop has evolved into a robust platform for Big Data storage and analysis. It manages the deluge of user data for giant social services like Facebook and Twitter, supports sophisticated medical and scientific research, and increasingly addresses the storage and predictive analytics demands of the enterprise. How did an open source project started by a moonlighting developer and a University of Washington grad student become ubiquitous in so many data-driven settings? In its new four-part series, GigaOm documents Hadoop’s history, its growth, and the promising future of the platform.
Though Hadoop’s roots wind back to Nutch, the 2002 project started by Cutting and Cafarella, three key factors kickstarted the Hadoop we know today. First, the project was heavily influenced by Google’s foundational Google File System and MapReduce papers. Second, Cutting joined Yahoo! in 2006, and the company isolated Nutch’s storage and data processing capabilities within a discrete package named Hadoop. The platform became crucial to the operations of the company’s data science team, if not its search engine. Third, Hadoop was embraced within the open source community and by developers at companies such as Google and Facebook, accelerating its update cycle and lending the platform additional credibility and battle-tested stability.
The platform has flourished since then, spawning a slew of startups and attracting considerable investment and development resources. Many members of the Yahoo! data science team that developed Hadoop in its early days have ended up at Greenplum. As companies offer turnkey solutions such as Pivotal HD, the enterprise is increasingly adopting the platform for its affordability, stability, and extensibility, with IDC predicting the Hadoop software market will be worth $813 million in 2016. In its series, GigaOm paints a picture of the robust Hadoop ecosystem, looks towards its future, and reflects on the critical moments in its evolution.
About the Author: Paul M. Davis