Open Source Analytics Library MADlib Receives Site Relaunch, v1.4 Release

December 23, 2013 Paul M. Davis

MADlib, the open source analytics library shepherded by Pivotal data scientists and UC Berkeley researchers, gets a fresh coat of paint with a major relaunch of the project’s website. MADlib lets practitioners perform Big Data analytics within SQL databases, offering a scalable library of algorithms that provide considerable speed and cost benefits to organizations.

MADlib is the result of conversations that began in 2009 between industry experts and academic researchers to develop new approaches to scalable, sophisticated in-database analytics. The “MAD” in MADlib refers to the library’s “magnetic”, “agile”, and “deep” environment for analysis, well-suited to a wide range of Big Data use cases across various industries. Supporting Postgres, Pivotal Greenplum Database, and Pivotal HAWQ, the open source library receives ongoing development by Pivotal as well as researchers at UC Berkeley, Stanford University, and University of Florida.

The key principles driving MADlib development are:

Operate on the data locally, in the database. Do not move it between multiple runtime environments unnecessarily.
Utilize best-of-breed database engines, but separate the machine learning logic from database-specific implementation details.
Leverage MPP shared-nothing technology, such as the Pivotal Greenplum Database, to provide parallelism and scalability.
Maintain an open implementation with active ties to ongoing academic research.
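
The first principle, keeping the computation where the data lives, can be illustrated with a minimal sketch. It is not MADlib's API: it assumes SQLite (from Python's standard library) as a stand-in database, and the table `sales`, its columns, and the helper `fit_line_in_database` are hypothetical names chosen for illustration. The database computes a handful of aggregates, and only those five numbers leave it to produce a simple linear regression fit.

```python
import sqlite3


def fit_line_in_database(conn, table, x_col, y_col):
    """Fit y = intercept + slope * x using only SQL aggregates.

    Hypothetical helper, not part of MADlib. Table and column names are
    interpolated directly into the query, so they must be trusted identifiers.
    """
    query = (
        f"SELECT COUNT(*), SUM({x_col}), SUM({y_col}), "
        f"SUM({x_col} * {y_col}), SUM({x_col} * {x_col}) "
        f"FROM {table}"
    )
    # The rows never leave the database; only five aggregate values do.
    n, sx, sy, sxy, sxx = conn.execute(query).fetchone()
    # Closed-form least-squares solution for a single predictor.
    slope = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    intercept = (sy - slope * sx) / n
    return slope, intercept


# Stand-in data: an in-memory SQLite table where revenue = 2 * ad_spend + 3.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (ad_spend REAL, revenue REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [(1.0, 5.0), (2.0, 7.0), (3.0, 9.0), (4.0, 11.0)],
)

slope, intercept = fit_line_in_database(conn, "sales", "ad_spend", "revenue")
print(slope, intercept)
```

Because sums like these can be computed independently on each data partition and then combined, the same pattern fits the shared-nothing parallelism described in the third principle.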

The library’s sophisticated algorithms meet many of the demands of a data-driven enterprise. In concert with Pivotal HD and HAWQ, MADlib offers “Deep Scalable Analytics,” with “data-parallel implementations of mathematical, statistical, and machine-learning methods for structured and unstructured data.” Its features include classification, regression, clustering, topic modeling, association rule mining, descriptive statistics, and validation.

Use cases for MADlib’s robust library of algorithms are varied and apply to the data science needs of many industries, including retail, advertising and public relations, financial services, media and telecommunications, manufacturing, energy, government, healthcare, and life sciences.

The speed and flexibility offered by MADlib, working in concert with HAWQ, are borne out by a recent case study by Adam Bloom, which demonstrated that this dynamic duo was able to “improve the speed of analysis by over 318x and reduce analytic queries from 24 days to 6 minutes” for one of Pivotal’s retail/e-commerce customers.

To learn more about MADlib and to download distributions of the library for Pivotal Greenplum Database, Linux, or OS X, visit the new MADlib site.
