This Month in Data Science

February 28, 2014 Paul M. Davis

February was a big month for data science, demonstrating the breadth of problems the discipline addresses, as well as its increasingly important role in diverse settings including the enterprise, academic research, government, and more. Here are our picks for the top data science news of the month, from both Pivotal and the entire field.

Top Data Science News in February 2014

Big data: What to trust – data science or the boss’s sixth sense?

For all the talk about building a data-driven enterprise, a majority of businesses still place a premium on the gut instincts of the people at the top. In this ZDNet article, Toby Wolpe examines why some executives and businesses resist data science, and how long-held yet false beliefs can become institutionalized and fail a company over time.

Kicking Off A Data Science Project: 4 To-Dos

For businesses that are more eager to embrace a data-driven approach, veteran data scientist Carla Gentry uses this InformationWeek article to share key insights she learned from making mistakes. She advises companies to differentiate data science from big data, bridge the communication gap between practitioners and decision makers, familiarize themselves with the available technologies, and enforce clean data standards.

Data Made Beautiful: Weather, Climate and Fracking Water

The art of data visualization is growing both broader in scope and more expressive. Bloomberg offers a roundup of some of the most beautiful and impactful weather- and environment-related visualizations to emerge so far this year. They include a representation of wind and water flows in recent years, the relationship between U.S. fracking wells and water resources, and NASA’s comprehensive visualization of six decades of climate change.

Analytics Power Welfare Change in New Zealand

Social welfare reform is a hotly debated topic in countries across the globe, but New Zealand has taken an analytics-led approach to optimizing the effectiveness of its welfare programs since 2007. Paula Bennett, the country’s minister of social development, explained at a recent conference how the country has used analytics to quantify the lifetime cost of welfare, increase government transparency, and apply predictive risk modeling to identify potentially “at risk” individuals and offer them proactive and preventative support.

Google Offers Free “Making Sense of Data” Online Course

Data literacy is becoming increasingly important not only for experts but also for professionals from a wide swath of disciplines: executives, marketers, journalists, non-profit workers, and many more. For the data-curious who never went further than Statistics 101, Google is offering a free MOOC, “Making Sense of Data,” which will introduce participants to the basics of data analytics, familiarize them with tools such as Fusion Tables, and show them how to find patterns and relationships.

How To Choose The Right Test Options When Evaluating Machine Learning Algorithms

More advanced students may get a lot out of Jason Brownlee’s extensive library of machine learning tutorials and resources. In a recent post, Brownlee offers a primer on the various types of test options to consider when choosing machine learning algorithms, including training and testing on the same dataset, split tests, cross-validation, and statistical significance.
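For readers who want to see how three of those test options differ in practice, here is a minimal Python sketch. It is not taken from Brownlee’s post: the toy dataset, the one-feature threshold classifier, and every function name below are hypothetical and chosen only for illustration, using nothing beyond the standard library. Statistical significance testing is left out.

```python
import random
from statistics import mean, stdev


def make_data(n=200, seed=0):
    """Hypothetical toy dataset: one feature x, label 1 when x > 0.5, 10% label noise."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x = rng.random()
        label = int(x > 0.5)
        if rng.random() < 0.1:  # flip roughly one label in ten
            label = 1 - label
        data.append((x, label))
    return data


def accuracy(threshold, rows):
    """Fraction of rows where the rule 'predict 1 if x > threshold' matches the label."""
    return sum(int((x > threshold) == bool(y)) for x, y in rows) / len(rows)


def fit_threshold(train):
    """Illustrative 'model': the cut-off on x that scores best on the training rows."""
    candidates = sorted(x for x, _ in train)
    return max(candidates, key=lambda t: accuracy(t, train))


def same_data_score(data):
    """Option 1: train and test on the same rows (tends to look optimistic)."""
    return accuracy(fit_threshold(data), data)


def split_score(data, train_fraction=0.67, seed=1):
    """Option 2: shuffle once, train on one part, test on the held-out remainder."""
    rows = data[:]
    random.Random(seed).shuffle(rows)
    cut = int(len(rows) * train_fraction)
    train, test = rows[:cut], rows[cut:]
    return accuracy(fit_threshold(train), test)


def cross_validation_scores(data, k=5, seed=1):
    """Option 3: k-fold cross-validation, one held-out score per fold."""
    rows = data[:]
    random.Random(seed).shuffle(rows)
    folds = [rows[i::k] for i in range(k)]
    scores = []
    for i, held_out in enumerate(folds):
        train = [row for j, fold in enumerate(folds) if j != i for row in fold]
        scores.append(accuracy(fit_threshold(train), held_out))
    return scores


if __name__ == "__main__":
    data = make_data()
    print("same data:       ", same_data_score(data))
    print("67/33 split:     ", split_score(data))
    cv = cross_validation_scores(data)
    print("5-fold CV mean:  ", mean(cv))
    print("5-fold CV stdev: ", stdev(cv))
```

The split and cross-validation scores come from rows the model never saw during fitting, which is why they are usually treated as more trustworthy than the same-data score. The spread across folds gives a rough sense of how much a single split result might vary.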

This Month in Pivotal Data Science

Creating the Digital Brain

Oil spills from mining accidents can cost tens of billions per incident. Such economic and environmental disasters could be avoided through smart systems that automate the detection of anomalies and trigger reactions that could avoid future spills. The Pivotal Data Science team revealed the progress they’ve made toward building a digital brain to control drilling rigs for the oil and gas industry, and showed how it could be applied to other industries as well.

Data Driving The Future of Cars: Data Science Innovations in the Automotive Industry

The connected car is a poster child for the “Internet of Things” and rich with innovations that come from applying data science to big data. Pivotal Data Labs has been doing extensive work in the sector and shares some of the recent lessons from auto manufacturers that can also be applied to other industries.

17 New Big Data Things To Talk About Since Strata 2013

Strata is all about the future of big data and data science, the very problems we formed Pivotal to solve, which is why Pivotal was an Elite sponsor for Strata 2013. Since the last Strata conference, we have been very busy spinning out, innovating, and solving big data and data science challenges around the globe. For a taste of what we’ve been up to, check out the full post.

Upcoming Data Science Events

Big Data Analytics: Scalable machine learning using open-source tools

March 4, 2014, San Francisco, CA

With the explosion of big data, the need for fast and inexpensive analytics solutions has become a key basis of competition in many industries. Extracting the value of big data with analytics can be complex and requires advanced skills. During this talk at Pivotal Labs’ San Francisco office, Senior Developer Rahul Iyer will review numerous open source solutions, including MADlib, PivotalR, and PyMADlib.

Truth Will Set You Free but Data Will Piss You Off

March 8, 2014, Austin, TX

This SXSW Interactive panel will examine the political, social, and economic biases that can skew the collection and analysis of data, and the ethical use of data visualization as a communication tool. The panel features Jake Porway, founder of DataKind, which has partnered with Pivotal’s Data Science Labs team for the Pivotal for Good program.

Security and Privacy Considerations for the Big Data Lake with Robert Geiger

March 18, 2014, San Francisco, CA

Pivotal’s Robert Geiger details the considerations and challenges around preserving security and privacy within the data lake during this talk at the Pivotal Labs San Francisco office. Geiger will review the issues and some of the technologies being developed within the community and by vendors to secure and manage the data in the emerging data lake.

GigaOm Structure Data 2014

March 19–20, 2014, New York, NY

The world’s biggest and most innovative companies are using data to make better products, build bigger profits and even change the world. Join 900+ big data practitioners, technologists and executives as they examine how big data can drive business success. From grand new uses to the nuts and bolts of capturing, storing, analyzing and serving it, get the bottom line on big data now.


