This Month In Data Science: October 2015

October 30, 2015 Paul M. Davis

The promise and risks of big data analysis received plenty of attention in recent weeks. While data science’s ability to optimize performance and facilitate more effective operations was seen in the oil and gas, shipping, and even restaurant industries, the risk that “algorithmic bias” will enforce existing social inequities received serious scrutiny.

Here’s our roundup of the biggest data science news of the month, both from Pivotal and beyond.

Oil Companies Use Big Data As Oil Prices Plummet

As oil prices continue to fall, from over $100 a barrel a year ago to $38 in September, companies are increasingly adopting big data tools to manage their operations more cost-effectively and efficiently.

Why Big Data Is A ‘How’ At UPS, Not A ‘What’

UPS was an early adopter of data-driven approaches to optimizing its delivery chain, and that influence has since filtered into the company’s DNA. This feature at Datanami takes a look at how the company is shifting from “descriptive” to predictive analytics, and the potential efficiencies and improvements UPS may reap as a result.

How Big Data Is Changing Retail And Restaurant Businesses

Independent retailers and restaurateurs have not traditionally been quick to adopt new technologies, if the continued poor quality of restaurant websites is any indication. But a new generation of tech-savvy retailers and restaurant owners is quickly adopting data science techniques to optimize their businesses and increase customer satisfaction.

When Big Data Becomes Bad Data

In this report for Truth-Out, Lauren Kirchner explores the problematic presumptions and unintended consequences of data-driven decision making, and the risk that “algorithmic bias” will continue to enforce, or even bolster, existing race, gender, and class inequities.

Stanford’s New Raw Data Podcast Analyzes Consequences Of Big Data, Cyber-Technologies

Stanford launched a new podcast this month that aims to investigate and discuss some of the issues of representation and bias that Kirchner wrote about in her feature. The biweekly podcast, produced by Worldview Stanford, will “examine how big data and cyber technologies are changing the relationships between people, technology and social institutions.”

How Big Data Will Help Solve Global Food Problems

Phys.org profiles the startup Agrimetrics, which aims to provide big data services across all aspects of the food chain, so that producers, processors and retailers can better produce and distribute safe and affordable food on a worldwide scale.

Hot Career: The Number Of Data Scientists Has Doubled Over The Last 4 Years

The explosive growth of job opportunities for data scientists will come as no surprise to those already working in big data-related industries, but the extent of that growth has now been confirmed by a study of LinkedIn data, which reveals that the number of employed data scientists has doubled in the past four years. Also fascinating are the numbers of data scientists employed by respective companies and the rate of hiring, with Microsoft and Facebook appearing as standouts.

This Month In Pivotal Data Science

Sequential Pattern Mining Approach For Watering Hole Attack Detection

As malware techniques continue to evolve, it becomes increasingly challenging to detect network security threats, especially Advanced Persistent Threats (APTs) that are orchestrated by sophisticated adversaries. An increasingly common strategy adopted by APT actors to carry out targeted attacks is the watering hole technique. Watering hole attacks target a group of users in an organization by infecting the websites that these users visit most often. In this blog post, Anirudh Kondaveeti and Jin Yu discuss the application of sequential pattern mining to detect coordinated network attacks such as watering hole attacks.
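To make the idea of sequential pattern mining concrete, here is a minimal Python sketch. It is not the method from the post by Kondaveeti and Yu, only a toy illustration of the general technique: it counts how many users' browsing sessions contain the same ordered pair of hosts and keeps the pairs whose support meets a threshold. The function name `frequent_sequences`, the hostnames, and the threshold are all hypothetical.

```python
from collections import Counter
from itertools import combinations


def frequent_sequences(sessions, length=2, min_support=2):
    """Return ordered subsequences of `length` hosts that occur in at
    least `min_support` sessions (a toy form of sequential pattern mining)."""
    support = Counter()
    for session in sessions:
        # combinations() preserves the visit order, so each tuple is an
        # ordered (not necessarily contiguous) subsequence of the session.
        # The set ensures a pattern is counted once per session.
        support.update(set(combinations(session, length)))
    return {seq: n for seq, n in support.items() if n >= min_support}


# Hypothetical per-user sequences of visited hosts (illustrative data only).
sessions = [
    ["news.example", "forum.example", "cdn.badhost.example"],
    ["mail.example", "forum.example", "cdn.badhost.example"],
    ["forum.example", "wiki.example", "cdn.badhost.example"],
    ["news.example", "wiki.example"],
]

patterns = frequent_sequences(sessions, length=2, min_support=3)
print(patterns)
```

In this toy data, three different users visit the same forum and later reach the same unfamiliar host, which is the kind of shared sequence an analyst might want to review. A real detector would work on far larger logs and would need a proper mining algorithm and careful filtering of benign common patterns.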

2 Reasons Why In-Memory Data Grids Are A Must-Have For Apps At Scale

Pivotal is proud to announce that Pivotal GemFire was cited as a leader in the newly published The Forrester Wave™: In-Memory Data Grids, Q3 2015 report from Forrester Research. While we are proud to report that GemFire was cited among the second-highest in the strategy category, this post also explores a strong point that Forrester underscores in the report: “AD&D pros should not make the mistake of turning to IMDGs only when performance at scale becomes an issue. It will become an issue sooner or later.” A free download of the report is included in this post.

Multivariate Time Series Forecasting For Virtual Machine Capacity Planning

In this post, we continue our blog series on multivariate time series by applying this modeling approach to forecasting for virtual machine capacity planning. The technique can also be applied broadly to other areas, such as monitoring industrial equipment or vehicle engines.

A Quick Look At Spring Cloud Data Flow

The pressure for real-time data in applications is picking up at the same rate that applications are gravitating toward modern Cloud Native architectures. Last month at Spring One 2GX, Pivotal announced the release of Spring Cloud Data Flow, which moves many of the capabilities of Spring XD to a Cloud Native architecture. In this episode, host Simon Elisha walks us through the changes and how it fits into Cloud Native application architectures.

Case Study: Using Data Science To Detect Defects In Semiconductors

In this post, Anirudh Kondaveeti, a Principal Data Scientist at Pivotal, provides an in-depth, real-world example of how data science applies to mechanical and materials engineering in the semiconductor manufacturing industry. Step-by-step, he covers de-noising, preprocessing, feature extraction, dimensionality reduction, outlier detection, and clustering to show how yield and profitability are improved.
