In August, as students prepared to return to school, big data’s growing impact on the college admissions process came to light. Data science’s role in transforming the economies of developing nations was considered, and Twitter’s Sid Patil revealed how his probabilistic approach to Blackjack influenced his data science practice.
Here’s our roundup of the biggest data science news of the month, both from Pivotal and beyond.
College admissions offices have long used statistical methods to decide student admission, drawing on grades, test scores, and demographics. With the influx of big data technologies, admissions officers are taking a more proactive and predictive approach to determining which students will succeed. They collect and use many more data sources than in the past, including social media activity, a student’s level of interest, and the probability of their enrollment.
After a number of scientific publishing scandals, and reports of scientists fudging data or results to thrive in a publish-or-perish academic environment, public trust in researchers has diminished in the past year. But as Christie Aschwanden details at FiveThirtyEight, these problems are the product of a scientific method that is far more complex and difficult than it is often given credit for. Through an interactive quiz and detailed journalism, Aschwanden demonstrates that performing rigorous scientific research is an arduous process, but one that remains “the best tool we have.”
With mobile phones becoming ubiquitous and increasingly necessary, their reach has extended into the most far-flung locales within developing nations. As Neil Lawrence of the University of Sheffield explains at the Guardian, this explosion of global connectivity presents many opportunities to improve the lives of impoverished people in Africa. Countries with outdated or subpar infrastructure could see new economic opportunities through data science-driven applications that connect to citizens’ mobile phones.
In an essay for Forbes, Teradata’s Sri Raghavan warns that data scientists’ efforts may go to waste if their insights aren’t operationalized. Without that important step, the results of data science analysis can be as conceptual and ultimately useless as fancy concept cars that will never see the road. Raghavan draws a distinction between data science and operationalized data science, wherein insights are connected to “business processes and infrastructure that can deliver timely signals and help initiate effective corrective actions.”
Twitter’s head of data science, Sid Patil, gives VentureBeat a look into how he applies insights from his time at the Blackjack table to his data science practice. Patil was a member of the legendary group of MIT students featured in the book Bringing Down the House and the movie 21, who brought probabilistic techniques into the game and saw staggering results. In the interview, Patil describes the many similarities he sees between how Blackjack has evolved and the growth of mobile devices.
Zillow has become a must-watch site for homeowners and house hunters alike, though how the service arrives at its “Zestimate” of a house’s value may be confusing for those with only a cursory understanding of the process. Datanami profiles Zillow and, in particular, how the service determines a house’s Zestimate, a figure that is updated three times per week. The Zestimates, which are accurate within 10 percent of prices according to the company, are the result of sophisticated machine learning techniques that take into account as many as 103 attributes per property.
This Month in Pivotal Data Science
Data science is actively being used to save lives and improve healthcare. Working at the forefront of this bioinformatics revolution, Pivotal’s Sarah Aerni explains what these bioinformatics systems and medical sensors look like, how MPP databases, Apache Hadoop, and Apache Spark fit in, how real-time data is used, and where the future possibilities for improving lives lie.
Pivotal GemFire, a high-performance, in-memory database and event-oriented persistence layer that supports strong data consistency, is now available on demand for Pivotal Cloud Foundry. GemFire for Pivotal Cloud Foundry delivers one of the market’s most powerful in-memory technologies on Pivotal’s open, cloud-native application platform.
Since 2012, Pivotal has teamed up with Girls Who Code to encourage young women to pursue computer-related degrees and careers. In today’s world, too many young women forgo a career in computer science even though they show high interest in science, technology, engineering, and math subjects.
This month, the big news for app developers and architects spans open source, digital transformation, big data, in-memory data platforms, machine learning, data science, programming language popularity, developer salaries, the world of Cloud Native, and how hackers killed a car on the highway. There is so much good stuff in here, starting with a roundup of OSCON 2015, the predominant open source conference.
Upcoming Pivotal Events
- VMworld 2015 US: Pivotal Booth/Presentations: Aug 30 – Sep 3, 2015
- Very Large Data Bases: Aug 31 – Sep 4, 2015
- Pivotal Open Source Hub: PA: Enabling R for big data using open source tools PL/R and PivotalR – Thursday September 3, 2015 7PM to 8PM
- SpringOne 2GX: Sep 14 – 17, 2015
- Pivotal Big Data Roadshow: Minneapolis: Sep 22, 2015
- Pivotal Big Data Roadshow: Chicago: Sep 24, 2015
- Apache: Big Data Europe: Sep 28 – 30, 2015
- Strata + Hadoop World NY: Sep 29 – Oct 1, 2015
- Pivotal Big Data Roadshow: Phoenix: Oct 6, 2015
- Pivotal Big Data Roadshow: Manila: Oct 14, 2015
- Pivotal Big Data Roadshow: Toronto: Oct 21, 2015
About the Author
Paul M. Davis