Bringing Data Science and Advanced Analytics to Marketers in Retail and CPG
emnos GmbH is a global retail and consumer packaged goods expert whose on-demand platforms help analyze massive volumes of transactional, customer and other data to gain insights into customer buying behavior. emnos’ clients, which include Walgreens, Procter & Gamble and Coca-Cola, use these insights to make better, more informed marketing and communications decisions. emnos is based in Munich, Germany, with additional offices in London, Paris, Madrid and Chicago, and is a subsidiary of the American Express Group.
Understanding customer buying behavior is the key to success for retailers and consumer packaged goods (CPG) companies. In addition to the basics—who bought what, when—this includes understanding the mix of products customers typically buy, how the addition of new products influences sales of other products, and how factors like time of year and weather impact product sales. With these types of insights in hand, retailers and CPG companies are better prepared to make effective pricing, promotion and marketing decisions.
Not all retailers and CPG companies, however, are equipped with the necessary skills and technology to undertake the data science and analytics required to glean these and other insights from massive volumes of transactional, product, and customer data. Even for those that do have the wherewithal, it often makes more sense to work with a partner who specializes in retail analytics so as not to be overwhelmed by the related operational responsibilities of data science at scale. That’s where emnos comes in.
The Munich-headquartered company has been helping both retailers and the CPG companies whose products they sell understand customer buying behavior since its founding in 2003. The company’s SaaS-based, analytics-driven platform, called emnos Insight Portal, covers the complete customer behavior analytics lifecycle, from customer segmentation and market basket analysis to promotions analysis. The output of the platform includes self-service reports and dashboards that communicate insights and allow users (category and brand managers at retailers and CPG companies) to make better-informed day-to-day and strategic decisions.
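To make the term concrete, market basket analysis in its simplest form asks which products appear together in a basket more often than chance would predict. The following is a minimal sketch of one common measure, lift, and is not a description of how the emnos Insight Portal computes it. The function name `pair_lift` and the sample baskets are hypothetical.

```python
from collections import Counter
from itertools import combinations


def pair_lift(baskets):
    """Return {(a, b): lift} for every item pair that co-occurs at least once.

    lift = P(a and b) / (P(a) * P(b)); values above 1 suggest the two items
    are bought together more often than independent purchases would imply.
    """
    n = len(baskets)
    item_counts = Counter()
    pair_counts = Counter()
    for basket in baskets:
        items = sorted(set(basket))  # de-duplicate and fix the pair ordering
        item_counts.update(items)
        pair_counts.update(combinations(items, 2))

    lifts = {}
    for (a, b), together in pair_counts.items():
        p_ab = together / n
        p_a = item_counts[a] / n
        p_b = item_counts[b] / n
        lifts[(a, b)] = p_ab / (p_a * p_b)
    return lifts


# Hypothetical baskets, for illustration only.
baskets = [
    ["bread", "butter", "milk"],
    ["bread", "butter"],
    ["milk", "cola"],
    ["bread", "milk"],
]
lifts = pair_lift(baskets)
```

In this toy data, bread and butter have a lift above 1 while bread and milk fall below 1. A production system would add minimum-support thresholds so that rare pairs do not produce misleadingly high lift.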
“The analytics knowledge is our IP. On top of that we have the knowledge such as application design and development, algorithm development, testing, data management, and operations to offer an end-to-end solution to our customers,” said Mark Passauer, operations manager at emnos. emnos also takes advantage of economies of scale when it comes to infrastructure, he added. With emnos handling the analytics heavy lifting, including managing the underlying infrastructure and developing data science models, retailers are able to spend more time consuming insights and running their businesses.
“It is much more efficient for our customers to buy this as a service than to build this up in-house and have to manage it on their own,” Passauer said.
The emnos Insight Portal requires the support of a cost-effective, high-performance, scalable and secure analytical database, Passauer said.
From an analytical perspective, emnos requires a database that enables its data scientists to create the classification, regression and predictive models and algorithms that are at the core of emnos’ value proposition. While some types of analysis are fairly straightforward, others are quite complex. For example, the emnos platform helps retailers forecast the sales impact of new products on existing products. In some cases, the "new" product could just be a different form factor of an existing product. Still, it can have a big impact, Passauer said.
Scale is another important criterion for emnos, from both an analytics and a cost perspective. Its customers generate more data every day (point-of-sale transaction data, loyalty card data, click-through data and so on), and storing all of it can get expensive with traditional relational database products. emnos therefore needs a scalable, efficient analytical database with a low total cost of ownership, according to Peter Breitenberger, who leads product development and operations at emnos.
Finally, performance is paramount. “The short-term availability of reports is becoming more critical for our clients,” said Breitenberger. Retailers and CPG companies use the insights provided by the emnos platform to make day-to-day decisions, such as when to double down on a promotion that is driving more revenue than expected or when to cut short promotions that aren’t performing. Reports must be generated in seconds or minutes, not hours, let alone days or weeks. “We were looking for a system that was able to fulfil our performance requirements,” Breitenberger said.
emnos has been using Pivotal Greenplum, a massively parallel processing (MPP), shared-nothing data warehouse, since 2008, when Greenplum, the company, was still an independent start-up. Pivotal Greenplum is based on the open source Greenplum Database. It is a software-based data warehouse, meaning it can be deployed on bare metal commodity hardware, in virtualized or cloud environments, or via a pre-configured appliance. emnos originally deployed Greenplum on commodity hardware, but migrated to a newly released appliance offered by EMC, the Data Computing Appliance (DCA), in 2011. The DCA is a purpose-built appliance for running Greenplum software.
Greenplum supports advanced analytics at scale in addition to traditional reporting. Specifically, Greenplum supports PL/R, PL/Python and the open source machine learning library Apache MADlib (incubating), all of which help emnos data scientists develop predictive models. This capability is important to emnos because its core value proposition is uncovering actionable insights hidden in troves of customer data. The ability to use PL/R, PL/Python and Apache MADlib (incubating) with Greenplum was one of the key decision factors.
Greenplum also meets emnos’ scalability requirements. From a cost perspective, Greenplum scales linearly thanks to its shared nothing architecture and can be deployed on clusters of inexpensive commodity hardware. Analytics can run in-database, meaning algorithms and predictive models are fully parallelized and run across all data stored in a Greenplum cluster. This reduces the chances of missing important insights due to analyzing just samples of data.
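The idea behind parallel, in-database aggregation on a shared-nothing cluster can be sketched in a few lines. This is a conceptual illustration only and not Greenplum's implementation. The distribution key (`store_id`), the CRC32 hash, the function names and the sample rows are all assumptions made for the example. Each segment holds its own slice of the rows and computes partial results independently, and a final step merges the partials.

```python
from collections import defaultdict
from zlib import crc32


def distribute(rows, num_segments):
    """Assign each row to exactly one segment by hashing its distribution key."""
    segments = [[] for _ in range(num_segments)]
    for row in rows:
        key = str(row["store_id"]).encode()
        segments[crc32(key) % num_segments].append(row)
    return segments


def local_aggregate(segment_rows):
    """Runs independently per segment: partial [sum, count] per product."""
    partial = defaultdict(lambda: [0.0, 0])
    for row in segment_rows:
        acc = partial[row["product"]]
        acc[0] += row["amount"]
        acc[1] += 1
    return partial


def combine(partials):
    """Merge the per-segment partials into a final average per product."""
    total = defaultdict(lambda: [0.0, 0])
    for partial in partials:
        for product, (amount_sum, count) in partial.items():
            total[product][0] += amount_sum
            total[product][1] += count
    return {product: s / c for product, (s, c) in total.items()}


# Hypothetical point-of-sale rows, for illustration only.
rows = [
    {"store_id": 1, "product": "cola", "amount": 2.0},
    {"store_id": 2, "product": "cola", "amount": 4.0},
    {"store_id": 3, "product": "chips", "amount": 1.5},
    {"store_id": 4, "product": "cola", "amount": 3.0},
    {"store_id": 5, "product": "chips", "amount": 2.5},
]
segments = distribute(rows, num_segments=3)
averages = combine(local_aggregate(seg) for seg in segments)
```

Because every row lands on exactly one segment and the partial sums and counts are merged at the end, the result covers the full data set rather than a sample, and adding segments spreads the same work across more independent workers.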
“We do OLAP (online analytical processing) and Greenplum is simply more cost-effective as an MPP database for such an analytics-intensive use case,” Breitenberger said. “Greenplum’s shared-nothing feature matches our requirements perfectly.”
Performance was also a deciding factor. Greenplum includes the Pivotal Query Optimizer (PQO), the industry’s first cost-based query optimizer for Big Data workloads. PQO leverages a multi-core scheduler that distributes individual optimization tasks across multiple cores. This allows the PQO to apply all possible optimizations at the same time, resulting in the fastest analytical processing speeds possible for any given query or workload. This is a prohibitively expensive task for traditional data warehouses.
emnos relies on Pivotal Greenplum on a daily basis to generate the reports for clients that are at the heart of its business. These are mission critical workloads, Breitenberger said, and Greenplum has lived up to expectations.
“Our clients have a huge impact on our IT landscape. One typical large client easily needs one or two Greenplum clusters,” Breitenberger said. Greenplum has enabled emnos to scale its platform over time as it adds new clients, and as clients generate ever more data, without impacting performance. “Pivotal Greenplum has been able to handle everything we have thrown at it,” Breitenberger said.
emnos data scientists are also beneficiaries of Greenplum. They take advantage of Greenplum’s native advanced analytical capabilities to continually look for new insights that help emnos customers get a leg up on the competition. With both advanced analytics and reporting running on Greenplum, emnos data scientists can easily move new insights into production.
Greenplum is also helping emnos adapt to the new realities of Big Data. Unstructured data is playing an increasingly important role in the types of customer behavior analytics emnos specializes in. In order to process and analyze more unstructured data, emnos deployed Pivotal HDB in 2015. HDB is a Hadoop-native SQL database, powered by Apache HAWQ (incubating). It shares a common PostgreSQL heritage with Greenplum, meaning emnos data scientists were able to easily port Greenplum code to HDB and immediately get value from Hadoop.
“Our code is generally very strongly geared towards Greenplum, but it works on HAWQ as well,” Breitenberger said. “The combination of Greenplum with Hadoop-based HAWQ databases opens the door to even greater flexibility and new solutions for us. Strategically, this parallel approach is a very positive step for us, as it’s both exciting and forward-looking.”
Retailers know data and data science are keys to maintaining a competitive edge in today’s digital economy. With the help of Pivotal Greenplum, and more recently Pivotal HDB, emnos has been helping retailers and CPG companies harness their data to make better-informed, impactful marketing decisions for almost a decade now. And the company has no plans to slow down anytime soon.
“I can’t wait to see which new opportunities this data-intensive model will open up for us and how our clients’ constantly evolving needs will change our IT environment,” Breitenberger said. “Our clients’ success will drive our business success, and that includes our database solution.”