Data Warehousing with Greenplum, Second Edition
Data professionals confront a dizzying array of options, as new data sources and types drive an explosion in data processing and management choices. But one thing remains the same: enterprises still prefer SQL to query and manipulate data. How can SQL-based systems evolve to meet the scale and diversity of modern data? This updated edition teaches you best practices for Greenplum Database, the open source massively parallel processing (MPP) database for analyzing integrated relational and non-relational data at enterprise scale.
Marshall Presser, most recently Field CTO at Pivotal, introduces Greenplum’s approach to data analytics and data-driven decisions, beginning with its shared-nothing architecture. IT managers, developers, data analysts, system architects, and data scientists will all gain from exploring data organization and storage, data loading, running queries, and learning to perform analytics in the database. Discover how Greenplum will help you go beyond the traditional data warehouse.
This book covers:
- Greenplum features, use case examples, and techniques for optimizing use
- Four Greenplum deployment options to help you balance security, cost, and time to usability
- Additional tools for monitoring, managing, securing, and optimizing query responses in the Pivotal Greenplum commercial database
- In-database analytics using SQL and parallel R functions
- Federated queries using external data