New Book Review: "Designing Great Data Products"
New book review for Designing Great Data Products: Inside the Drivetrain Approach, a Four-step Process for Building Data Products, by Jeremy Howard, Margit Zwemer, and Mike Loukides, O'Reilly, 2012, reposted here:
This succinct offering from the recently published O'Reilly Strata series is a departure from the first two Big Data-focused entries in the series that I read, "Big Data Now: Current Perspectives from O'Reilly Radar" and "Planning for Big Data: A CIO's Handbook to the Changing Data Landscape", because this more general text looks at the design of data products. The authors' discussion revolves around what they call the "Drivetrain Approach", a process that transformed the insurance industry, and they walk readers through how this process can be applied effectively in other industries. "We are entering the era of data as drivetrain, where we use data not just to generate more data (in the form of predictions), but use data to produce actionable outcomes."
"For an insurance company, policy price is the product, so an optimal pricing model is to them what the assembly line is to automobile manufacturing. Insurers have centuries of experience in prediction, but as recently as 10 years ago, the insurance companies often failed to make optimal business decisions about what price to charge each new customer. Their actuaries could build models to predict a customer's likelihood of being in an accident and the expected value of claims. But those models did not solve the pricing problem, so the insurance companies would set a price based on a combination of guesswork and market studies." A company called Optimal Decisions Group (ODG) approached this problem with a practical take on step 4 of the Drivetrain Approach, one that can be applied to a wide range of problems.
The four steps of the Drivetrain Approach can be summarized as follows: (1) specify the goal, (2) specify the system inputs that can be controlled, (3) determine what new data is needed to reach the goal, and (4) create predictive models following the first three steps. After discussing applications to search engines and the insurance space, the authors turn to recommendation engines, followed by high-level introductions to related topics such as optimizing lifetime customer value and best practices from physical data products. While potential readers of this brief, white-paper-sized book should not expect to become experts at designing great data products as a result of reading it, many who are new to this space are likely to find value as they begin their journey toward using data more systematically and effectively.