New Book Review: "Designing Data-Intensive Applications"
New book review for Designing Data-Intensive Applications: The Big Ideas Behind Reliable, Scalable, and Maintainable Systems, by Martin Kleppmann, O'Reilly, 2017:


Kleppmann mentioned during his "Turning the Database Inside Out with Apache Samza" talk at Strange Loop 2014 (see my notes) that he was on sabbatical working on this book. While waiting quite some time for it to be published, I ended up experimenting with his Bottled Water project as well as Apache Kafka (which was only at release 0.8.2.2 at that point in time). Other reviewers are correct that much of the material in this book is available elsewhere, but the book is packaged well (although still heavyweight at 550 pages): most of the key topics associated with data-intensive applications are under one roof, with good explanations and numerous footnotes that point to resources providing additional detail.
Content is broken down into 3 sections and 12 chapters: (1) foundations of data systems, which covers reliable, scalable, and maintainable applications, data models and query languages, storage and retrieval, and encoding and evolution; (2) distributed data, which covers replication, partitioning, transactions, the trouble with distributed systems, and consistency and consensus; and (3) derived data, which covers batch processing, stream processing, and the future of data systems. The latter 6 chapters are weighted more heavily. Chapter 9 on consistency and consensus and chapter 12 on the future of data systems are the longest, each comprising about 12% of the book.