Hi Everyone,
Rachel & I are excited to announce the early release of the first four chapters of "High Performance Spark" just in time for Strata San Jose. Our first chunk includes:
- Introduction to High Performance Spark
- How Spark Works
- DataFrames, Datasets & Spark SQL
- Joins (SQL & Core)
You can buy it now from O'Reilly :)
Our planned future chapters are*:
- Effective Transformations
- Working with Key/Value data
- Data Structure Design
- Spark Tuning and Cluster Sizing
- Getting Outside of the JVM
- Debugging Techniques
- Spark Components
We'd love your feedback at high-performance-spark@googlegroups.com so we can make a super awesome finished book for you all. If you are going to Strata San Jose next week, I'll also be giving a talk on testing Spark & hosting office hours. I'd love to see some of you there.
Cheers,
Holden & Rachel
P.S.
If you're up in the Seattle area, Rachel & I are coming up for Data Day Seattle - hope to see some of you there too!
For our friends across the pond, I'll also be speaking at Strata London, & hopefully we will have an update with a few more chapters by then (but we might also need to take a quick break from writing to do our day jobs. Don't tell our editors :p).
*Subject to change depending on feedback and how much fun we have writing each chapter