Spark Performance Optimization Series: #1. Skew

Description

In a Spark cluster, data is typically read in as 128 MB partitions, which ensures an even distribution of data. However, as the data is transformed (e.g. aggregated), it is possible to have significantly…
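
To illustrate how grouping records by key can leave partitions uneven, here is a minimal pure-Python sketch. It is not Spark code: it assumes records are assigned to partitions by key modulo the partition count, which only stands in for Spark's hash partitioner, and the names `partition_sizes` and `skew_ratio` and the sample data are hypothetical.

```python
from collections import Counter


def partition_sizes(keys, num_partitions):
    """Count the records that land in each partition.

    Assumption: a record goes to partition (key % num_partitions).
    Spark's real partitioner uses a different hash function; this is
    only a simplified stand-in for illustration.
    """
    counts = Counter(key % num_partitions for key in keys)
    return [counts.get(p, 0) for p in range(num_partitions)]


def skew_ratio(sizes):
    """Largest partition size divided by the mean partition size."""
    mean = sum(sizes) / len(sizes)
    return max(sizes) / mean


# Evenly spread keys: 8,000 records, each with a distinct key.
even_keys = list(range(8000))

# Skewed keys: one hot key (0) carries 7,000 of the 8,000 records.
skewed_keys = [0] * 7000 + list(range(1, 1001))

even_sizes = partition_sizes(even_keys, 8)
skewed_sizes = partition_sizes(skewed_keys, 8)

print("even:  ", even_sizes, "ratio", skew_ratio(even_sizes))
print("skewed:", skewed_sizes, "ratio", skew_ratio(skewed_sizes))
```

A ratio near 1 means the work is spread evenly. With the hot key, one partition holds most of the records, so the task that processes it would run far longer than the others.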

1.5 Years of Spark Knowledge in 8 Tips, by Michael Berk

Spark Performance Tuning: Skewness Part 1, by Wasurat Soontronchai

How to Optimize Spark Applications for Performance using Sparklens

3. A Case Study Of Spark Performance Optimization On Large Dataframes, by Jiahui Wang

List: Apache Spark, Curated by Luan Moreno M. Maciel

Himansu Sekhar – Medium

High Performance Spark: Best Practices for Scaling and Optimizing Apache Spark 1, Karau, Holden, Warren, Rachel, eBook

List: DataEng, Curated by Bruno Servilha

Spark Performance Optimization Series: #1. Skew, by Himansu Sekhar, road to data engineering