Cluster Sizing & Tuning
TL;DR: get a 2–5× cost reduction with 10 minutes of math
Executors • Cores • Memory • Autoscaling
The Golden Sizing Rule (2025)
5 cores + 20–30 GB RAM per executor
→ works 95% of the time
Why? With 128–256 MB HDFS blocks, ~5 concurrent tasks per executor is the sweet spot for read throughput, and the matching 20–30 GB heap stays small enough for healthy GC with no OOM
Executor Sizing Cheat Sheet
| Node Type | vCPU | RAM | Executors/Node | Cores/Executor | Memory/Executor |
|---|---|---|---|---|---|
| r6i.4xlarge / r7g.4xlarge | 16 | 128 GB | 3 | 5 | 38 GB |
| r6i.8xlarge | 32 | 256 GB | 6 | 5 | 40 GB |
| i4i.16xlarge (storage-optimized) | 64 | 512 GB | 12 | 5 | 40 GB |
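To see where those rows come from, here is a minimal Python sketch of the sizing math. The function name, the 1-core and 8 GB per-node OS reserve, and the 0.75 heap factor are illustrative assumptions, not Spark settings, so the results land within a couple of GB of the table rather than matching them exactly.

```python
# Illustrative sizing sketch -- not a Spark API. Assumes 1 core and ~8 GB
# per node reserved for the OS and YARN/K8s daemons.
def size_executors(node_vcpu, node_ram_gb, cores_per_executor=5):
    usable_cores = node_vcpu - 1                     # leave one core for OS/daemons
    executors = usable_cores // cores_per_executor   # 5-core executors (the golden rule)
    container_gb = (node_ram_gb - 8) // executors    # RAM slice per executor container
    heap_gb = int(container_gb * 0.75)               # heap = ~75% of the container (see below)
    return executors, container_gb, heap_gb

# r6i.4xlarge (16 vCPU, 128 GB) -> (3, 40, 30): 3 executors, ~40 GB container, ~30 GB heap
print(size_executors(16, 128))
# i4i.16xlarge (64 vCPU, 512 GB) -> (12, 42, 31)
print(size_executors(64, 512))
```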
Memory Breakdown per Executor
Execution Memory – shuffles, joins, aggregations (shares the unified region with storage; spills to disk when it runs out)
Storage Memory – caching/persist (shares the unified region with execution)
User Memory – UDFs and your own objects (the remainder, roughly 30–40% of the heap)
The unified region is spark.memory.fraction of the heap: 0.6 by default, ~0.7 if tuned up.
Total Heap = 75% of container RAM
--executor-memory = round_down(0.75 × container_RAM)
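A rough Python sketch of that split for a single executor container; the 0.75 heap factor comes from the rule above, 0.6 is Spark's default spark.memory.fraction, and the helper itself is hypothetical (it ignores the small reserved region Spark sets aside).

```python
# Rough per-executor memory split following the rule of thumb above.
def memory_breakdown(container_ram_gb, memory_fraction=0.6):
    heap = int(container_ram_gb * 0.75)      # --executor-memory
    overhead = container_ram_gb - heap       # spark.executor.memoryOverhead (off-heap, native)
    unified = heap * memory_fraction         # execution + storage share this region
    user = heap - unified                    # UDFs, your own data structures
    return {"heap_gb": heap, "overhead_gb": overhead,
            "unified_gb": round(unified, 1), "user_gb": round(user, 1)}

# 40 GB container -> 30 GB heap, 10 GB overhead, ~18 GB unified, ~12 GB user
print(memory_breakdown(40))
```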
Battle-Tested Configs (2024–2025)
Databricks / EMR (Recommended)
```
spark.executor.cores              5
spark.executor.memory             19g   # 19g heap + 4g overhead ≈ 23 GB per executor container
spark.executor.memoryOverhead     4g
spark.driver.cores                5
spark.driver.memory               10g
spark.dynamicAllocation.enabled   true
spark.shuffle.service.enabled     true
```
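If you prefer setting these from application code rather than cluster config, a minimal PySpark equivalent looks like the sketch below. Keep in mind that executor sizing must be in place before the application starts, so on Databricks/EMR this belongs in cluster or job configuration rather than in an already-running session.

```python
from pyspark.sql import SparkSession

# Same values as the config block above, applied programmatically.
# These take effect only when the session is created (e.g. via spark-submit),
# not when called on an existing session.
spark = (
    SparkSession.builder
    .appName("sized-job")
    .config("spark.executor.cores", "5")
    .config("spark.executor.memory", "19g")
    .config("spark.executor.memoryOverhead", "4g")
    .config("spark.driver.cores", "5")
    .config("spark.driver.memory", "10g")
    .config("spark.dynamicAllocation.enabled", "true")
    .config("spark.shuffle.service.enabled", "true")
    .getOrCreate()
)
```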
Kubernetes / Self-managed
```
--executor-cores 5 \
--executor-memory 30g \
--conf spark.executor.memoryOverhead=6g \
--conf spark.kubernetes.memoryOverheadFactor=0.2 \
--conf spark.dynamicAllocation.enabled=true \
--conf spark.shuffle.service.enabled=true
```
Note: Kubernetes has no external shuffle service, so on K8s replace the last line with spark.dynamicAllocation.shuffleTracking.enabled=true (see below).
Autoscaling That Actually Works
Min executors: 2–5 (keep warm)
Max executors: data_size_in_TB × 15–25 (see the sketch after this list)
Scale-up threshold: 75% executor busy
Scale-down after 5–10 min idle
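A tiny, hypothetical helper that turns those rules of thumb into dynamic-allocation settings; the 15–25 executors-per-TB factor is the heuristic above, not anything Spark enforces, and only the spark.* keys are real configuration names.

```python
# Illustrative helper for picking dynamic-allocation bounds from input size.
def autoscaling_bounds(data_size_tb, executors_per_tb=20, min_executors=3):
    max_executors = max(min_executors, int(data_size_tb * executors_per_tb))
    return {
        "spark.dynamicAllocation.enabled": "true",
        "spark.dynamicAllocation.minExecutors": str(min_executors),
        "spark.dynamicAllocation.maxExecutors": str(max_executors),
        "spark.dynamicAllocation.executorIdleTimeout": "600s",  # scale down after ~10 min idle
    }

# 4 TB of input -> min 3, max 80 executors
print(autoscaling_bounds(4))
```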
Enable shuffle tracking so dynamic allocation works without the external shuffle service:
```
spark.dynamicAllocation.shuffleTracking.enabled=true
```
Avoid These (Cost You Millions)
- 1 core per executor → insane GC pressure
- >8 cores per executor → HDFS throughput drops
- No memoryOverhead → containers OOM-killed by YARN/K8s
- Static clusters for bursty workloads
- Forgetting spark.shuffle.service.enabled=true (or shuffle tracking) with dynamic allocation
5 cores • 20–40 GB RAM • dynamic allocation + shuffle service = happy cluster