🌟

Featured Beginner Path

A quick onboarding flow for new learners — start here to build your foundation.

🔥

Introduction to Apache Spark

History, ecosystem, core features, and advantages over Hadoop MapReduce

34 Questions Beginner

🧩

Spark Core & RDD Fundamentals

RDD basics, transformations, actions, partitions, persistence & resilience

55 Questions Beginner – Intermediate

🧮

DataFrames & Datasets

Schema, logical/physical plans, encoders, and the essentials of lazy evaluation

45 Questions Beginner – Intermediate

📊

Spark SQL and Catalyst Optimizer

Intro to Spark SQL, DataFrame queries, and Catalyst optimizer fundamentals

35 Questions Beginner

🧩

Core & RDD

Deep dive into RDD internals, partitioning, lineage, and advanced transformations.

🧩

Spark RDD — Advanced Concepts

Custom partitioning, accumulators, broadcast variables, and failure recovery mechanisms

45 Questions Intermediate – Advanced

🏗️

Spark Architecture

Execution internals: driver, executors, DAG, shuffle, and Tungsten engine.

🏗️

Spark Architecture Basics

Driver, Executors, Cluster Manager, Application, Job, Stage, Task & the DAG

40 Questions Beginner – Intermediate

🏗️

Advanced Spark Architecture

Executor memory model, shuffle internals, Tungsten, and task serialization

40 Questions Advanced

🚀

Deployment & Cluster

Deployment modes, cluster managers, spark-submit, and best practices.

🚀

Spark Application & Deployment

spark-submit, client vs cluster mode, job stages, YARN/K8s/Standalone

30 Questions Beginner – Intermediate

☸️

Spark on Kubernetes

Native K8s mode, dynamic allocation, shuffle service, pod templates

25 Questions Intermediate

🧮

DataFrames & Data APIs

Schema, encoders, DataFrame operations, and best practices for analytics.

🧮

DataFrames & Datasets

Schema, logical/physical plans, encoders, and lazy evaluation for performance

45 Questions Intermediate

🧮

Advanced DataFrame Transformations

Window functions, complex aggregations, and UDF vs. built-in function performance

35 Questions Advanced

📊

Spark SQL

SQL, Catalyst, query plans, predicate pushdown and join strategies.

📊

Spark SQL & Catalyst Optimizer

SQL functions, logical/physical plans, projection, and predicate pushdown

40 Questions Intermediate – Advanced

📊

SQL Query Tuning & Join Strategies

Broadcast joins, shuffle joins, partition pruning & Adaptive Query Execution (AQE)

35 Questions Advanced

🌊

Streaming

DStreams and Structured Streaming — real-time processing essentials.

🌊

Spark Streaming (DStreams)

Batch intervals, window operations, checkpointing, and receivers

40 Questions Beginner – Intermediate

🌊

Structured Streaming

Event-time, watermarks, triggers, stateful processing, and output modes

30 Questions Advanced

🕸️

Graph Processing (GraphX)

GraphX fundamentals for graph algorithms and message passing.

🕸️

GraphX Fundamentals

Pregel API, EdgeTriplet, aggregateMessages, and connected components

30 Questions Beginner – Intermediate

🤖

Machine Learning (MLlib)

ML pipelines, transformers, model selection, and scalable algorithms.

🤖

MLlib – ML Pipelines

Transformers, Estimators, Pipeline, CrossValidator, and Evaluators

30 Questions Beginner – Intermediate

💾

Delta Lake & Lakehouse

ACID tables, MERGE, time travel, schema evolution and best practices.

💾

Delta Lake & Lakehouse

ACID transactions, MERGE, time travel, schema evolution, and OPTIMIZE

40 Questions Intermediate – Advanced

⚙️

Optimization & Tuning

Partitions, caching, AQE, memory tuning, skew handling and shuffle strategies.

⚙️

Spark Configs & Performance

Partition tuning, broadcast, caching, Adaptive Query Execution (AQE), skew handling

35 Questions Intermediate

⚙️

Spark Shuffle & Tungsten Engine

Hash vs Sort shuffle, whole-stage code generation, off-heap memory management

25 Questions Intermediate

Apache Spark — Quiz Hub
