The Core Idea
In Databricks, compute is fully separated from storage. That means you always choose your compute — and the exam will test whether you choose correctly for a given scenario.
The decision comes down to four things: who is using it, what the workload is, what performance is needed, and what the governance constraints are.
The Three Compute Types
Clusters
Spark compute: a driver plus workers. For engineering, notebooks, and ML. Keyword: "Spark processing".
SQL Warehouses
Optimised for SQL queries and BI tools. For analysts and dashboards. Keyword: "BI / SQL analytics".
Serverless
Fully managed infrastructure. For quick queries and ad-hoc work without setup. Keyword: "no infra management".
1. Clusters (All-Purpose & Job)
Clusters are Spark compute environments — a driver node plus workers. They're the backbone of data engineering and ML work.
All-purpose clusters are interactive, designed for notebooks and development. Job clusters spin up for a single job, then terminate automatically — cheaper and the right choice for production pipelines.
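To make the distinction concrete, here is a minimal sketch of a Jobs API (2.1) payload that runs a notebook on a job cluster. The job name, hostname, token, notebook path, node type, and runtime version are all placeholders, not values from any real workspace:

```python
# Minimal sketch: a scheduled job on an ephemeral job cluster via the
# Jobs API (2.1). All identifiers below are placeholders.
import requests

job_spec = {
    "name": "nightly_etl",  # hypothetical job name
    "tasks": [
        {
            "task_key": "transform",
            "notebook_task": {"notebook_path": "/Repos/etl/transform"},
            # "new_cluster" makes this a job cluster: created for this
            # run, terminated automatically when the run finishes.
            "new_cluster": {
                "spark_version": "14.3.x-scala2.12",
                "node_type_id": "Standard_DS3_v2",
                "num_workers": 2,
            },
        }
    ],
}

resp = requests.post(
    "https://<workspace-host>/api/2.1/jobs/create",
    headers={"Authorization": "Bearer <personal-access-token>"},
    json=job_spec,
)
resp.raise_for_status()
print(resp.json()["job_id"])
```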
2. SQL Warehouses (Databricks SQL)
SQL Warehouses are optimised purely for SQL queries and BI tools. They come in three flavours: Classic, Pro, and Serverless. The underlying compute is still Spark, but the interface and the optimisation are all about query performance.
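For contrast, this is what analyst-style access to a SQL Warehouse looks like through the databricks-sql-connector package. The connection details and the gold.sales table are placeholders:

```python
# Minimal sketch: an analyst-style query against a SQL Warehouse using
# databricks-sql-connector (pip install databricks-sql-connector).
from databricks import sql

with sql.connect(
    server_hostname="<workspace-host>",
    http_path="/sql/1.0/warehouses/<warehouse-id>",  # points at a SQL Warehouse
    access_token="<personal-access-token>",
) as conn:
    with conn.cursor() as cursor:
        # gold.sales is a hypothetical table for illustration only
        cursor.execute("SELECT region, SUM(revenue) FROM gold.sales GROUP BY region")
        for row in cursor.fetchall():
            print(row)
```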
3. Serverless Compute
Serverless means Databricks manages the infrastructure entirely — no cluster config, no warm-up decisions. It appears as both a Serverless SQL Warehouse and (in newer platform versions) as Serverless Jobs.
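For jobs, the serverless variant is mostly about what you leave out. Assuming serverless jobs are enabled in the workspace, the job payload simply omits the cluster specification and Databricks supplies the compute:

```python
# Sketch of the same Jobs API payload on serverless compute: no cluster
# block at all. Assumes serverless jobs are enabled in the workspace;
# the notebook path is a placeholder.
serverless_job_spec = {
    "name": "adhoc_report",
    "tasks": [
        {
            "task_key": "report",
            "notebook_task": {"notebook_path": "/Repos/reporting/daily"},
            # No "new_cluster" / "job_cluster_key": Databricks provisions
            # and manages the compute itself.
        }
    ],
}
```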
A common real-world pattern: serverless fails in an enterprise environment because of private networking or Unity Catalog storage restrictions, while switching to a standard cluster with Unity Catalog resolves the issue immediately. If an exam scenario rules out serverless for "security or networking reasons", the answer is Cluster.
Decision Framework
When you see a scenario on the exam, run it through this table:
| Scenario | Best Compute |
|---|---|
| PySpark ETL pipeline | Cluster |
| Scheduled pipeline (Jobs / ADF) | Job Cluster |
| Power BI or Tableau dashboard | SQL Warehouse |
| Business analyst running SQL queries | SQL Warehouse |
| Quick ad-hoc queries, no infra management | Serverless SQL Warehouse |
| Secure enterprise pipeline with Unity Catalog | Cluster |
| ML model training | Cluster |
| Gold layer reporting for CFO dashboard | SQL Warehouse |
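One way to internalise the table is to read it as a tiny decision function. The sketch below is illustrative only; the workload labels are invented for this post, not Databricks terminology:

```python
# Illustrative only: the decision table above encoded as a helper.
def pick_compute(workload: str, scheduled: bool = False) -> str:
    if workload in ("etl", "ml"):   # Spark transformations / model training
        return "Job Cluster" if scheduled else "All-Purpose Cluster"
    if workload in ("bi", "sql"):   # dashboards, analyst queries
        return "SQL Warehouse"
    if workload == "adhoc":         # quick queries, no infra management
        return "Serverless SQL Warehouse"
    raise ValueError(f"unknown workload: {workload}")

assert pick_compute("etl", scheduled=True) == "Job Cluster"
assert pick_compute("bi") == "SQL Warehouse"
```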
Common Exam Traps
These are the mistakes the exam is designed to catch:
- Using a SQL Warehouse for ETL — it is optimised for queries, not transformations or Spark jobs.
- Using a Cluster for BI dashboards — adds unnecessary overhead and misses the purpose of SQL Warehouses.
- Assuming serverless works in all enterprise setups — private networking and storage restrictions can block it.
- Forgetting that serverless can cost more for sustained heavy workloads — it is not always the cheapest option.
- Confusing All-purpose and Job clusters — if the scenario says "scheduled" or "automated", prefer a Job Cluster.