The Core Idea

Databricks Notebooks are the primary interactive environment for data engineers, scientists, and analysts working on the platform. They're far more capable than a standard Jupyter notebook — with built-in multi-language support, real-time collaboration, versioning, and a rich set of commands that control both the notebook and the cluster.

The exam tests whether you know what notebooks can do natively, which commands trigger which behaviours, and where notebooks fit (and don't fit) in a production workflow.

Exam Mindset
Think of notebooks as having four capability layers: magic commands (language switching and utilities), widgets (parameterisation), versioning (history and Git), and collaboration (real-time co-editing). Questions will test all four.

Magic Commands

Magic commands are prefixed with % and change how a cell is interpreted: language magics override the notebook's default language for that cell, while utility magics invoke built-in functionality. They are notebook-only; they cannot be used in plain Python scripts or via Databricks Connect.

Language Magic Commands

Every notebook has a default language set at creation time. Magic commands let you switch language per cell:

Command | What it does | Typical use
%python | Run a cell as Python, even if the notebook default is SQL or Scala | PySpark / pandas
%sql | Run a cell as SQL. Results display as an interactive table automatically | Delta / SQL queries
%scala | Run a cell as Scala. Useful when working with Scala-native libraries | Scala Spark
%r | Run a cell as R. Used for statistical analysis alongside Spark workloads | R / SparkR
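
A quick sketch of this in practice, assuming a notebook whose default language is SQL (the sample table is from the Databricks samples catalog and may not exist in every workspace):

Cell 1 (SQL, the notebook default):

SELECT COUNT(*) FROM samples.nyctaxi.trips

Cell 2 (overridden to Python for this cell only):

%python
# The %python magic applies only to this cell; the next cell
# falls back to the notebook's default language (SQL).
print(spark.version)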

Utility Magic Commands

Beyond language switching, these utility commands are just as important for the exam:

Command | What it does | Key exam point
%md | Renders the cell as Markdown — headings, bold, links, images | Documentation only — no code runs
%sh | Runs shell commands on the driver node only | Does NOT run on worker nodes
%fs | Shorthand for dbutils.fs — browse DBFS, copy, move, delete files | %fs ls is equivalent to dbutils.fs.ls()
%run | Executes another notebook inline — shares the same session/scope | Variables from the called notebook are available in the caller
%pip | Installs Python packages at the notebook session level | Restarts the Python interpreter after install
%lsmagic | Lists all available magic commands | Useful to know it exists
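
Two quick sketches of these in use; the DBFS path and package version are illustrative:

%fs ls /databricks-datasets

%python
# The same listing from Python; dbutils.fs.ls returns FileInfo objects
for f in dbutils.fs.ls("/databricks-datasets")[:5]:
    print(f.path, f.size)

%pip install requests==2.31.0
# The Python interpreter restarts after the install, so run %pip at the
# top of the notebook and define variables and imports afterwards.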
Exam trap — %sh scope
%sh runs on the driver node only. It does not distribute across workers. If an exam question mentions running shell commands across all nodes, %sh is not the answer.

Exam trap — %run scope sharing
When you use %run to call another notebook, all variables, functions, and imports defined in that notebook become available in the current notebook's scope. This is a common exam question.
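
A minimal sketch of that behaviour; both notebook names and definitions are hypothetical:

Notebook saved as ./setup_config:

# Shared configuration for any notebook that calls this one via %run
catalog_name = "dev_catalog"

def full_table(name):
    return f"{catalog_name}.sales.{name}"

Calling notebook:

%run ./setup_config

# catalog_name and full_table are now defined in this notebook's scope
print(full_table("orders"))   # dev_catalog.sales.orders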

Widgets

Widgets let you add interactive input controls (dropdowns, text boxes, sliders) to a notebook, making it parameterisable without changing the code. They're the primary way to pass parameters into a notebook when it's called from a Job or via %run.

Python — creating widgets

# Text input widget
dbutils.widgets.text("env", "dev", "Environment")

# Dropdown widget
dbutils.widgets.dropdown("region", "UK", ["UK", "US", "EU"], "Region")

# Read a widget value
env = dbutils.widgets.get("env")

# Remove all widgets
dbutils.widgets.removeAll()
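
Note that dbutils.widgets.get() always returns a string regardless of widget type (multiselect returns the selections as a comma-separated string), so numeric parameters need an explicit cast, e.g. int(dbutils.widgets.get("max_retries")) for a hypothetical max_retries widget.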

Widget Types

Widget | Use case
text | Free-text input — environment name, file path, date string
dropdown | Select from a fixed list of options
combobox | Dropdown with a free-text fallback option
multiselect | Select multiple values from a list
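
For completeness, a sketch creating the two remaining types; the names and choices are illustrative:

# Dropdown with a free-text fallback
dbutils.widgets.combobox("source", "kafka", ["kafka", "kinesis", "files"], "Source")

# Multiple selection; get() returns the choices as a comma-separated string
dbutils.widgets.multiselect("regions", "UK", ["UK", "US", "EU"], "Regions")
selected = dbutils.widgets.get("regions").split(",")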
Key exam point — widgets in Jobs
When a notebook is run as part of a Databricks Job, widget default values are overridden by the job parameters passed in at runtime. This is how notebooks are parameterised in production pipelines.
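
A sketch of what that looks like at runtime, with illustrative values:

dbutils.widgets.text("env", "dev", "Environment")
env = dbutils.widgets.get("env")

# Run interactively:             env == "dev"   (the widget default)
# Run as a Job with env=prod:    env == "prod"  (the job parameter wins)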

Revision History and Version Control

Databricks Notebooks have two separate versioning systems — built-in revision history and Git integration. The exam may test the difference.

Built-in Revision History

Every save creates a revision snapshot automatically. You can browse, restore, or compare any previous version directly from the notebook UI — no Git required.

Key capabilities
View diffs between versions, restore to any previous state, and add comments to revisions. Revisions are stored per notebook and are separate from any Git history.

Git Integration

Notebooks can be linked to a Git provider (GitHub, GitLab, Azure DevOps, Bitbucket). Once linked, you can commit, push, pull, branch, and merge directly from the Databricks UI — or via the Repos feature.

Aspect | Revision History | Git Integration
Setup required | None — always on | Git provider + Repos
Scope | Single notebook | Full repository
Collaboration | View only | Branch, PR, merge
CI/CD | No | Yes
Best for | Quick rollback during development | Production deployment workflows

Multi-Language Notebooks

A single Databricks Notebook can contain cells written in Python, SQL, Scala, and R simultaneously. Each cell runs in the same Spark session, which means data created in one language is accessible in another via temporary views.

Python cell — create a temp view

df = spark.read.table("catalog.schema.sales")
df.createOrReplaceTempView("sales_view")

SQL cell (same notebook) — query the view

%sql
SELECT region, SUM(revenue) AS total
FROM sales_view
GROUP BY region
Exam key point — sharing data between languages
The correct way to pass data between language cells in a notebook is via a temporary view. Create it in Python with createOrReplaceTempView(), then query it with %sql. This is a direct exam question.
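
The sharing works in both directions; a sketch, with an illustrative view name:

%sql
-- Register a second view in SQL...
CREATE OR REPLACE TEMP VIEW uk_sales AS
SELECT * FROM sales_view WHERE region = 'UK'

%python
# ...and read it back as a DataFrame in Python, from the same Spark session
uk_df = spark.table("uk_sales")
print(uk_df.count())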

Real-Time Collaboration

Multiple users can edit the same notebook simultaneously — similar to Google Docs. Co-presence cursors show who is editing which cell. Comments can be added to individual cells, and notifications can be sent to collaborators.

🏗️ Real-world note

In practice, notebooks are great for exploration and prototyping, but production pipelines should move to modular Python files (via Databricks Connect or Repos) for proper version control, testing, and CI/CD. The exam recognises both use cases — notebooks for interactive work, Jobs + Git for production.

Exam-Style Practice Questions

Answers follow each question.

Q1 A data engineer runs %sh ls /tmp in a Databricks notebook. Where does this command execute?
A: On the driver node only; %sh never runs on worker nodes.

Q2 A Python notebook cell creates a DataFrame and calls createOrReplaceTempView("my_view"). How can the next cell query this data using SQL?
A: With a %sql cell, e.g. SELECT * FROM my_view; the temp view lives in the shared Spark session.

Q3 Notebook A uses %run ./NotebookB. A variable result is defined in Notebook B. What happens?
A: result becomes available in Notebook A's scope, because %run shares the session and all definitions.

Q4 A notebook is triggered as a Databricks Job with a parameter env=prod. The notebook has a widget with default value dev. What value does dbutils.widgets.get("env") return?
A: "prod". Job parameters override widget defaults at runtime.

Q5 What is the key difference between Databricks Notebook revision history and Git integration?
A: Revision history is automatic and scoped to a single notebook; Git integration covers a full repository and supports branching, PRs, and CI/CD.

Q6 Which magic command is the shorthand equivalent of dbutils.fs.ls()?
A: %fs (as in %fs ls).

Common Exam Traps

These are the mistakes the exam is designed to catch:

- Assuming %sh runs on worker nodes; it executes on the driver only.
- Forgetting that %run shares scope: variables, functions, and imports from the called notebook land in the caller.
- Expecting widget defaults to win in a Job; job parameters override them at runtime.
- Confusing built-in revision history (automatic, per notebook) with Git integration (repository-wide, CI/CD-capable).
- Passing data between language cells directly instead of through a temporary view in the shared Spark session.

⚡ Quick Memory Trick
%sh = Shell on Driver. %fs = File System. %run = Run and Share Scope. %md = Markdown Docs. Four commands, four distinct jobs — know them all.