databricks-migration by Microsoft
Install: `npx skills add https://github.com/microsoft/skills-for-fabric --skill databricks-migration`

Update Check — ONCE PER SESSION (mandatory): the first time this skill is used in a session, run the check-updates skill before proceeding.
- GitHub Copilot CLI / VS Code: invoke the check-updates skill.
- Claude Code / Cowork / Cursor / Windsurf / Codex: compare the local vs. remote package.json version.
- Skip if the check was already performed earlier in this session.
CRITICAL NOTES
- To find workspace details (including its ID) from a workspace name: list all workspaces, then use JMESPath filtering (see the sketch after these notes)
- To find item details (including its ID) from workspace ID, item type, and item name: list all items of that type in that workspace, then use JMESPath filtering
- `dbutils.widgets` has no direct equivalent in Fabric — use notebook parameters (cell tag `parameters`) or `notebookutils.runtime.context` for context injection
- `dbutils.library` (runtime library install) has no equivalent — use Fabric Environments for reproducible library management
- Unity Catalog uses a 3-level namespace (`catalog.schema.table`); Fabric Lakehouse uses 2-level (`schema.table` within a named Lakehouse)
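A minimal sketch of the name-to-ID lookups, assuming a Fabric API bearer token is already in hand (see COMMON-CORE.md § Authentication); the `requests`/`jmespath` packages, the workspace name, and the Lakehouse name are illustrative placeholders:

```python
import requests
import jmespath

FABRIC_API = "https://api.fabric.microsoft.com/v1"
token = "<fabric-api-bearer-token>"  # acquire per COMMON-CORE.md § Authentication
headers = {"Authorization": f"Bearer {token}"}

# Workspace ID from a workspace name: list all workspaces, then filter with JMESPath
workspaces = requests.get(f"{FABRIC_API}/workspaces", headers=headers).json()
ws_id = jmespath.search("value[?displayName=='Sales Analytics'] | [0].id", workspaces)

# Item ID from workspace ID + item type + item name: list items of that type, then filter
items = requests.get(
    f"{FABRIC_API}/workspaces/{ws_id}/items",
    headers=headers,
    params={"type": "Lakehouse"},
).json()
item_id = jmespath.search("value[?displayName=='bronze_lh'] | [0].id", items)
```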
Databricks → Microsoft Fabric Migration
Prerequisite Knowledge
Read these companion documents before executing migration tasks:
- COMMON-CORE.md — Fabric REST API patterns, authentication, token audiences, item discovery
- COMMON-CLI.md — `az rest`, `az login`, token acquisition, Fabric REST via CLI
- SPARK-AUTHORING-CORE.md — Notebook deployment, lakehouse creation, Spark job execution
For notebook and Lakehouse creation, see spark-authoring-cli. For Fabric Warehouse DDL/DML authoring, see sqldw-authoring-cli.
Table of Contents
| Topic | Reference |
|---|---|
| Migration Workload Map | § Migration Workload Map |
| Complete dbutils → notebookutils Mapping | dbutils-to-notebookutils.md |
| Unity Catalog → Fabric Lakehouse Schemas | catalog-migration.md |
| Before/After Code Patterns | code-patterns.md |
| Cluster Config → Fabric Spark Pools | § Cluster Config → Fabric Spark Pools |
| Databricks Jobs → Spark Job Definitions | § Databricks Jobs → Spark Job Definitions |
| Delta Sharing → OneLake Shortcuts | § Delta Sharing → OneLake Shortcuts |
| MLflow → Fabric ML Experiments | § MLflow → Fabric ML Experiments |
| Must / Prefer / Avoid | § Must / Prefer / Avoid |
| Authentication & Token Acquisition | COMMON-CORE.md § Authentication |
| Lakehouse Management | SPARK-AUTHORING-CORE.md § Lakehouse Management |
| Notebook Management | SPARK-AUTHORING-CORE.md § Notebook Management |
Migration Workload Map
| Databricks Component | Fabric Target | Notes |
|---|---|---|
| All-purpose cluster (notebooks, REPL) | Fabric Notebook (Starter Pool or Custom Pool) | No persistent cluster — Fabric provisions compute on session start |
| Job cluster (automated jobs) | Spark Job Definition (SJD) | SJD maps one-to-one with Databricks Jobs on job clusters |
| Unity Catalog | Fabric Lakehouse (schema per namespace) | See catalog-migration.md |
| Databricks Repos (Git-backed notebooks) | Fabric Git Integration | Connect workspace to Azure DevOps or GitHub; notebooks are synced |
| Delta Live Tables (DLT) | Fabric Notebooks + Data Pipelines | No DLT equivalent — rewrite DLT datasets as parameterized notebook cells with pipeline orchestration |
| Databricks SQL Warehouses | Fabric Warehouse or Lakehouse SQL Endpoint | SQL warehouse sessions → Warehouse (for write) or SQL Endpoint (for read-only) |
| MLflow Tracking | Fabric ML Experiments | MLflow SDK is supported in Fabric — see § MLflow |
| Delta Sharing | OneLake Shortcuts + Fabric external data sharing | See § Delta Sharing → OneLake Shortcuts |
| Databricks Feature Store | Fabric Feature Store (preview) | Direct conceptual equivalent; APIs differ |
| dbutils (all sub-modules) | notebookutils (most sub-modules) | See dbutils-to-notebookutils.md for full mapping |
dbutils → notebookutils Quick Reference
The complete side-by-side API table is in dbutils-to-notebookutils.md. The key mappings are:
| dbutils Call | notebookutils Equivalent | Compatibility Note |
|---|---|---|
| dbutils.fs.ls(path) | notebookutils.fs.ls(path) | Direct replacement |
| dbutils.fs.cp(src, dest) | notebookutils.fs.cp(src, dest) | Direct replacement |
| dbutils.fs.mv(src, dest) | notebookutils.fs.mv(src, dest) | Direct replacement |
| dbutils.fs.rm(path, recurse) | notebookutils.fs.rm(path, recurse) | Direct replacement |
| dbutils.fs.mkdirs(path) | notebookutils.fs.mkdirs(path) | Direct replacement |
| dbutils.fs.put(path, contents) | notebookutils.fs.put(path, contents) | Direct replacement |
| dbutils.fs.head(path, maxBytes) | notebookutils.fs.head(path, maxBytes) | Direct replacement |
| dbutils.fs.mount(...) | Not available — use OneLake Shortcuts | Fabric uses token-based access; no FUSE mounts |
| dbutils.secrets.get(scope, key) | notebookutils.credentials.getSecret(keyVaultUrl, secretName) | Scope → Key Vault URL; key → secret name |
| dbutils.notebook.run(path, timeout, args) | notebookutils.notebook.run(name, timeout, args) | path → notebook name (relative to workspace) |
| dbutils.notebook.exit(value) | notebookutils.notebook.exit(value) | Direct replacement |
| dbutils.widgets.get(name) | See § Widgets Migration | No direct equivalent |
| dbutils.library.install(...) | Not available — use Fabric Environments | Runtime library install not supported |
| dbutils.data.summarize(df) | display(df.summary()) | Use display() or pandas describe() |
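A hedged before/after for the notebook-chaining and summary calls from the table above; the notebook names, arguments, and table name are hypothetical, and `spark`/`display` are the usual notebook-session globals:

```python
# Databricks
result = dbutils.notebook.run("/Repos/etl/load_orders", 600, {"run_date": "2024-01-01"})

# Fabric (path → notebook name within the workspace; timeout and args unchanged)
result = notebookutils.notebook.run("load_orders", 600, {"run_date": "2024-01-01"})

# A Databricks child notebook returns a value with dbutils.notebook.exit("done");
# in Fabric the child calls notebookutils.notebook.exit("done") — same semantics.

# Databricks: dbutils.data.summarize(df)  →  Fabric: use Spark/pandas summaries
df = spark.read.table("silver.orders")
display(df.summary())
```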
Widgets Migration
dbutils.widgets has no direct equivalent in Fabric. Use these patterns instead:
| Use Case | Fabric Pattern |
|---|---|
| Pass parameter from parent notebook | notebookutils.notebook.run("child", args={"param": "value"}) — read via notebookutils.runtime.context["parameters"] |
| Pipeline-driven parameterization | Mark cell with parameters tag in notebook; pipeline injects values via notebook activity |
| Interactive selection in notebook | Use display() with input cells or Fabric Data Activator |
| Read pipeline-injected value in code | import notebookutils; params = notebookutils.runtime.context.get("parameters", {}) |
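A minimal sketch of the widget-replacement pattern following the table above; the parameter names and notebook name are hypothetical, and the `runtime.context["parameters"]` read mirrors the pattern shown in the table:

```python
# --- Child notebook: cell tagged "parameters" (replaces dbutils.widgets.text/get) ---
# Defaults live here; values injected by a pipeline notebook activity or by
# notebookutils.notebook.run(...) override them at run time.
run_date = "2024-01-01"
source_table = "silver.orders"

# --- Parent notebook: pass the values that used to be widget arguments ---
notebookutils.notebook.run(
    "child_etl", 600, {"run_date": "2024-03-31", "source_table": "silver.orders"}
)

# --- Or read pipeline-injected context, per the table above ---
params = notebookutils.runtime.context.get("parameters", {})
run_date = params.get("run_date", run_date)
```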
Cluster Config → Fabric Spark Pools
| Databricks Cluster Concept | Fabric Spark Equivalent | Notes |
|---|---|---|
| All-purpose cluster (interactive) | Starter Pool | Auto-provisioned; no config; ideal for notebooks |
| Job cluster (single-use for jobs) | Custom Pool (or Starter Pool) attached to SJD | Configure node size, autoscale in Fabric capacity settings |
| Node type (e.g., Standard_DS3_v2) | Fabric node size (Small/Medium/Large/X-Large/XX-Large) | Map by vCore/memory ratio |
| Autoscale min/max workers | Custom Pool min/max node settings | Available in workspace Spark settings |
| spark.conf in cluster settings | Fabric Environment Spark properties | Move to Environment item; attach to workspace or notebook |
| init_scripts (cluster init) | Fabric Environment install script | Not fully equivalent — only library installs are supported |
| Databricks Runtime version | Fabric Runtime (1.1 = Spark 3.3, 1.2 = Spark 3.4, 1.3 = Spark 3.5) | Choose matching Spark version; test deprecated APIs |
| Photon accelerator | Fabric Native Execution Engine (NEE) | Enable in workspace Spark settings; vectorized execution similar to Photon |
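A hedged sketch of relocating cluster-level spark.conf entries: the durable home is an Environment item's Spark properties, but session-level `spark.conf.set` is a reasonable interim while validating a migrated notebook. The property values are examples, and `spark.native.enabled` as the Native Execution Engine toggle is an assumption to verify against current Fabric documentation:

```python
# Databricks cluster spark_conf (from the cluster JSON) — no cluster object exists in Fabric:
#   "spark.sql.shuffle.partitions": "400"
#   "spark.databricks.delta.optimizeWrite.enabled": "true"   # proprietary key; drop or replace

# Fabric interim: set supported properties at session level while testing,
# then promote them to an Environment item's Spark properties for production.
spark.conf.set("spark.sql.shuffle.partitions", "400")

# Native Execution Engine (Photon analogue) — assumed property name; verify before relying on it.
spark.conf.set("spark.native.enabled", "true")
```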
Databricks Jobs → Spark Job Definitions
| Databricks Jobs Concept | Fabric SJD Equivalent | Notes |
|---|---|---|
| Job with single notebook task | SJD referencing a notebook | Attach a default Lakehouse; pass parameters via SJD args |
| Multi-task job (DAG of tasks) | Fabric Data Pipeline orchestrating multiple SJDs/notebooks | Pipeline activities map to job tasks; dependencies = activity dependencies |
| Job schedule (cron) | Pipeline schedule trigger | Cron expression → recurrence trigger in pipeline |
| Job parameters | SJD default arguments or notebook cell parameters | Parameters cell in notebook is injected at runtime |
| Job clusters per task | Pool attached to SJD | Each SJD can specify its Spark pool independently |
| Databricks Workflows | Fabric Data Pipelines | Full DAG orchestration with conditions, loops, and failure branches |
Delegate to spark-authoring-cli for SJD creation and notebook deployment.
Delta Sharing → OneLake Shortcuts
| Databricks Delta Sharing Pattern | Fabric Equivalent |
|---|---|
| Provider publishes a Delta share | Fabric external data sharing (preview) or OneLake Shortcut to ADLS Gen2 where Delta data resides |
| Recipient reads shared data | Create a OneLake Shortcut pointing to the ADLS Gen2 Delta table; access via Lakehouse |
| Cross-workspace table sharing within org | OneLake Shortcuts pointing to another workspace's Lakehouse tables — no data copy |
| Cross-tenant sharing | Fabric external data sharing (GA roadmap) — use ADLS Gen2 shortcut as interim |
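A sketch of the recipient-side shortcut creation through the OneLake Shortcuts REST API, assuming the Delta data already lives in ADLS Gen2 and a Fabric connection to that storage account exists. The payload shape follows the Create Shortcut API but should be verified against current docs; all IDs, URLs, and paths are placeholders:

```python
import requests

FABRIC_API = "https://api.fabric.microsoft.com/v1"
token = "<fabric-api-bearer-token>"      # see COMMON-CORE.md § Authentication
workspace_id = "<workspace-guid>"
lakehouse_id = "<lakehouse-guid>"

# The shortcut appears under Tables/ in the Lakehouse and points at the existing
# Delta table in ADLS Gen2 — no data copy, no re-ingestion.
payload = {
    "path": "Tables",
    "name": "shared_orders",
    "target": {
        "adlsGen2": {
            "location": "https://sharedaccount.dfs.core.windows.net",
            "subpath": "/container/delta/orders",
            "connectionId": "<fabric-connection-guid>",
        }
    },
}

resp = requests.post(
    f"{FABRIC_API}/workspaces/{workspace_id}/items/{lakehouse_id}/shortcuts",
    headers={"Authorization": f"Bearer {token}"},
    json=payload,
)
resp.raise_for_status()
```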
MLflow → Fabric ML Experiments
Fabric ML Experiments are built on the MLflow SDK — most code is directly portable:
| Databricks MLflow Pattern | Fabric Equivalent | Migration Action |
|---|---|---|
| mlflow.set_tracking_uri("databricks") | Remove — Fabric tracking is automatic | Delete this line in Fabric notebooks |
| mlflow.set_experiment("/path/exp") | mlflow.set_experiment("experiment_name") | Use name only (not path); Fabric creates the Experiment item |
| mlflow.log_metric(...) | mlflow.log_metric(...) — identical | No change |
| mlflow.log_artifact(...) | mlflow.log_artifact(...) — identical | No change |
| mlflow.autolog() | mlflow.autolog() — identical | No change |
| mlflow.register_model(...) | mlflow.register_model(...) — identical | Model Registry is available in Fabric ML |
| Databricks Model Serving | Azure ML Online Endpoints or Fabric Data Activator | No direct Fabric model serving yet — use Azure ML |
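A before/after sketch of a typical tracking block; per the table above, only the tracking-URI and experiment-path lines change, and the experiment, metric, and parameter names are hypothetical:

```python
import mlflow

# Databricks-only lines to remove or adjust:
#   mlflow.set_tracking_uri("databricks")         # delete — Fabric wires up tracking automatically
#   mlflow.set_experiment("/Users/me/churn-exp")  # workspace path → plain experiment name

# Fabric
mlflow.set_experiment("churn-exp")   # Fabric creates/attaches the ML Experiment item
mlflow.autolog()

with mlflow.start_run():
    mlflow.log_param("max_depth", 8)
    mlflow.log_metric("auc", 0.91)
```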
Must / Prefer / Avoid
MUST DO
- Replace all `dbutils.*` calls using the mapping in dbutils-to-notebookutils.md — `dbutils` is not available in Fabric notebooks
- Replace `dbutils.fs.mount()` with OneLake Shortcuts — Fabric uses token-based identity access; FUSE mounts are not supported
- Replace `dbutils.secrets.get(scope, key)` with `notebookutils.credentials.getSecret(keyVaultUrl, secretName)` — secret scopes map to Azure Key Vault URLs
- Redesign widget-based parameter passing using notebook parameter cells (tagged `parameters`) or `notebookutils.runtime.context`
- Replace `dbutils.library.install()` with Fabric Environments — runtime library installs are not supported in production workloads
- Adapt Unity Catalog 3-level namespaces (`catalog.schema.table`) to Fabric 2-level (`schema.table` within a Lakehouse) — see catalog-migration.md
- Map Databricks cluster init scripts to Fabric Environments — cluster-level library installs must move to Environment items
PREFER
- Fabric Native Execution Engine (NEE) as the Photon equivalent — enable in workspace Spark settings for vectorized execution on Delta Lake
- OneLake Shortcuts over data copy for Delta tables that already exist in ADLS Gen2 — point directly without re-ingesting
- Fabric Git Integration as the replacement for Databricks Repos — connect workspace to ADO or GitHub for notebook version control
- Fabric ML Experiments for direct MLflow continuity — tracking code requires minimal changes (remove `set_tracking_uri`)
- Medallion architecture when restructuring migrated Databricks catalogs — align `bronze`, `silver`, `gold` Unity Catalog schemas to separate Fabric Lakehouses
- Starter Pool for migrating interactive notebook workflows — eliminates the cluster startup time that was a common pain point with Databricks job clusters
AVOID
- Do not import `dbutils` or attempt `dbutils = ...` assignments in Fabric notebooks — this will raise `NameError`; always use `notebookutils`
- Do not assume Unity Catalog governance policies transfer automatically — RBAC, row-level security, and column masking must be reconfigured in Fabric using workspace roles and Lakehouse permissions
- Do not use `%pip install` in production Fabric notebooks at runtime — use Fabric Environments for stable, versioned library management
- Do not attempt to port Delta Live Tables (DLT) pipelines verbatim — DLT has no Fabric equivalent; rewrite as parameterized notebooks orchestrated by Fabric Pipelines
- Do not rely on Databricks-specific Spark configurations (e.g., `spark.databricks.*`) — these are proprietary and will be silently ignored or raise errors in Fabric
- Do not use DBFS paths (`dbfs:/...`) — there is no DBFS in Fabric; all paths must use OneLake `abfss://` URIs or Lakehouse-relative paths
Examples
See dbutils-to-notebookutils.md and code-patterns.md for the full mapping. Key quick references:
dbutils.fs → notebookutils.fs
```python
# Databricks
dbutils.fs.ls("/mnt/bronze/orders/")
dbutils.fs.cp("/mnt/raw/file.csv", "/mnt/archive/file.csv")

# Fabric (replace DBFS/mount paths with OneLake relative paths)
notebookutils.fs.ls("Files/bronze/orders/")
notebookutils.fs.cp("Files/raw/file.csv", "Files/archive/file.csv")
```
dbutils.secrets → notebookutils.credentials
```python
# Databricks
pwd = dbutils.secrets.get(scope="prod", key="db-password")

# Fabric (scope → Key Vault URL, key → secret name)
pwd = notebookutils.credentials.getSecret("https://myvault.vault.azure.net/", "db-password")
```
Unity Catalog namespace → Lakehouse schema
```python
# Databricks
df = spark.read.table("prod.silver.customers")

# Fabric (catalog dropped; the Lakehouse context provides it)
df = spark.read.table("silver.customers")
```