synapse-migration
Vérification des mises à jour — UNE FOIS PAR SESSION (obligatoire) La première fois que cette compétence est utilisée dans une session, exécutez la compétence de vérification des mises à jour avant de continuer.
npx skills add https://github.com/microsoft/skills-for-fabric --skill synapse-migrationUpdate Check — ONCE PER SESSION (mandatory) The first time this skill is used in a session, run the check-updates skill before proceeding.
- GitHub Copilot CLI / VS Code: invoke the
check-updatesskill.- Claude Code / Cowork / Cursor / Windsurf / Codex: compare local vs remote package.json version.
- Skip if the check was already performed earlier in this session.
CRITICAL NOTES
- To find workspace details (including its ID) from a workspace name: list all workspaces, then use JMESPath filtering
- To find item details (including its ID) from workspace ID, item type, and item name: list all items of that type in that workspace, then use JMESPath filtering
mssparkutilsandnotebookutilsshare the same API surface in most cases — the namespace is the primary change- Linked Services have no direct REST API equivalent in Fabric — they are replaced by Data Connections (for external sources) and OneLake Shortcuts (for storage mounts)
Synapse Analytics → Microsoft Fabric Migration
Prerequisite Knowledge
These companion documents provide general Fabric REST patterns. Do NOT read them upfront — reference only when a specific phase requires a pattern not already covered in this skill's resource files:
- COMMON-CORE.md — General Fabric REST API patterns, authentication & token audiences, item discovery via JMESPath
- COMMON-CLI.md —
az rest/az loginCLI patterns, authentication recipes - SPARK-AUTHORING-CORE.md — Notebook/lakehouse creation (already covered in spark-item-migration.md and lake-database-migration.md)
- SQLDW-AUTHORING-CORE.md — Fabric Warehouse T-SQL (delegate to
sqldw-authoring-cliskill)
Auth, API endpoints, and item payloads are fully documented in this skill's own files. The common docs above are fallback references only.
Table of Contents
| Topic | Reference |
|---|---|
| Migration Orchestrator | migration-orchestrator.md |
| API-Driven Migration Workflow | § API-Driven Migration Workflow |
| Migration Workload Map | § Migration Workload Map |
| Spark Pool → Environment Migration | spark-pool-migration.md |
| Lake Database → Lakehouse Migration | lake-database-migration.md |
| External Hive Metastore → Lakehouse Migration | external-hms-migration.md |
| Notebook & SJD Migration | spark-item-migration.md |
| Library Compatibility (Synapse vs. Fabric RT 1.3) | library-compatibility.md |
| Connector Refactoring (Kusto, Cosmos DB, ADLS OAuth) | connector-refactoring.md |
mssparkutils → notebookutils API Mapping | utility-api-mapping.md |
| Linked Services → Data Connections / Shortcuts | connectivity-migration.md |
| Before/After Code Patterns (incl. Catalog API gaps) | code-patterns.md |
| Migration Report (with Fabric portal links) | migration-report.md |
| Migration Troubleshooting Guide | migration-gotchas.md |
| Validation & Testing | validation-testing.md |
| Security & Governance (Production Readiness) | security-governance.md |
| T-SQL & Spark Configuration Differences | § T-SQL & Spark Configuration Differences |
| Capacity Sizing Reference | § Capacity Sizing Reference |
| Must / Prefer / Avoid | § Must / Prefer / Avoid |
| Feature Parity Reference | § Feature Parity Reference |
| Migration Gotchas — Quick Reference | § Migration Gotchas + migration-gotchas.md |
| Post-Migration: What's Next | § Post-Migration: What's Next |
Context Loading Guide
IMPORTANT — Load only what you need. Do NOT read all resource files upfront. Load the specific file for the phase you are executing:
| When | Read This File | Lines |
|---|---|---|
| User asks to migrate a workspace (full orchestration) | migration-orchestrator.md | ~1264 |
| Phase 0: Spark Pools → Environments | spark-pool-migration.md | ~290 |
| Phase 1: Databases → Lakehouses (built-in HMS) | lake-database-migration.md | ~574 |
| Phase 1: Databases → Lakehouses (external HMS) | external-hms-migration.md | ~388 |
| Phase 2–3: Notebooks & SJDs | spark-item-migration.md | ~326 |
| Code refactoring (mssparkutils, connectors) | utility-api-mapping.md + connector-refactoring.md + code-patterns.md | ~588 |
| Post-migration validation | validation-testing.md | ~487 |
| Troubleshooting failures | migration-gotchas.md | ~225 |
| Production security setup | security-governance.md | ~926 |
| Library version gaps | library-compatibility.md | ~106 |
| Generating migration report | migration-report.md | ~360 |
| Capacity sizing & SKU planning | capacity-sizing.md | ~85 |
| Feature parity matrix | feature-parity.md | ~65 |
API-Driven Migration Workflow
This skill supports programmatic migration of Synapse Spark items via REST APIs (no UI-based Migration Assistant required).
Authentication
| Target | Token Audience |
|---|---|
| Synapse ARM (management plane) | https://management.azure.com |
| Synapse Data Plane | https://dev.azuresynapse.net |
| Fabric REST API | https://api.fabric.microsoft.com |
Use the token-acquisition recipe in COMMON-CLI § Authentication Recipes with the audiences above.
Migration Phases (Execute in Order)
| Phase | Synapse Source | Fabric Target | Resource |
|---|---|---|---|
| Phase 0 | Spark Pool | Environment | spark-pool-migration.md |
| Phase 1 | Lake Database (built-in HMS) | Lakehouse | lake-database-migration.md |
| Phase 1 | External Hive Metastore | Lakehouse | external-hms-migration.md |
| Phase 1b | Ad-hoc abfss:// storage paths | OneLake Shortcuts | migration-orchestrator.md (migrate-and-modernize only) |
| Phase 2 | Notebooks | Notebook | spark-item-migration.md |
| Phase 3 | Spark Job Definitions | SJD | spark-item-migration.md |
| Final | Validation & Testing | — | validation-testing.md |
| Optional | Security & Governance | — | security-governance.md |
Phase order matters: Environments (Phase 0) must exist before notebooks/SJDs can bind to them. Lakehouses (Phase 1) must exist before notebooks can bind to them (Phase 2).
For the full execution flow with sub-steps, decision points, lift-and-shift vs. modernize paths, and error recovery, see migration-orchestrator.md.
REST API Quick Reference
All Synapse and Fabric API endpoints with request/response examples are in migration-orchestrator.md (Steps 2a–2e). Authentication tokens:
| Target | Token Audience |
|---|---|
| Synapse ARM | https://management.azure.com |
| Synapse Data Plane | https://dev.azuresynapse.net |
| Fabric REST API | https://api.fabric.microsoft.com |
API docs: Synapse ARM · Synapse Data Plane · Fabric Items · Fabric Shortcuts · Fabric Connections · Fabric Environments
Migration Workload Map
Use this table to determine the correct Fabric target for each Synapse component:
| Synapse Component | Fabric Target | Notes |
|---|---|---|
| Spark Pool (notebooks, jobs) | Fabric Spark (Lakehouse / Notebooks / SJD) | Starter Pool replaces on-demand pools for most workloads |
| Dedicated SQL Pool | Fabric Warehouse | T-SQL surface area differences apply — see § T-SQL & Spark Configuration Differences. Procedural migration guide not yet available — separate migration track. For T-SQL authoring, delegate to sqldw-authoring-cli. |
| Serverless SQL Pool | Lakehouse SQL Endpoint | Read-only Delta/Parquet queries; no DDL required |
| Synapse Pipelines | Fabric Data Pipelines | Activity types, triggers, and expressions are broadly compatible. Pipeline migration resource not yet available — separate migration track. |
| Synapse Link for Cosmos DB / SQL | Fabric Mirroring | Native mirroring replaces the Synapse Link connector pattern. Not covered by this skill. |
| Linked Services | Data Connections (external) / OneLake Shortcuts (storage) | See connectivity-migration.md |
| Integration Datasets | Fabric Pipeline source/sink config | Dataset definitions are inlined into pipeline activities in Fabric. Not covered by this skill. |
| Managed Virtual Networks | Fabric Managed Private Endpoints | Configure in Fabric capacity settings |
| Synapse Studio | Fabric workspace | All artifact types live in a single workspace with Git integration |
Decision Tree: Which Fabric Spark Workload?
Synapse Spark workload
├── Interactive notebook with data exploration → Fabric Notebook (attached to Lakehouse)
├── Scheduled/production job → Spark Job Definition (SJD)
├── T-SQL over files/Delta → Lakehouse SQL Endpoint (no migration needed — just point to OneLake)
└── Real-time ingest → Fabric Eventstream + Lakehouse
T-SQL & Spark Configuration Differences
For detailed T-SQL surface area gaps (PolyBase → COPY INTO, distribution hints, result set caching) and Spark configuration mappings (pools, %%configure, runtime versions), see feature-parity.md.
Key actions: Remove
DISTRIBUTION = HASH(col)hints, replaceCREATE EXTERNAL TABLEwithCOPY INTO, replacespark.read.synapsesql()with OneLake shortcuts or JDBC. Delegate T-SQL authoring tosqldw-authoring-cli.
Capacity Sizing Reference
For Synapse pool → Fabric SKU mapping tables, sizing decision guide, and cost model comparison, see capacity-sizing.md.
Quick guide: Dev/test = F8–F16 with Starter Pool; standard production = F32–F64; enterprise = F128+. Use Fabric Trial (free F64, 60 days) for migration validation.
Must / Prefer / Avoid
MUST DO
- Replace all
mssparkutilsimports withnotebookutils— see utility-api-mapping.md for the complete namespace table - Replace all Linked Services with Fabric Data Connections (for external databases/services) or OneLake Shortcuts (for ADLS Gen2 / Blob storage mounts) — see connectivity-migration.md
- Replace
spark.read.synapsesql()with Lakehouse shortcut reads or JDBC connections to the Fabric Warehouse SQL endpoint - Re-test all notebooks after migration against the target Fabric Runtime version — Spark minor version differences can surface deprecated API warnings
- Externalize all workspace/item IDs — never hardcode; use pipeline parameters or Variable Libraries
- Replace pool-level library installs with Fabric Environments attached at the workspace or notebook level
PREFER
- OneLake Shortcuts over full data copies — mount existing ADLS Gen2 containers as shortcuts rather than re-ingesting data during migration
- Fabric Starter Pool for dev/test migrations — eliminates pool warm-up wait time inherent in Synapse on-demand pools
- Lakehouse SQL Endpoint as a drop-in for Serverless SQL Pool reads — point existing consumers at the endpoint with minimal query changes
- Medallion architecture for migrated data — align with Bronze/Silver/Gold patterns (see
e2e-medallion-architectureskill) - Incremental migration — migrate and validate workload by workload rather than performing a big-bang cutover
- Parameterized notebooks to allow environment promotion (dev → test → prod) without code changes
AVOID
- Do not copy-paste PolyBase
CREATE EXTERNAL TABLEDDL into Fabric Warehouse — rewrite asCOPY INTOor use Lakehouse for external data access - Do not assume Synapse Linked Service connection strings are reusable — credentials and endpoints must be reconfigured as Fabric Data Connections
- Do not install libraries in notebook cells (
%pip installat runtime) for production workloads — use Fabric Environments for reproducible, versioned library management - Do not migrate Dedicated SQL Pool distribution hints (
HASH,ROUND_ROBIN,REPLICATE) verbatim — remove them; Fabric Warehouse handles distribution automatically - Do not use
wasb://orabfss://[email protected]/paths as primary data paths — migrate data access to OneLakeabfss://[email protected]/paths
Examples
See code-patterns.md for full before/after examples. Key quick references:
mssparkutils.env → notebookutils.runtime
# Synapse
workspace = mssparkutils.env.getWorkspaceName()
# Fabric
workspace = notebookutils.runtime.context["workspaceName"]
Linked Service credential → Key Vault secret
# Synapse
conn = mssparkutils.credentials.getConnectionStringOrCreds("MyLinkedService")
# Fabric
conn = notebookutils.credentials.getSecret("https://myvault.vault.azure.net/", "my-secret")
Dedicated SQL Pool DDL → Fabric Warehouse DDL
-- Synapse (remove distribution hints)
CREATE TABLE dbo.Fact (...) WITH (DISTRIBUTION = HASH(id), CLUSTERED COLUMNSTORE INDEX);
-- Fabric Warehouse
CREATE TABLE dbo.Fact (...);
Feature Parity Reference
Full Synapse → Fabric feature matrix (28 features), T-SQL surface area gaps, and Spark configuration differences are in feature-parity.md.
Key gaps (⚠️/❌):
spark.read.synapsesql()replaced by JDBC/shortcuts · Linked Services redesigned as Data Connections/Shortcuts · External HMS partial (migrate as shortcuts) ·mssparkutils.envrenamed tonotebookutils.runtime· Result set caching ❌ · Workload management ❌ · PolyBase →COPY INTO
Migration Gotchas — Quick Reference
The full troubleshooting guide with code examples and multi-option resolutions is in migration-gotchas.md. This summary surfaces the key issues for quick scanning during migration:
| # | Flag ID | Issue | Severity | Blocks? | Resolution Summary |
|---|---|---|---|---|---|
| G1 | SYNAPSESQL_NO_EQUIVALENT | spark.read.synapsesql() has no Fabric equivalent | High | Yes | Replace with OneLake shortcut read, Warehouse JDBC, or Data Pipeline |
| G2 | LIBRARY_VERSION_CONFLICT | Custom library version conflicts with Fabric Runtime | Medium | Maybe | Pin compatible version in Environment, or find Fabric-native alternative |
| G3 | DELTA_PROTOCOL_MISMATCH | Delta protocol version incompatibility | High | Yes | Rewrite table with matching protocol (delta.minReaderVersion/minWriterVersion) |
| G4 | SECURITY_MODEL_INCOMPATIBLE | Synapse managed identity / IP firewall not portable | Medium | Yes | Reconfigure as Workspace Identity + Fabric Managed Private Endpoints |
| G5 | GPU_POOL_UNSUPPORTED | GPU-accelerated Spark pools not available in Fabric | High | Yes | Migration blocker — keep workload in Synapse or use Azure ML |
| G6 | DOTNET_SPARK_UNSUPPORTED | .NET for Spark (C#/F# SJDs) not supported | High | Yes | Migration blocker — rewrite in PySpark or keep in Synapse |
| G7 | NULLABLE_POOL_REFERENCE | bigDataPool/targetBigDataPool field is null (not missing) — causes NoneType crash | Medium | No | Use (x.get("bigDataPool") or {}).get(...) pattern |
| G8 | SESSION_CONFIG_IGNORED | Some %%configure keys silently ignored in Fabric | Low | No | Remove unsupported keys; use Environment for pool-level config |
| G9 | SHORTCUT_CONNECTION_FAILED | ADLS shortcut creation fails (connection/permission) | High | Partial | Verify connection credential type (Key > WorkspaceIdentity > OAuth2) and RBAC |
Post-Migration: What's Next
After completing Phases 0–3 and validation, hand off to these companion skills for ongoing operations:
Agentic Exploration Workflow
Once data has landed in Fabric Lakehouses, use this sequence to validate and explore:
- Discover → List schemas, tables, and row counts via Lakehouse SQL Endpoint (
sqldw-consumption-cli) - Sample →
SELECT TOP 5on migrated tables to verify data integrity - Validate → Run validation checks from validation-testing.md (V1–V6)
- Explore → Write Spark or T-SQL queries against migrated data using
spark-consumption-cliorsqldw-consumption-cli - Build → Create Gold-layer aggregations with
e2e-medallion-architecture(Bronze → Silver → Gold) - Consume → Build semantic models and reports with
semantic-model-authoring
Companion Skill Cross-References
| Post-Migration Task | Skill | When to Use |
|---|---|---|
| Interactive Lakehouse SQL queries | sqldw-consumption-cli | Exploring migrated data via SQL Endpoint |
| Interactive PySpark exploration | spark-consumption-cli | Ad-hoc Spark queries on migrated Lakehouses |
| Notebook & SJD authoring (new) | spark-authoring-cli | Creating new Spark items post-migration |
| Medallion architecture build-out | e2e-medallion-architecture | Structuring Bronze/Silver/Gold after lift-and-shift |
| Warehouse performance monitoring | sqldw-operations-cli | Diagnosing slow queries on Fabric Warehouse |
| Semantic model creation | semantic-model-authoring | Building Power BI models over migrated data |
| Report consumption & DAX | semantic-model-consumption | Querying existing semantic models |
| KQL analytics | eventhouse-authoring-cli / eventhouse-consumption-cli | If migrating real-time workloads to Eventhouse |
Variable Library for Environment Promotion
After migration, avoid hardcoded workspace/item IDs by centralizing configuration in a Variable Library item:
# Read config from Variable Library — works in notebooks
lib = notebookutils.variableLibrary.getLibrary("MigrationConfig")
lakehouse_name = lib.lakehouse_name
workspace_id = lib.workspace_id
# ❌ WRONG — .get() does not exist
# notebookutils.variableLibrary.get("MigrationConfig", "lakehouse_name")
- Use Value Sets (
valueSets/dev.json,valueSets/prod.json) to promote across environments without code changes - Boolean values are returned as strings — compare with
.lower() == "true", notbool() - In Data Pipelines, reference via
@pipeline().libraryVariables.<name>(not@variables()) - Full Variable Library patterns → see common/notebook-authoring/context-and-params.md § Variable Library