synapse-migration (by Microsoft)

npx skills add https://github.com/microsoft/skills-for-fabric --skill synapse-migration

Update Check — ONCE PER SESSION (mandatory)

The first time this skill is used in a session, run the check-updates skill before proceeding:

  • GitHub Copilot CLI / VS Code: invoke the check-updates skill.
  • Claude Code / Cowork / Cursor / Windsurf / Codex: compare local vs remote package.json version (see the sketch after this list).
  • Skip if the check was already performed earlier in this session.
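
A minimal sketch of that manual comparison, assuming the skill is installed under skills/synapse-migration and that the repository keeps package.json at the matching path; both paths are assumptions, not a documented layout:

```python
# Hedged sketch: compare the locally installed skill's package.json version
# against the repository copy. The local path and the raw-GitHub URL are
# assumptions about the install layout, not a documented contract.
import json
import urllib.request

LOCAL = "skills/synapse-migration/package.json"            # assumed install path
REMOTE = ("https://raw.githubusercontent.com/microsoft/skills-for-fabric/"
          "main/skills/synapse-migration/package.json")    # assumed repo path

with open(LOCAL) as f:
    local_version = json.load(f)["version"]
remote_version = json.loads(urllib.request.urlopen(REMOTE).read())["version"]

if local_version != remote_version:
    print(f"Update available: {local_version} -> {remote_version}")
```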

CRITICAL NOTES

  1. To find workspace details (including its ID) from a workspace name: list all workspaces, then filter with JMESPath (see the sketch after this list)
  2. To find item details (including its ID) from a workspace ID, item type, and item name: list all items of that type in that workspace, then filter with JMESPath
  3. mssparkutils and notebookutils share the same API surface in most cases — the namespace is the primary change
  4. Linked Services have no direct REST API equivalent in Fabric — they are replaced by Data Connections (for external sources) and OneLake Shortcuts (for storage mounts)
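
A minimal sketch of the lookup pattern in notes 1–2, assuming a bearer token already acquired per COMMON-CORE.md § Authentication; the jmespath package does the filtering against the public Fabric REST API workspace listing:

```python
# Sketch of critical notes 1-2: list everything, then filter with JMESPath.
# Assumes a valid bearer token; response shape per the public Fabric REST API.
import jmespath
import requests

def workspace_id_by_name(name: str, token: str) -> str:
    resp = requests.get(
        "https://api.fabric.microsoft.com/v1/workspaces",
        headers={"Authorization": f"Bearer {token}"},
    )
    resp.raise_for_status()
    # Filter the full listing by display name, as in critical note 1.
    matches = jmespath.search(f"value[?displayName=='{name}'].id", resp.json())
    if not matches:
        raise LookupError(f"workspace not found: {name}")
    return matches[0]

# Item lookup (note 2) follows the same pattern against
# /v1/workspaces/{workspace_id}/items, filtering on type and displayName.
```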

Synapse Analytics → Microsoft Fabric Migration

Prerequisite Knowledge

Read these companion documents before executing migration tasks:

  • spark-authoring-cli (notebook deployment details)
  • sqldw-authoring-cli (Fabric Warehouse DDL/DML authoring)


Table of Contents

| Topic | Reference |
| --- | --- |
| Migration Workload Map | § Migration Workload Map |
| mssparkutils → notebookutils API Mapping | utility-api-mapping.md |
| Linked Services → Data Connections / Shortcuts | connectivity-migration.md |
| Before/After Code Patterns | code-patterns.md |
| T-SQL Surface Area Gaps | § T-SQL Surface Area Gaps |
| Spark Configuration Differences | § Spark Configuration Differences |
| Must / Prefer / Avoid | § Must / Prefer / Avoid |
| Authentication & Token Acquisition | COMMON-CORE.md § Authentication |
| Lakehouse Management | SPARK-AUTHORING-CORE.md § Lakehouse Management |
| Notebook Management | SPARK-AUTHORING-CORE.md § Notebook Management |
| Fabric Warehouse Authoring | SQLDW-AUTHORING-CORE.md |

Migration Workload Map

Use this table to determine the correct Fabric target for each Synapse component:

| Synapse Component | Fabric Target | Notes |
| --- | --- | --- |
| Spark Pool (notebooks, jobs) | Fabric Spark (Lakehouse / Notebooks / SJD) | Starter Pool replaces on-demand pools for most workloads |
| Dedicated SQL Pool | Fabric Warehouse | T-SQL surface area differences apply; see § T-SQL Surface Area Gaps |
| Serverless SQL Pool | Lakehouse SQL Endpoint | Read-only Delta/Parquet queries; no DDL required |
| Synapse Pipelines | Fabric Data Pipelines | Activity types, triggers, and expressions are broadly compatible |
| Synapse Link for Cosmos DB / SQL | Fabric Mirroring | Native mirroring replaces the Synapse Link connector pattern |
| Linked Services | Data Connections (external) / OneLake Shortcuts (storage) | See connectivity-migration.md |
| Integration Datasets | Fabric Pipeline source/sink config | Dataset definitions are inlined into pipeline activities in Fabric |
| Managed Virtual Networks | Fabric Managed Private Endpoints | Configure in Fabric capacity settings |
| Synapse Studio | Fabric workspace | All artifact types live in a single workspace with Git integration |

Decision Tree: Which Fabric Spark Workload?

Synapse Spark workload
├── Interactive notebook with data exploration → Fabric Notebook (attached to Lakehouse)
├── Scheduled/production job → Spark Job Definition (SJD)
├── T-SQL over files/Delta → Lakehouse SQL Endpoint (no migration needed — just point to OneLake)
└── Real-time ingest → Fabric Eventstream + Lakehouse

T-SQL Surface Area Gaps

Fabric Warehouse supports a broad T-SQL surface, but some Dedicated SQL Pool features differ:

| Synapse Dedicated SQL Pool Feature | Fabric Warehouse Equivalent | Action Required |
| --- | --- | --- |
| CREATE EXTERNAL TABLE (PolyBase) | COPY INTO or Lakehouse SQL Endpoint | Rewrite ingestion; use COPY INTO for bulk load from ADLS/OneLake (see the sketch after this table) |
| DISTRIBUTION = HASH(col) | Not applicable (Fabric auto-distributes) | Remove distribution hints from DDL |
| CLUSTERED COLUMNSTORE INDEX (default) | Delta Lake (Lakehouse) or Fabric Warehouse DCI | Warehouse tables use Delta-backed storage automatically |
| Result set caching | Not available | Remove cache hints; rely on query plan caching |
| Workload management (classifiers) | Not available | Use workspace capacity management |
| sp_rename | Supported | No change needed |
| MERGE statement | Supported | No change needed |
| Temp tables (#temp) | Supported | No change needed |
| Window functions | Supported | No change needed |
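
A hedged sketch of the PolyBase-to-COPY INTO rewrite from the first row, submitted over the Warehouse SQL endpoint with pyodbc; the server name, warehouse, table, and source path are placeholders, not real values:

```python
# Sketch: bulk-load Parquet into a Fabric Warehouse table with COPY INTO,
# replacing a PolyBase CREATE EXTERNAL TABLE ingestion. All names and paths
# are placeholders; the authentication mode depends on your environment.
import pyodbc

conn = pyodbc.connect(
    "Driver={ODBC Driver 18 for SQL Server};"
    "Server=<warehouse-endpoint>.datawarehouse.fabric.microsoft.com;"
    "Database=<warehouse>;"
    "Authentication=ActiveDirectoryInteractive;"
)
conn.execute("""
    COPY INTO dbo.Fact
    FROM 'https://<account>.blob.core.windows.net/<container>/fact/*.parquet'
    WITH (FILE_TYPE = 'PARQUET')
""")
conn.commit()
conn.close()
```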

Delegate to sqldw-authoring-cli for all T-SQL DDL/DML authoring tasks after mapping the workload.


Spark Configuration Differences

| Synapse Spark Concept | Fabric Spark Equivalent | Notes |
| --- | --- | --- |
| Spark Pool definition (node type, autoscale min/max) | Custom Pool or Starter Pool | Starter Pool (auto-provisioned, no config needed) covers most dev workloads; Custom Pools for production SLAs |
| %%configure magic cell (session-level config) | %%configure magic, identical syntax | Supported in Fabric notebooks (example after this table) |
| spark.conf.set(...) | spark.conf.set(...), identical | No change needed |
| Environment-scoped libraries (pool packages) | Fabric Environment attached to workspace/notebook | Replace pool-level library installs with a Fabric Environment item |
| Synapse-specific Spark versions | Fabric Runtime versions (1.1 = Spark 3.3, 1.2 = Spark 3.4, 1.3 = Spark 3.5) | Align runtime version; test deprecated API calls |
| spark.read.synapsesql(...) connector | Not available; use notebookutils + Lakehouse shortcuts or Warehouse JDBC | Replace with OneLake reads or SQL endpoint queries |
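
For reference, a session-level %%configure cell as it would carry over to a Fabric notebook; the resource values below are illustrative only, not recommended settings:

```python
%%configure -f
{
    "driverMemory": "28g",
    "driverCores": 4,
    "executorMemory": "28g",
    "executorCores": 4,
    "numExecutors": 2
}
```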

Must / Prefer / Avoid

MUST DO

  • Replace all mssparkutils imports with notebookutils — see utility-api-mapping.md for the complete namespace table
  • Replace all Linked Services with Fabric Data Connections (for external databases/services) or OneLake Shortcuts (for ADLS Gen2 / Blob storage mounts) — see connectivity-migration.md
  • Replace spark.read.synapsesql() with Lakehouse shortcut reads or JDBC connections to the Fabric Warehouse SQL endpoint (see the sketch after this list)
  • Re-test all notebooks after migration against the target Fabric Runtime version — Spark minor version differences can surface deprecated API warnings
  • Externalize all workspace/item IDs — never hardcode; use pipeline parameters or Variable Libraries
  • Replace pool-level library installs with Fabric Environments attached at the workspace or notebook level
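
A hedged sketch of the synapsesql replacement, assuming the Warehouse data has been surfaced as a Lakehouse shortcut; the lakehouse and table names are illustrative:

```python
# Before (Synapse): read through the dedicated-pool connector
#   df = spark.read.synapsesql("dedicated_pool.dbo.Fact")

# After (Fabric): the shortcut exposes the same data as a Lakehouse table.
# "my_lakehouse" and "fact" are placeholders; spark is provided by the
# Fabric notebook runtime.
df = spark.read.table("my_lakehouse.fact")
df.show(5)
```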

PREFER

  • OneLake Shortcuts over full data copies — mount existing ADLS Gen2 containers as shortcuts rather than re-ingesting data during migration
  • Fabric Starter Pool for dev/test migrations — eliminates pool warm-up wait time inherent in Synapse on-demand pools
  • Lakehouse SQL Endpoint as a drop-in for Serverless SQL Pool reads — point existing consumers at the endpoint with minimal query changes
  • Medallion architecture for migrated data — align with Bronze/Silver/Gold patterns (see e2e-medallion-architecture skill)
  • Incremental migration — migrate and validate workload by workload rather than performing a big-bang cutover
  • Parameterized notebooks to allow environment promotion (dev → test → prod) without code changes (see the sketch after this list)
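
A minimal sketch of the parameterized pattern, using notebookutils.notebook.run to pass environment-specific values at invocation time; the notebook name and parameter keys are assumptions:

```python
# Caller notebook: promote between environments by changing parameters,
# not code. Notebook name, timeout, and parameter keys are illustrative;
# notebookutils is provided by the Fabric notebook runtime.
result = notebookutils.notebook.run(
    "ingest_sales",                           # target notebook (hypothetical)
    600,                                      # timeout in seconds
    {"workspace_id": "<id>", "env": "dev"},   # externalized values
)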

AVOID

  • Do not copy-paste PolyBase CREATE EXTERNAL TABLE DDL into Fabric Warehouse — rewrite as COPY INTO or use Lakehouse for external data access
  • Do not assume Synapse Linked Service connection strings are reusable — credentials and endpoints must be reconfigured as Fabric Data Connections
  • Do not install libraries in notebook cells (%pip install at runtime) for production workloads — use Fabric Environments for reproducible, versioned library management
  • Do not migrate Dedicated SQL Pool distribution hints (HASH, ROUND_ROBIN, REPLICATE) verbatim — remove them; Fabric Warehouse handles distribution automatically
  • Do not use wasb:// or legacy abfss://<container>@<account>.dfs.core.windows.net/ paths as primary data paths; migrate data access to OneLake abfss://<workspace>@onelake.dfs.fabric.microsoft.com/ paths (see the path example after this list)
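
For the path migration in the last item, a short example of the OneLake pattern; the workspace, lakehouse, and table names are placeholders:

```python
# OneLake path pattern: the workspace takes the container position, and the
# lakehouse item plus its Tables/ folder form the path. Names below are
# placeholders; spark is provided by the Fabric notebook runtime.
path = (
    "abfss://MyWorkspace@onelake.dfs.fabric.microsoft.com/"
    "MyLakehouse.Lakehouse/Tables/fact"
)
df = spark.read.format("delta").load(path)
```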

Examples

See code-patterns.md for full before/after examples. Key quick references:

mssparkutils.env → notebookutils.runtime

```python
# Synapse
workspace = mssparkutils.env.getWorkspaceName()

# Fabric
workspace = notebookutils.runtime.context["workspaceName"]
```

Linked Service credential → Key Vault secret

```python
# Synapse
conn = mssparkutils.credentials.getConnectionStringOrCreds("MyLinkedService")

# Fabric
conn = notebookutils.credentials.getSecret("https://myvault.vault.azure.net/", "my-secret")
```

Dedicated SQL Pool DDL → Fabric Warehouse DDL

```sql
-- Synapse (remove distribution hints)
CREATE TABLE dbo.Fact (...) WITH (DISTRIBUTION = HASH(id), CLUSTERED COLUMNSTORE INDEX);

-- Fabric Warehouse
CREATE TABLE dbo.Fact (...);
```
