dv-data

作者: microsoft

通過 Python SDK 進行記錄層級的 CRUD 與批次操作 — 建立、更新、刪除、更新插入、CSV 匯入、多表外鍵載入、AI 生成的範例資料。…

npx skills add https://github.com/microsoft/dataverse-skills --skill dv-data

Skill: Data — Create, Update, Delete, and Bulk Import

This skill uses Python exclusively. Do not use Node.js, JavaScript, or any other language for Dataverse scripting. If you are about to run npm install or write a .js file, STOP — you are going off-rails. See the overview skill's Hard Rules.

Use the official Microsoft Power Platform Dataverse Client Python SDK for all data write operations.

Official SDK: https://github.com/microsoft/PowerPlatform-DataverseClient-Python PyPI package: PowerPlatform-Dataverse-Client (this is the only official one — do not use dataverse-api or other unofficial packages) Status: Preview — breaking changes are possible

Skill boundaries

NeedUse instead
Query or read recordsdv-query
Create tables, columns, relationships, forms, viewsdv-metadata
Export or deploy solutionsdv-solution

Before Writing ANY Script — Check MCP First

If MCP tools are available (create_record, update_record) and the task is ≤10 records, use MCP directly — no script needed. Only write a Python script when the task requires: bulk operations (10+ records), data transformation, retry logic, CSV import, or operations the SDK supports that MCP cannot (upsert, file uploads). Sequential MCP tool calls are not "multi-step logic" — use MCP for those.

SDK-First Rule

If an operation is in the "supports" list below, you MUST use the SDK — not urllib, requests, or raw HTTP.

Correct import (always preceded by sys.path.insert in a full script — see Setup below):

from auth import get_client

WRONG for SDK-supported operations:

from auth import get_token, load_env  # WRONG for SDK-supported ops
import requests                        # WRONG for SDK-supported ops

get_token() and requests exist ONLY for operations the SDK does not support (forms, views, $apply, N:N $expand, unbound actions) — see dv-query and dv-metadata.


What This SDK Supports (Data Operations)

  • Record writes: create, update, delete
  • Record reads within write workflows (e.g., lookup resolution) — for standalone queries see dv-query
  • Upsert (with alternate key support)
  • Bulk operations: CreateMultiple, UpdateMultiple, UpsertMultiple
  • File column uploads (chunked for files >128MB)
  • Context manager with HTTP connection pooling

What This SDK Does NOT Support

Use raw Web API (get_token()) for:

  • Forms (FormXml) — see dv-metadata
  • Views (SavedQueries) — see dv-metadata
  • Global option sets — see dv-metadata
  • N:N record association ($ref POST) — use raw Web API (POST /api/data/v9.2/<entity>(<id>)/<nav-property>/$ref)
  • N:N $expand — see dv-query
  • $apply aggregation — see dv-query
  • Unbound actions (e.g., InstallSampleData)
  • DeleteMultiple, general OData batching

Setup

import os, sys
sys.path.insert(0, os.path.join(os.getcwd(), "scripts"))
from auth import get_client

# get_client sets a plugin attribution context on the User-Agent header.
# Do not modify the context value — it is a closed schema for server-side
# telemetry (app/skill/agent). Never include secrets or PII.
client = get_client("dv-data")

get_client(skill) handles auth, environment URL, and plugin attribution (User-Agent tagging). See scripts/auth.py.

For scripts that run to completion: wrap in with DataverseClient(...) as client: for automatic connection cleanup (recommended since b6). For notebooks and interactive sessions, the explicit client above is simpler.


Field Name Casing Rule

Getting this wrong causes 400 errors.

Property typeConventionExampleWhen used
Structural (columns)LogicalName — always lowercasenew_name, new_priorityRecord payload keys
Navigation (lookups)Navigation Property Name — case-sensitive, matches $metadatanew_AccountId@odata.bind keys

The SDK lowercases structural keys automatically but preserves @odata.bind key casing.


Create a Record

guid = client.records.create("new_ticket", {
    "new_name": "Ticket 001",
    "new_priority": 100000002,          # choice column — integer value, not string
    "[email protected]": "/accounts(<account-guid>)",
})
print(f"Created: {guid}")

@odata.bind notes:

  • Key is the Navigation Property Name: [email protected] (the SDK preserves casing automatically as of b6, but matching the schema name is still the correct form)
  • Value is "/<EntitySetName>(<guid>)" — e.g., "/accounts(<guid>)"
  • If you just created the lookup column, wait 5–10 seconds before inserting. Metadata propagation delays cause "Invalid property" errors.
  • Choice columns use integer values, not strings: "new_priority": 100000002 (not "High")

Common @odata.bind patterns

LookupCorrect keyWrong
Custom: new_AccountId[email protected][email protected]
System polymorphic: customerid[email protected][email protected]
System: parentcustomerid[email protected][email protected]

Find the Navigation Property Name

After creating a lookup via SDK: result.lookup_schema_name is the navigation property name.

For existing system tables, query:

GET /api/data/v9.2/EntityDefinitions(LogicalName='<entity>')/ManyToOneRelationships
  ?$select=ReferencingEntityNavigationPropertyName,ReferencedEntity

Update a Record

client.records.update("new_ticket", "<record-guid>",
    {"new_status": 100000001})

Delete a Record

client.records.delete("new_ticket", "<record-guid>")

Bulk Create (SDK uses CreateMultiple internally)

records = [{"new_name": f"Ticket {i}", "new_priority": 100000000} for i in range(500)]
guids = client.records.create("new_ticket", records)
print(f"Created {len(guids)} records")

Volume guidance: MCP create_record for 1-10 records. SDK for 10+ records.

Important: The SDK sends all records in a single POST to CreateMultiple. It does not chunk automatically. Dataverse has no fixed record count limit — the constraints are payload size and request timeout (SDK default: 120s for POST). For larger datasets, you must chunk in your script. The bulk_upsert and bulk_create helpers below use adaptive chunking: start at 1,000, double on success (up to 4,000), halve on payload/timeout failure, and cap at the last successful size. Tables with few columns can handle larger chunks than tables with many columns.


Bulk Update

# Broadcast same change to multiple records
client.records.update("new_ticket",
    [id1, id2, id3],
    {"new_status": 100000001})

DataFrame Write-Back

To create or update records from a pandas DataFrame, use the client.dataframe namespace. This is documented in dv-query (alongside client.dataframe.get()) but is a write operation — include it in your data write workflow:

# Update records — DataFrame must include the primary key column
client.dataframe.update("opportunity", df_updates, id_column="opportunityid")

# Create records — returns a Series of new GUIDs
guids = client.dataframe.create("opportunity", df_new_records)

See dv-query for the full client.dataframe reference including client.dataframe.get().


Upsert (Alternate Keys)

Idempotent — re-running the same import does not create duplicates. The alternate key must be defined on the table first — see dv-metadata.

Do NOT include alternate key columns in the record body. The alternate key identifies the record; the record body contains the data to set. If the same column appears in both, UpsertMultiple fails with "An unexpected error occurred" (single upsert tolerates it, bulk does not).

from PowerPlatform.Dataverse.models.upsert import UpsertItem

client.records.upsert("account", [
    UpsertItem(
        alternate_key={"accountnumber": "ACC-001"},
        record={"name": "Contoso Ltd", "description": "Primary account"},
    ),
    UpsertItem(
        alternate_key={"accountnumber": "ACC-002"},
        record={"name": "Fabrikam Inc"},
    ),
])

Bulk Import from CSV

For imports that may be re-run (most real-world cases), use UpsertItem with alternate keys instead of create() — see references/multi-table-fk-import.md. The create() pattern here is for one-shot loads only.

VolumeToolWhy
1–10 recordsMCP create_recordSimple, no script
10+ recordsSDK client.records.create(table, list)Uses CreateMultiple; chunk large datasets (start at 1K, adapt)
import csv, os, sys
sys.path.insert(0, os.path.join(os.getcwd(), "scripts"))
from auth import get_client

# get_client sets a plugin attribution context on the User-Agent header.
# Do not modify the context value — it is a closed schema for server-side
# telemetry (app/skill/agent). Never include secrets or PII.
client = get_client("dv-data")

with open("data/customers.csv", newline="", encoding="utf-8") as f:
    rows = list(csv.DictReader(f))

records = [{"new_name": row["name"], "new_email": row["email"]} for row in rows]

# SDK sends all in one POST — chunk to avoid payload/timeout limits
# Start at 1000; for narrow tables (few columns) you can go higher
chunk_size = 1000
for i in range(0, len(records), chunk_size):
    guids = client.records.create("new_customer", records[i:i + chunk_size])
    print(f"Imported {i + len(guids)}/{len(records)} customers", flush=True)

Lookup resolution during import

If the CSV has a human-readable key (e.g., customer_email) but Dataverse needs a GUID, pre-resolve with a lookup dict:

# Build email -> GUID map first
email_to_guid = {}
for page in client.records.get("new_customer", select=["new_customerid", "new_email"]):
    for r in page:
        email_to_guid[r["new_email"]] = r["new_customerid"]

# Use it during import
records = []
for row in rows:
    customer_guid = email_to_guid.get(row["customer_email"])
    if not customer_guid:
        print(f"Skipping row — unknown email: {row['customer_email']}")
        continue
    records.append({
        "new_channel": row["channel"],
        "[email protected]": f"/new_customers({customer_guid})",  # verify entity set name via EntityDefinitions
    })

guids = client.records.create("new_interaction", records)

Required field discovery for system tables

Before bulk-creating in a system table (account, contact, opportunity):

  1. Create a single test record with your intended minimal payload
  2. If HttpError 400 is raised, the error message names the missing required field
  3. Some required fields are plugin-enforced and not visible in describe_table
  4. Delete the test record, then proceed with bulk create

Multi-Table Import with FK Dependencies

When importing data across multiple tables with foreign key relationships, the import must run in dependency order with UpsertItem + alternate keys (idempotent, safe for re-runs).

Quick reference:

  1. Create tables with source ID columns + alternate keys + lookup relationships (see dv-metadata).
  2. Import Level 0 (no FK deps) tables in parallel via ThreadPoolExecutor. Sequential chunks within each table (concurrent writes deadlock).
  3. Build source-ID → GUID maps by querying back (upsert doesn't return GUIDs).
  4. Repeat per dependency level — Level 1 needs Level 0's maps for @odata.bind.

For the full pattern — adaptive bulk_upsert helper, composite-key handling, post-import verification, and the first-time bulk_create variant — see references/multi-table-fk-import.md.

Key invariants (apply even without reading the reference):

  • Parallelize across tables at the same level, sequential between levels, sequential chunks within a table.
  • Alternate key columns must NOT also appear in the record bodyUpsertMultiple fails.
  • Catch per-table failures in the executor — one table failing must not kill the others.
  • Start chunk_size=1000; the helper ramps up adaptively.

Error Handling

from PowerPlatform.Dataverse.core.errors import HttpError

try:
    guid = client.records.create("new_ticket", {"new_name": "Test"})
except HttpError as e:
    print(f"Status {e.status_code}: {e.message}")
    if e.details:
        print(f"Details: {e.details}")
    # 400 — bad field name, @odata.bind format, or missing required field
    # 403 — check security roles
    # 404 — table or record not found
    # 429 — rate limited; SDK retries automatically, reduce batch size if persistent

Windows Scripting Notes

  • ASCII only in .py files — curly quotes and em dashes cause SyntaxError on Windows.
  • No python -c for multiline code — write a .py file instead.
  • Generate GUIDs in scripts: str(uuid.uuid4()), not shell backtick substitution.

Sample Data Generation

Generate realistic sample records inline — schema-driven, table-agnostic, PII-safe defaults (@example.com emails, 555-01xx phones).

Quick reference: confirm environment + count + table → query EntityDefinitions(LogicalName='<table>')/Attributes?$filter=AttributeOf eq null for required columns → dispatch by AttributeType (String / Memo / Integer / DateTime / Picklist / etc.) → client.records.create() (use CreateMultiple for count >= 10).

For the schema-driven fake() template, the EntityDefinitions query, and the safety rules, see references/sample-data-generation.md.

Key invariants:

  • Skip Lookup, Uniqueidentifier, State, Status, Owner, Customer fields unless the user explicitly provides values.
  • UserLocalizedLabel may be null — dereference safely.

Confirmation-flow examples

Generate N sample records (destructive — preview the snippet, ask for env):

  • ❌ "Which environment should I target? Please provide the Dataverse URL."
  • ✅ "I'll run the Sample Data Generation snippets with TABLE=\"contact\", COUNT=20. Uses CreateMultiple, .example.com emails, 555-01xx phones, against the active pac auth list environment. Confirm to proceed, or specify a different environment."

Sample data on a custom entity (schema unknown — prose is enough):

  • ❌ "I need more info about the entity. What are the required fields?"
  • ✅ "Custom entity — I'll query EntityDefinitions for cr123_project to discover required columns, then generate 5 records inline mapping each column to a generator by AttributeType and call client.records.create(\"cr123_project\", records). Confirm to proceed, or tell me a different count."

來自 microsoft 的更多技能

oss-growth
microsoft
開源增長駭客角色
official
microsoft-foundry
microsoft
端到端部署、評估與管理 Foundry 代理:Docker 建置、ACR 推送、託管/提示代理建立、容器啟動、批次評估、持續評估、提示最佳化工作流程、agent.yaml、從追蹤資料集整理。用途:將代理部署至 Foundry、託管代理、建立代理、調用代理、評估代理、執行批次評估、持續評估、持續監控、持續評估狀態、最佳化提示、改善提示、提示最佳化器、最佳化代理指令、改善代理...
officialdevelopmentdevops
azure-ai
microsoft
用於 Azure AI:搜尋、語音、OpenAI、文件智慧。協助搜尋、向量/混合搜尋、語音轉文字、文字轉語音、轉錄、OCR。適用情境:AI 搜尋、查詢搜尋、向量搜尋、混合搜尋、語意搜尋、語音轉文字、文字轉語音、轉錄、OCR、將文字轉換為語音。
officialdevelopmentapi
azure-deploy
microsoft
對已準備好的應用程式執行 Azure 部署,這些應用程式需具備現有的 .azure/deployment-plan.md 與基礎架構檔案。當使用者要求建立新應用程式時,請勿使用此技能——應改用 azure-prepare。此技能會執行 azd up、azd deploy、terraform apply 及 az deployment 命令,並內建錯誤復原機制。需具備來自 azure-prepare 的 .azure/deployment-plan.md,以及來自 azure-validate 的驗證狀態。適用時機:「執行 azd up」、「執行 azd deploy」、「執行部署」……
officialdevopsaws
azure-storage
microsoft
Azure Storage Services 包括 Blob 儲存體、檔案共用、佇列儲存體、表格儲存體和 Data Lake。回答關於儲存存取層(熱、冷、凍結、封存)、各層使用時機及層級比較的問題。提供物件儲存、SMB 檔案共用、非同步訊息、NoSQL 鍵值及大數據分析。包含生命週期管理。用於:blob 儲存體、檔案共用、佇列儲存體、表格儲存體、data lake、上傳檔案、下載 blob、儲存帳戶、存取層...
officialdevelopmentdatabase
azure-diagnostics
microsoft
在 Azure 上使用 AppLens、Azure Monitor、資源健康狀態和安全分類來偵錯 Azure 生產問題。適用時機:偵錯生產問題、疑難排解應用程式服務、應用程式服務高 CPU、應用程式服務部署失敗、疑難排解容器應用程式、疑難排解函數、疑難排解 AKS、kubectl 無法連線、kube-system/CoreDNS 失敗、Pod 擱置、CrashLoop、節點未就緒、升級失敗、分析記錄、KQL、深入解析、映像提取失敗、冷啟動問題、健康狀態探查失敗...
officialdevopsdevelopment
azure-prepare
microsoft
準備 Azure 應用程式以進行部署(基礎架構 Bicep/Terraform、azure.yaml、Dockerfile)。用於建立/現代化或建立+部署;不適用於跨雲端遷移(請使用 azure-cloud-migrate)。請勿用於:copilot-sdk 應用程式(請使用 azure-hosted-copilot-sdk)。適用時機:「建立應用程式」、「建置 Web 應用程式」、「建立 API」、「建立無伺服器 HTTP API」、「建立前端」、「建立後端」、「建置服務」、「現代化應用程式」、「更新應用程式」、「新增驗證」、「新增快取」、「託管於 Azure」、「建立並...」
officialdevelopmentdevops
azure-validate
microsoft
部署前驗證 Azure 就緒狀態。對設定、基礎架構(Bicep 或 Terraform)、RBAC 角色指派、受控身分權限及先決條件進行深度檢查,再進行部署。適用時機:驗證我的應用程式、檢查部署就緒狀態、執行預檢檢查、驗證設定、確認是否可部署、驗證 azure.yaml、驗證 Bicep、部署前測試、疑難排解部署錯誤、驗證 Azure Functions、驗證函式應用程式、驗證無伺服器...
officialdevopstesting