dv-query

Author: microsoft

Performs bulk reads, multi-page iteration, and analysis of Dataverse data through the Python SDK and Web API. Use it when the user wants to read, list, filter, or aggregate records.

npx skills add https://github.com/microsoft/dataverse-skills --skill dv-query

Skill: Query — Read and Analyze Dataverse Records

This skill uses Python exclusively. Do not use Node.js, JavaScript, or any other language for Dataverse scripting. See the overview skill's Hard Rules.

SDK-First Rule for Reads

All reads use the SDK — not urllib, requests, or raw HTTP. This is the same rule as dv-data's SDK-First Rule, applied to reads. If you find yourself writing urllib.request or get_token() for a query, STOP — the SDK handles it. The only exceptions are $apply aggregation and N:N $expand, documented below.

How to Answer Data Questions

When the user asks a question about their data, pick the approach by what they're asking, not by which API you know:

| User asks... | Approach | Why |
|---|---|---|
| "show me open tickets" / simple filter | MCP read_query (if available) or client.records.get() with $filter | Small result, no aggregation |
| "how many X" / simple count | MCP read_query or client.records.get() with count=True | Single number |
| Single-table aggregation (most/sum/avg/top-N) | $apply server-side aggregation (raw Web API) | One HTTP call, returns only grouped results |
| Cross-table aggregation | client.dataframe.get() with minimal $select + pd.merge() | Server can't join; pandas merge is fast with minimal columns |
| "show me X with related Y" / resolve lookups | client.records.get() with $expand or QueryBuilder (b8+) | Lookup resolution |
| "export this data" / bulk extract | client.dataframe.get() with select= | Direct to DataFrame → CSV |
| "load into notebook" / interactive analysis | client.dataframe.get() or QueryBuilder .to_dataframe() (b8+) | pandas native |
| "find duplicates" / complex filter | client.records.get() with $filter or QueryBuilder (b8+) | SDK handles pagination |
| Simple filtered read (<5K rows) | client.query.sql() | Lightweight SQL SELECT with WHERE, ORDER BY, TOP |

Key principle: Let the server do the work. For single-table aggregation, use $apply — it runs server-side and returns only grouped results. For cross-table questions, use client.dataframe.get() with minimal $select on each table, then pd.merge() — the merge itself is sub-second; the bottleneck is network transfer, which $select minimizes.
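The cross-table pattern can be sketched with in-memory DataFrames. This is a minimal illustration, assuming pandas is installed; the two frames stand in for what client.dataframe.get() would return with a minimal select=, and the rows and column choices are invented for the example.

```python
import pandas as pd

# Stand-ins for client.dataframe.get("account", select=[...]) and
# client.dataframe.get("opportunity", select=[...]). The rows are made up.
accounts = pd.DataFrame({
    "accountid": ["a1", "a2"],
    "name": ["Contoso", "Fabrikam"],
})
opportunities = pd.DataFrame({
    "_parentaccountid_value": ["a1", "a1", "a2"],
    "estimatedvalue": [100.0, 250.0, 75.0],
})

# Join each opportunity to its parent account, then aggregate per account.
merged = opportunities.merge(
    accounts, left_on="_parentaccountid_value", right_on="accountid", how="left"
)
totals = (
    merged.groupby("name", as_index=False)["estimatedvalue"]
    .sum()
    .sort_values("estimatedvalue", ascending=False)
)
print(totals)
```

Only the join key and the aggregated column are carried on the opportunity side, which is the point of keeping $select minimal.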

Always query the live Dataverse environment. Do not query local copies, cached files, or source databases when the user expects results from Dataverse. The data in Dataverse is the source of truth.


SQL Queries — client.query.sql()

client.query.sql() uses the Dataverse Web API ?sql= parameter — a limited SQL subset (same limitations as MCP read_query). It does NOT support GROUP BY, JOINs, HAVING, DISTINCT, or subqueries. Results are capped at ~5,000 rows.

When to use: Fast filtered reads on tables with <5K rows. For these, it's significantly faster (~2-6s) than page iteration or DataFrames because it's a single HTTP call.

# Fast filtered read on small tables (<5K rows)
results = client.query.sql(
    "SELECT TOP 100 name, estimatedvalue "
    "FROM opportunity "
    "WHERE statecode = 0 "
    "ORDER BY estimatedvalue DESC"
)
for r in results:
    value = r.get("estimatedvalue") or 0  # a null value would break the numeric format spec
    print(f"{r['name']}: ${value:,.0f}")

Do NOT use for: Tables >5K rows (results silently truncated), aggregation (no GROUP BY), or cross-table queries (no JOINs). Use $apply for single-table aggregation and client.dataframe.get() + pd.merge() for cross-table.
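One way to catch unsupported SQL before it reaches the server is a small pre-flight check. The sketch below is a hypothetical helper (find_unsupported_sql is not part of the SDK) and uses simple keyword matching rather than a real SQL parser, so treat it as a heuristic.

```python
import re

# Constructs the section above lists as unsupported by the ?sql= subset.
UNSUPPORTED = ("GROUP BY", "JOIN", "HAVING", "DISTINCT")

def find_unsupported_sql(sql: str) -> list[str]:
    """Return the unsupported constructs that appear in a SQL string."""
    # Drop quoted string literals so a value like 'join us' is not flagged.
    stripped = re.sub(r"'(?:[^']|'')*'", "''", sql)
    found = []
    for keyword in UNSUPPORTED:
        pattern = r"\b" + r"\s+".join(keyword.split()) + r"\b"
        if re.search(pattern, stripped, flags=re.IGNORECASE):
            found.append(keyword)
    # A SELECT that directly follows an opening parenthesis is treated as a subquery.
    if re.search(r"\(\s*SELECT\b", stripped, flags=re.IGNORECASE):
        found.append("subquery")
    return found

print(find_unsupported_sql("SELECT TOP 100 name FROM opportunity WHERE statecode = 0"))
print(find_unsupported_sql("SELECT DISTINCT name FROM account a JOIN contact c ON 1 = 1"))
```

A non-empty result is a signal to switch to $apply or client.dataframe.get() instead of sending the query.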

Skill boundaries

| Need | Use instead |
|---|---|
| Create, update, delete records | dv-data |
| Create tables, columns, relationships | dv-metadata |
| Export or deploy solutions | dv-solution |

Setup

import os, sys
sys.path.insert(0, os.path.join(os.getcwd(), "scripts"))
from auth import get_client

# get_client sets a plugin attribution context on the User-Agent header.
# Do not modify the context value: it is a closed schema for server-side
# telemetry (app/skill/agent). Never include secrets or PII.
client = get_client("dv-query")

get_client(skill) handles auth, environment URL, and plugin attribution (User-Agent tagging). See scripts/auth.py. For scripts that run to completion, wrap the returned client in a with statement for automatic connection cleanup.


Field Name Casing Rule

Getting this wrong causes 400 errors.

| Property type | Convention | Example | When used |
|---|---|---|---|
| Structural (columns) | LogicalName, always lowercase | new_name, new_priority | $select, $filter, $orderby |
| Navigation (lookups) | Navigation Property Name, case-sensitive, matches $metadata | new_AccountId | $expand |
  • System table navigation properties (e.g., parentaccountid, ownerid): lowercase
  • Custom lookup navigation properties: case-sensitive, match $metadata SchemaName (e.g., new_AccountId)
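The structural half of the rule is easy to check before a query is sent. The helper below is a hypothetical sketch (not an SDK function) that flags $select or $orderby column names that are not all lowercase; it should not be applied to $expand names, which keep their casing.

```python
def lowercase_violations(columns: list[str]) -> list[str]:
    """Return column names that are not all-lowercase LogicalNames."""
    return [c for c in columns if c != c.lower()]

# Example column list with one deliberate casing mistake.
select = ["new_name", "new_Priority", "new_status"]
bad = lowercase_violations(select)
if bad:
    print("Not lowercase LogicalNames:", ", ".join(bad))
```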

Query Records (multi-page)

client.records.get() is the primary read method — works on all SDK versions (b6+). It returns a page iterator for multi-record queries and a single Record for by-GUID fetch. Always use select= to limit columns.

for page in client.records.get(
    "new_ticket",
    select=["new_name", "new_priority", "new_status"],
    filter="new_status eq 100000000",
    orderby=["new_name asc"],
    top=50,
):
    for r in page:
        print(r["new_name"], r["new_priority"])

client.records.get() returns a page iterator — always iterate pages and then records within each page. Each record is a Record object that supports dict-like access: r["column"], r.get("column"), r.keys(). Do not use r.data.get() — use r.get() directly.
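When the page boundary does not matter to the caller, the two-level loop can be wrapped once. This is a minimal sketch: iter_records is a hypothetical helper, and plain lists of dicts stand in for the SDK's pages and Record objects.

```python
from typing import Iterable, Iterator, Mapping

def iter_records(pages: Iterable[Iterable[Mapping]]) -> Iterator[Mapping]:
    """Yield every record from an iterable of pages, one record at a time."""
    for page in pages:
        yield from page

# Plain lists of dicts stand in for the SDK's pages and Record objects.
fake_pages = [
    [{"new_name": "T-1", "new_priority": 1}, {"new_name": "T-2", "new_priority": 2}],
    [{"new_name": "T-3", "new_priority": 1}],
]
high = [r["new_name"] for r in iter_records(fake_pages) if r.get("new_priority") == 1]
print(high)
```

Because it is a generator, records are still consumed page by page rather than loaded all at once.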


Fetch a Single Record by ID

record = client.records.get("new_ticket", "<record-guid>",
    select=["new_name", "new_priority", "new_status"])
print(record["new_name"])

$select with Lookup Columns (GUID-free display)

To show display names instead of GUIDs, request the formatted value annotation via include_annotations:

for page in client.records.get("opportunity",
    select=["name", "estimatedvalue", "_parentaccountid_value"],
    include_annotations="OData.Community.Display.V1.FormattedValue",
):
    for r in page:
        account_name = r.get("_parentaccountid_value@OData.Community.Display.V1.FormattedValue")
        print(f"{r['name']} - {account_name}")

You MUST pass include_annotations — without it, the Prefer: odata.include-annotations header is not sent and formatted values are not in the response. Use "*" for all annotations or the specific annotation name above.

Formatted values are available for lookup, choice, status, and owner fields.
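The formatted value arrives under a key made of the column name, an @ sign, and the annotation name. A small lookup helper can hide that detail; the sketch below is hypothetical (display_value is not an SDK function) and uses a plain dict with invented values in place of a Record.

```python
FORMATTED = "OData.Community.Display.V1.FormattedValue"

def display_value(record, column):
    """Return the formatted value for a column, falling back to the raw value."""
    formatted = record.get(f"{column}@{FORMATTED}")
    return formatted if formatted is not None else record.get(column)

# A plain dict stands in for a Record; the values are invented for illustration.
row = {
    "name": "Renewal",
    "_parentaccountid_value": "00000000-0000-0000-0000-000000000001",
    "_parentaccountid_value@OData.Community.Display.V1.FormattedValue": "Contoso",
}
print(display_value(row, "_parentaccountid_value"))
print(display_value(row, "name"))
```

The fallback keeps plain text columns, which carry no formatted annotation, working through the same call.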


$expand — Resolve Lookup to Full Related Record

for page in client.records.get("opportunity",
    select=["name", "estimatedvalue"],
    expand=["parentaccountid($select=name)"],   # nested $select avoids fetching all account columns
):
    for r in page:
        account = r.get("parentaccountid") or {}
        print(f"{r['name']} - {account.get('name', 'Unknown')}")

Always use nested $select inside $expand — without it, Dataverse returns every column on the related entity, which wastes bandwidth and memory.

$expand with multiple custom lookups

for page in client.records.get(
    "new_ticket",
    select=["new_name", "new_priority", "new_status"],
    expand=["new_CustomerId($select=new_name)", "new_AgentId($select=new_name)"],  # nested $select + case-sensitive nav props
):
    for r in page:
        customer = r.get("new_CustomerId") or {}
        agent    = r.get("new_AgentId") or {}
        print(f"{r['new_name']} | {customer.get('new_name','')} | {agent.get('new_name','')}")

expand uses the Navigation Property Name (new_CustomerId), not the lowercase logical name (new_customerid). Using lowercase causes a 400 error.
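Both expand rules (nested $select, case-preserved navigation property) can be applied in one place. The helper below is a hypothetical sketch, not an SDK function; it only builds the strings that are then passed to expand=.

```python
def expand_clause(nav_property: str, columns: list[str]) -> str:
    """Build one expand entry with a nested $select, e.g. nav($select=a,b)."""
    if not columns:
        raise ValueError("pass at least one column so the nested $select is not empty")
    # The navigation property keeps its casing; only the columns are lowercased.
    return f"{nav_property}($select={','.join(c.lower() for c in columns)})"

expand = [
    expand_clause("new_CustomerId", ["new_name"]),
    expand_clause("new_AgentId", ["new_name"]),
]
print(expand)
```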


Advanced query patterns (Web API only)

For aggregations and many-to-many expansion, the SDK doesn't have direct support — use raw Web API. See references/web-api-advanced.md for full code samples.

Quick reference:

  • $expand on N:N relationships: GET /<entitySet>?$expand=<n:n_nav>($select=...). Each call returns a single page; follow @odata.nextLink when there are more than 5,000 results.
  • $apply for aggregations: runs server-side, returns grouped results in one call. Patterns: groupby((col),aggregate(metric with sum as total)), aggregate($count as count), aggregate(amount with average as avg). 50K source-record limit.
  • Cross-table aggregation: $apply only works within one entity set. Use client.dataframe.get(entity, select=[...]) per table, then pd.merge() followed by groupby(). Always pass select=; without it, the read transfers 10-20× more data.
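The groupby/aggregate pattern above can be assembled into a request URL as follows. This is a sketch under assumptions: apply_url is a hypothetical helper, the organization URL is a placeholder, and sending the request (with authentication) is left to the code in references/web-api-advanced.md.

```python
from urllib.parse import quote

def apply_url(org_url: str, entity_set: str, group_col: str, metric: str,
              op: str = "sum", alias: str = "total") -> str:
    """Build a Web API URL that groups by one column and aggregates one metric."""
    apply = f"groupby(({group_col}),aggregate({metric} with {op} as {alias}))"
    # Percent-encode the spaces; leave parentheses and commas readable.
    encoded = quote(apply, safe="(),")
    return f"{org_url.rstrip('/')}/api/data/v9.2/{entity_set}?$apply={encoded}"

# Placeholder organization URL; substitute the real environment URL.
url = apply_url("https://example.crm.dynamics.com", "opportunities",
                "statecode", "estimatedvalue")
print(url)
```

Note that the path uses the entity set name (opportunities), not the table's logical name (opportunity).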

QueryBuilder — Fluent Query API (SDK b8+)

Available in PowerPlatform-Dataverse-Client b8+. Chainable builder for complex queries that would be awkward as a single OData URL or FetchXML string. Full reference and examples in references/querybuilder.md.

Jupyter Notebook Setup

For interactive querying in notebooks (auth + DataverseClient + DataFrame display), see references/jupyter-setup.md.

Common Query Errors

| Status | Cause | Fix |
|---|---|---|
| 400 | Wrong field casing in $select/$filter (must be lowercase LogicalName) or $expand (must be case-sensitive Navigation Property Name) | Verify names via EntityDefinitions(LogicalName='...')/Attributes |
| 400 | Unsupported SQL in MCP read_query or client.query.sql() (DISTINCT, HAVING, subqueries, OFFSET, JOINs, GROUP BY) | Use $apply for single-table aggregation, or client.dataframe.get() + pandas for cross-table |
| 404 | Table logical name not found | Check spelling; use client.tables.get("<name>") to verify |
| 429 | Rate limited | SDK retries automatically; reduce page size or add delays between pages |

For HttpError handling in SDK scripts, see the error handling pattern in dv-data.
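For the 429 case, "add delays between pages" can be done with a thin wrapper around any page iterator. The sketch below is hypothetical (paced is not an SDK function) and assumes the underlying iterator fetches each page lazily, so pausing before asking for the next page spaces out the requests.

```python
import time
from typing import Iterable, Iterator

def paced(pages: Iterable, delay_seconds: float = 0.5) -> Iterator:
    """Yield pages unchanged, pausing before each request after the first."""
    iterator = iter(pages)
    first = True
    while True:
        if not first and delay_seconds > 0:
            time.sleep(delay_seconds)  # pause before asking for the next page
        try:
            page = next(iterator)
        except StopIteration:
            return
        first = False
        yield page

# Lists stand in for SDK pages; with the SDK this would wrap client.records.get(...).
for page in paced([[1, 2], [3]], delay_seconds=0.1):
    print(len(page))
```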


Windows Scripting Notes

  • ASCII only in .py files — curly quotes and em dashes cause SyntaxError on Windows.
  • No python -c for multiline code — write a .py file instead.
  • Generate GUIDs in scripts: str(uuid.uuid4()), not shell backtick substitution.
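The ASCII-only rule can be checked mechanically before a script is run. The helper below is a hypothetical sketch that reports which lines of a source string contain non-ASCII characters; the sample uses \u escapes so that the example itself stays ASCII.

```python
def non_ascii_lines(source: str) -> list[tuple[int, str]]:
    """Return (line number, offending characters) for each line with non-ASCII text."""
    hits = []
    for number, line in enumerate(source.splitlines(), start=1):
        bad = "".join(ch for ch in line if ord(ch) > 127)
        if bad:
            hits.append((number, bad))
    return hits

# Line 2 of the sample contains curly quotes, written here as escapes.
sample = 'print("ok")\nprint(\u201chello\u201d)\n'
print(non_ascii_lines(sample))
```

In practice the source string would come from reading the .py file with encoding="utf-8" before running it.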
