dataflows-consumption-cli
npx skills add https://github.com/microsoft/skills-for-fabric --skill dataflows-consumption-cli
Update Check: ONCE PER SESSION (mandatory)
The first time this skill is used in a session, run the check-updates skill before proceeding.
- GitHub Copilot CLI / VS Code: invoke the `check-updates` skill.
- Claude Code / Cowork / Cursor / Windsurf / Codex: compare local vs remote package.json version.
- Skip if the check was already performed earlier in this session.
CRITICAL NOTES
- To find a workspace's details (including its ID) from its name: list all workspaces, then filter the result with JMESPath.
- To find a dataflow by name: list all dataflows in the workspace and filter by `displayName` client-side; there is no server-side name filter.
- `getDefinition` is a POST, not a GET, even though it reads data.
dataflows-consumption-cli — Dataflows Gen2 Consumption via CLI
Table of Contents
| Task | Reference | Notes |
|---|---|---|
| Finding Workspaces and Items in Fabric | COMMON-CLI.md § Finding Workspaces and Items in Fabric | Mandatory — READ link first |
| Fabric Topology & Key Concepts | COMMON-CORE.md § Fabric Topology & Key Concepts | |
| Environment URLs | COMMON-CORE.md § Environment URLs | |
| Authentication & Token Acquisition | COMMON-CORE.md § Authentication & Token Acquisition | Wrong audience = 401; read before any auth issue |
| Core Control-Plane REST APIs | COMMON-CORE.md § Core Control-Plane REST APIs | Includes pagination, LRO polling, and rate-limiting patterns |
| Job Execution | COMMON-CORE.md § Job Execution | |
| Gotchas, Best Practices & Troubleshooting | COMMON-CORE.md § Gotchas, Best Practices & Troubleshooting | |
| Tool Selection Rationale | COMMON-CLI.md § Tool Selection Rationale | |
| Authentication Recipes | COMMON-CLI.md § Authentication Recipes | az login flows and token acquisition |
| Fabric Control-Plane API via az rest | COMMON-CLI.md § Fabric Control-Plane API via az rest | Always pass --resource; includes pagination and LRO helpers |
| Job Execution (CLI) | COMMON-CLI.md § Job Execution | |
| Gotchas & Troubleshooting (CLI-Specific) | COMMON-CLI.md § Gotchas & Troubleshooting (CLI-Specific) | az rest audience, shell escaping, token expiry |
| Quick Reference | COMMON-CLI.md § Quick Reference | az rest template + token audience/tool matrix |
| Consumption Capability Matrix | DATAFLOWS-CONSUMPTION-CORE.md § Consumption Capability Matrix | Read first — shows what ops are available |
| REST API Surface (Consumption) | DATAFLOWS-CONSUMPTION-CORE.md § REST API Surface | List, Get, Parameters, getDefinition, Jobs |
| Dataflow Definition Exploration | DATAFLOWS-CONSUMPTION-CORE.md § Dataflow Definition Exploration | Decode mashup.pq, queryMetadata.json, .platform |
| Parameter Discovery and Analysis | DATAFLOWS-CONSUMPTION-CORE.md § Parameter Discovery and Analysis | Types, formats, M code patterns |
| Refresh and Job Monitoring | DATAFLOWS-CONSUMPTION-CORE.md § Refresh and Job Monitoring | LRO pattern, job instances, polling best practices |
| Agentic Exploration Pattern | DATAFLOWS-CONSUMPTION-CORE.md § Agentic Exploration Pattern | 6-step discovery sequence |
| Security and Permissions Model | DATAFLOWS-CONSUMPTION-CORE.md § Security and Permissions Model | Permission matrix by operation |
| Common Errors | DATAFLOWS-CONSUMPTION-CORE.md § Common Errors | Error codes and resolutions |
| Gotchas and Troubleshooting Reference | DATAFLOWS-CONSUMPTION-CORE.md § Gotchas and Troubleshooting | 12 numbered issues with cause + resolution |
| Quick Reference One-Liners | consumption-cli-quickref.md | az rest one-liners for all consumption ops |
| Discovery Patterns | discovery-queries.md | Definition decoding, parameter extraction, connection analysis |
| Script Templates | script-templates.md | Copy-paste bash and PowerShell templates |
| Tool Stack | SKILL.md § Tool Stack | |
| Connection | SKILL.md § Connection | |
| Agentic Exploration ("Chat With My Dataflows") | SKILL.md § Agentic Exploration | Start here for dataflow exploration |
| Query Execution | SKILL.md § Query Evaluation | Execute individual queries; responses are Apache Arrow binary |
Tool Stack
| Tool | Role | Install |
|---|---|---|
| `az` CLI | Primary: Auth (`az login`), Fabric REST API via `az rest` | Pre-installed in most dev environments |
| `curl` | Alternative HTTP client for REST calls | Pre-installed |
| `jq` | Parse JSON responses, extract fields, format output | Pre-installed or trivial |
| `base64` | Decode definition parts from base64 | Built into bash; PowerShell uses `[Convert]::FromBase64String` |
| `bash`/`pwsh` | Script execution | Pre-installed |
Agent check — verify before first operation:
az account show >/dev/null 2>&1 || echo "RUN: az login"
command -v jq >/dev/null 2>&1 || echo "INSTALL: apt-get install jq OR brew install jq"
Connection
Resolve Workspace ID and Dataflow ID
Per COMMON-CLI.md Finding Workspaces and Items in Fabric:
# Find workspace ID by name
WS_ID=$(az rest --method get \
--resource "https://api.fabric.microsoft.com" \
--url "https://api.fabric.microsoft.com/v1/workspaces" \
--query "value[?displayName=='My Workspace'].id" --output tsv)
# Find dataflow ID by name within workspace
DF_ID=$(az rest --method get \
--resource "https://api.fabric.microsoft.com" \
--url "https://api.fabric.microsoft.com/v1/workspaces/$WS_ID/dataflows" \
--query "value[?displayName=='Sales Data Pipeline'].id" --output tsv)
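The same list-then-filter step can be done in a script once the list response has been parsed. The following is a minimal Python sketch, assuming the response carries the `value[].displayName` and `value[].id` fields used by the JMESPath queries above. `resolve_id` is a hypothetical helper name, not part of any Fabric SDK. It raises on zero or multiple matches, which the `--output tsv` form would otherwise pass along silently as an empty or multi-line ID.

```python
def resolve_id(list_response: dict, display_name: str) -> str:
    """Return the id of the single item whose displayName matches exactly."""
    matches = [
        item["id"]
        for item in list_response.get("value", [])
        if item.get("displayName") == display_name
    ]
    if not matches:
        raise LookupError(f"no item named {display_name!r}")
    if len(matches) > 1:
        raise LookupError(f"{len(matches)} items named {display_name!r}: {matches}")
    return matches[0]


# Demo with an in-memory response (hypothetical IDs, no network calls).
sample = {
    "value": [
        {"id": "ws-1", "displayName": "My Workspace"},
        {"id": "ws-2", "displayName": "Other Workspace"},
    ]
}
print(resolve_id(sample, "My Workspace"))
```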
Reusable Connection Variables
# Set once at script top
WS_ID="<workspaceId>"
DF_ID="<dataflowId>"
API="https://api.fabric.microsoft.com/v1"
AZ="az rest --resource https://api.fabric.microsoft.com"
Agentic Exploration ("Chat With My Dataflows")
Discovery Sequence
Run these in order to fully explore a workspace's dataflows. See references/discovery-queries.md for extended patterns.
# 1. List workspaces → find target
az rest --method get --resource "https://api.fabric.microsoft.com" \
--url "$API/workspaces" --query "value[].{name:displayName, id:id}" -o table
# 2. List dataflows → enumerate all
az rest --method get --resource "https://api.fabric.microsoft.com" \
--url "$API/workspaces/$WS_ID/dataflows" \
--query "value[].{name:displayName, id:id, desc:description}" -o table
# 3. Get dataflow properties
az rest --method get --resource "https://api.fabric.microsoft.com" \
--url "$API/workspaces/$WS_ID/dataflows/$DF_ID"
# 4. Discover parameters
az rest --method get --resource "https://api.fabric.microsoft.com" \
--url "$API/workspaces/$WS_ID/dataflows/$DF_ID/parameters" \
--query "value[].{name:name, type:type, required:isRequired, default:defaultValue}" -o table
# 5. Get definition → decode mashup.pq
RESPONSE=$(az rest --method post --resource "https://api.fabric.microsoft.com" \
--url "$API/workspaces/$WS_ID/dataflows/$DF_ID/getDefinition")
echo "$RESPONSE" | jq -r '.definition.parts[] | select(.path=="mashup.pq") | .payload' | base64 --decode
# 6. Check job history
az rest --method get --resource "https://api.fabric.microsoft.com" \
--url "$API/workspaces/$WS_ID/dataflows/$DF_ID/jobs/instances" \
--query "value[].{status:status, type:invokeType, start:startTimeUtc, end:endTimeUtc, error:failureReason}" -o table
Agentic Workflow
- Discover → Run Steps 1–3 to list and identify dataflows.
- Parameters → Step 4 to understand inputs and defaults.
- Definition → Step 5 to inspect M queries, connections, staging config.
- Monitor → Step 6 for refresh history and error patterns.
- Iterate → Drill into specific queries or connection details.
- Present → Summarize findings or generate a reusable script (see script-templates.md).
Gotchas, Rules, Troubleshooting
For full platform gotchas: DATAFLOWS-CONSUMPTION-CORE.md Gotchas and Troubleshooting Reference and COMMON-CLI.md Gotchas & Troubleshooting (CLI-Specific).
MUST DO
- Always `az login` first: `az rest` uses the active session. No session → cryptic failure.
- Always `--resource "https://api.fabric.microsoft.com"`: wrong audience = 401.
- Handle pagination: repeat requests with `continuationToken` until absent/null.
- Handle LRO for `getDefinition`: may return `202 Accepted` with `Location` header; poll until complete.
- Decode base64 before inspecting: definition parts are base64-encoded.
- Use POST for `getDefinition`: it is NOT a GET endpoint.
AVOID
- Hardcoded GUIDs: always discover via list-then-filter pattern.
- Assuming `getDefinition` is GET: it is POST (common mistake).
- Ignoring pagination: list endpoints may return partial results.
- Polling too aggressively: respect `Retry-After` headers on 429s.
- Expecting `getDefinition` to work with the Viewer role: it requires Read+Write (Contributor+).
PREFER
- `az rest` over raw `curl`: handles auth automatically.
- List-then-filter pattern: no server-side name filter for dataflows.
- Exponential backoff for job polling: 5s → 10s → 20s → 30s cap.
- `jq` for response parsing: cleaner than shell string manipulation.
- JMESPath `--query` for simple field extraction directly in `az rest`.
- Env vars (`WS_ID`, `DF_ID`, `API`) for script reuse.
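The backoff schedule in the list above (5s → 10s → 20s → 30s cap) could be generated as follows. This is a sketch under stated assumptions: `backoff_delays` and `next_delay` are hypothetical helper names, and `next_delay` only understands a `Retry-After` value given in whole seconds, not the HTTP-date form.

```python
import itertools
from typing import Iterator, Optional


def backoff_delays(start: float = 5.0, cap: float = 30.0, factor: float = 2.0) -> Iterator[float]:
    """Yield polling delays that grow by `factor` until they reach `cap`."""
    delay = start
    while True:
        yield min(delay, cap)
        delay = min(delay * factor, cap)


def next_delay(scheduled: float, retry_after: Optional[str] = None) -> float:
    """Use the server's Retry-After (in seconds) when it asks for a longer wait."""
    if retry_after is not None and retry_after.strip().isdigit():
        return max(scheduled, float(retry_after.strip()))
    return scheduled


print(list(itertools.islice(backoff_delays(), 5)))
```

A polling loop would draw one value per attempt from `backoff_delays()`, pass it through `next_delay` together with any `Retry-After` header from a 429 response, and sleep for the result.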
TROUBLESHOOTING
| Symptom | Cause | Fix |
|---|---|---|
| `401 Unauthorized` | Token expired or wrong audience | `az login`; ensure `--resource "https://api.fabric.microsoft.com"` |
| `403 Forbidden` on `getDefinition` | Viewer role (Read-only) | Requires Contributor role or higher (Read+Write) |
| `404 Not Found` | Wrong workspace or dataflow ID | Re-discover via List Dataflows API |
| `getDefinition` returns 202 | Large definition or server load | Poll the `Location` header URL until operation completes |
| Empty parameters array | Dataflow has no parameters | Expected behavior — check mashup.pq for IsParameterQuery |
| Base64 decode shows garbled text | BOM in encoded content | Strip UTF-8 BOM (\xEF\xBB\xBF) when decoding |
| `429 TooManyRequests` | Rate limited | Respect `Retry-After` header; implement exponential backoff |
| Duplicate results in list | Re-using stale continuationToken | Always use the token from the most recent response |
| `OperationNotSupportedForItem` | Wrong item type | Verify item is type Dataflow via Get Item |
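For the garbled-text row in the table above, one way to strip the UTF-8 BOM while decoding is Python's `utf-8-sig` codec, which drops a leading `\xEF\xBB\xBF` when present and otherwise behaves like plain UTF-8. A minimal sketch; `decode_part` is a hypothetical helper name and the sample payload is made up for the demo.

```python
import base64


def decode_part(payload_b64: str) -> str:
    """Decode a base64 definition part to text, dropping a leading UTF-8 BOM if present."""
    return base64.b64decode(payload_b64).decode("utf-8-sig")


# Demo: a payload that starts with a BOM decodes to clean text.
sample_payload = base64.b64encode(b"\xef\xbb\xbfsection Section1;").decode("ascii")
print(decode_part(sample_payload))
```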
Examples
Example 1: List All Dataflows in a Workspace
Prompt: "List all dataflows in this workspace."
Agent response:
az rest --method get \
--url "https://api.fabric.microsoft.com/v1/workspaces/${WS_ID}/dataflows" \
--resource "https://api.fabric.microsoft.com" \
--query "value[].{Name:displayName, Id:id, Type:type}" -o table
Example 2: Decode a Dataflow Definition
Prompt: "Show me the Power Query M code inside this dataflow."
Agent response:
# Step 1: Request definition (POST returns 202 with Location header).
# az rest cannot expose response headers, so capture Location via curl + az-acquired token.
TOKEN=$(az account get-access-token --resource "https://api.fabric.microsoft.com" --query accessToken -o tsv)
LOCATION=$(curl -sS -X POST \
-H "Authorization: Bearer $TOKEN" -H "Content-Length: 0" \
"https://api.fabric.microsoft.com/v1/workspaces/${WS_ID}/dataflows/${DF_ID}/getDefinition" \
-o /dev/null -D - | tr -d '\r' | grep -i "^location:" | awk '{print $2}')
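# Step 2: Poll the operation until it finishes, then fetch its result.
# Assumption: standard Fabric long-running-operation contract, where the Location URL
# reports a status and the definition is served from "<Location>/result" once it has Succeeded.
while :; do
  STATUS=$(az rest --method get --url "${LOCATION}" \
    --resource "https://api.fabric.microsoft.com" --query status -o tsv)
  case "$STATUS" in
    Succeeded) break ;;
    Failed) echo "getDefinition operation failed" >&2; exit 1 ;;
  esac
  sleep 5
done
DEF=$(az rest --method get --url "${LOCATION}/result" \
  --resource "https://api.fabric.microsoft.com")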
# Step 3: Decode mashup.pq to see the Power Query M code
echo "$DEF" | python3 -c "
import json, base64, sys
parts = json.load(sys.stdin)['definition']['parts']
for p in parts:
    if p['path'] == 'mashup.pq':
        print(base64.b64decode(p['payload']).decode('utf-8'))
"
Example 3: Check Refresh Job History
Prompt: "Show me the recent refresh history for this dataflow."
Agent response:
# Get recent job instances for a dataflow
az rest --method get \
--url "https://api.fabric.microsoft.com/v1/workspaces/${WS_ID}/dataflows/${DF_ID}/jobs/instances?limit=5" \
--resource "https://api.fabric.microsoft.com" \
--query "value[].{Status:status, Start:startTimeUtc, End:endTimeUtc, Id:id}" -o table
Example 4: Discover Parameters from Definition
Prompt: "What parameters does this dataflow accept?"
Agent response:
# After decoding the definition (see Example 2), extract parameters:
echo "$DEF" | python3 -c "
import json, base64, sys
parts = json.load(sys.stdin)['definition']['parts']
for p in parts:
    if p['path'] == 'queryMetadata.json':
        meta = json.loads(base64.b64decode(p['payload']).decode('utf-8'))
        for qname, qmeta in meta.get('queriesMetadata', {}).items():
            if qmeta.get('queryGroupId') == 'parameters' or 'IsParameterQuery' in str(qmeta):
                print(f'Parameter: {qname}')
"
Query Evaluation
Execute an individual query from a dataflow and inspect results. Responses are a raw Apache Arrow IPC stream with Content-Type: application/vnd.apache.arrow.stream — not a JSON envelope. The first four bytes of a valid stream are the IPC continuation marker ff ff ff ff. Parse with pyarrow.ipc.open_stream().
Wire format: `executeQuery` returns the raw Apache Arrow IPC byte stream (`Content-Type: application/vnd.apache.arrow.stream`), not JSON. Don't try to parse it with `jq`; there is no JSON envelope to extract. Use `--output-file` to save the bytes and parse as Arrow (see Examples 5–7).
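A quick sanity check on a saved response is to look at its first bytes before handing the file to an Arrow reader. The sketch below is illustrative: `describe_payload` is a hypothetical helper, the `ff ff ff ff` continuation marker is the one described above, and `ARROW1` is the magic string that opens the Arrow IPC file format (as opposed to the stream format).

```python
ARROW_STREAM_MARKER = b"\xff\xff\xff\xff"  # IPC continuation marker at the start of a stream
ARROW_FILE_MAGIC = b"ARROW1"               # magic bytes of the Arrow IPC *file* format


def describe_payload(head: bytes) -> str:
    """Classify a response body from its leading bytes."""
    if head[:4] == ARROW_STREAM_MARKER:
        return "arrow-stream"
    if head[:6] == ARROW_FILE_MAGIC:
        return "arrow-file"
    if head.lstrip()[:1] in (b"{", b"["):
        return "json"
    return "unknown"


# Demo on hand-made byte prefixes (no real query results involved).
print(describe_payload(b"\xff\xff\xff\xff\x10\x00\x00\x00"))
print(describe_payload(b'{"error": "..."}'))
```

In practice the first few bytes of the saved file would be read with `open(path, "rb").read(8)` and passed to the function; a `json` result usually means the call failed before reaching the query engine.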
Failures return HTTP 200: `executeQuery` returns `200 OK` with `application/vnd.apache.arrow.stream` even when the underlying source query fails (Kusto SEM0100, T-SQL syntax error, missing column, etc.). The error is embedded inside the stream's `PQ Arrow Metadata` section as `{"Error":"..."}`; see dataflows-authoring-cli § mashup-preview.md → Detecting failures inside the Arrow body for detector snippets. Naive HTTP-status checks will treat failures as success.
Intent split (canonical executeQuery reference is mashup-preview.md): the same `executeQuery` endpoint serves two distinct intents. This skill covers the consumption intents:
- (a) Execute a persisted query: body `{"QueryName":"<saved-shared>"}` only (no `customMashupDocument`).
- (b) Ad-hoc read-only `customMashupDocument`: preview a candidate `section Section1; ...` document with no intent to persist via `updateDefinition` (Example 7).
If you intend to persist the M, use `dataflows-authoring-cli` § Workflow C (Preview-Driven Authoring Loop), which adds the bootstrap-bind rule (chicken-and-egg connection binding for new credentialed dataflows), the auto-wrap rule, the hard-avoid for unbounded preview, and the post-preview persistence steps.
Auto-wrap caveat: The Fabric REST API expects `customMashupDocument` to be a complete `section Section1; ... shared X = ...;` document. Raw `let ... in ...` expressions are not auto-wrapped server-side; send a full section document and ensure the `QueryName` request field matches a `shared` member declared inside it.
Body shape: send a flat object with a top-level `QueryName` (field name is case-insensitive on the wire; PascalCase canonical). The `{"queries":[{...}]}` array shape always returns `400 DataflowExecuteQueryError: Invalid query name` regardless of inner casing. A wrong `QueryName` value returns `QueryNotFound` (different error code). See dataflows-authoring-cli § mashup-preview.md → Request body.
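The body-shape and auto-wrap rules can be combined in a small client-side builder. This is a sketch, not a reference implementation: `wrap_expression` and `build_body` are hypothetical helper names, and the check that a custom document starts with `section` is a deliberately naive guard (it would reject a document that begins with a comment).

```python
import json
from typing import Optional


def wrap_expression(query_name: str, expression: str) -> str:
    """Wrap a raw `let ... in ...` expression into a complete section document."""
    body = expression.strip().rstrip(";")
    return f"section Section1;\nshared {query_name} = {body};"


def build_body(query_name: str, custom_mashup: Optional[str] = None) -> str:
    """Build the flat executeQuery request body as a JSON string."""
    body = {"QueryName": query_name}
    if custom_mashup is not None:
        if not custom_mashup.lstrip().startswith("section "):
            raise ValueError("customMashupDocument must be a full section document")
        body["customMashupDocument"] = custom_mashup
    return json.dumps(body)


# Demo: a persisted-query body, then an ad-hoc body built from a raw expression.
print(build_body("SalesData"))
print(build_body("Probe", wrap_expression("Probe", "let Source = 1 in Source")))
```

The resulting string can be written to a file and passed to `az rest` with `--body "@req.json"`, in the same way as the `jq -n` commands in the examples.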
`TimedOut` recovery for heavy persisted queries: if the persisted `shared <Q>` member is too expensive for the server-side `executeQuery` budget (~90s observed; surfaces as `400 DataflowExecuteQueryError` → `InternalErrorCode: EvaluationError, Message: Evaluation result error code: TimedOut`), do not retry the same body; the failure is deterministic. Instead, re-issue with `customMashupDocument` set to the full persisted `mashup.pq` (decoded via Example 2) plus an appended probe `shared __PreviewCap = Table.FirstN(<Q>, 10);` and `QueryName: "__PreviewCap"`. The cap pushes down for most connectors (Kusto, SQL, OData, web tables) so the source returns only the sample. See dataflows-authoring-cli § mashup-preview.md → Observed `InternalErrorCode` values.
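The recovery step described above amounts to appending one probe member to the decoded `mashup.pq` and pointing `QueryName` at it. A minimal sketch, assuming the decoded document already ends each member with `;`. `with_preview_cap` is a hypothetical helper, and `query_identifier` must be passed exactly as it appears in the M document (for example `#"Sales Data"` for a name that contains spaces).

```python
def with_preview_cap(mashup_pq: str, query_identifier: str, rows: int = 10,
                     probe_name: str = "__PreviewCap") -> dict:
    """Return an executeQuery body that evaluates only the first `rows` rows of a query."""
    if rows <= 0:
        raise ValueError("rows must be positive")
    probe = f"shared {probe_name} = Table.FirstN({query_identifier}, {rows});"
    document = mashup_pq.rstrip() + "\n" + probe + "\n"
    return {"QueryName": probe_name, "customMashupDocument": document}


# Demo with a made-up persisted document.
persisted = 'section Section1;\nshared SalesData = Sql.Database("srv", "db");\n'
body = with_preview_cap(persisted, "SalesData")
print(body["QueryName"])
print(body["customMashupDocument"])
```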
Prerequisites
- Dataflow must exist (verify via Example 1 above)
- Query name must exist in the dataflow (verify in mashup.pq via Example 2)
- User must have Contributor role or higher (read+write) on the dataflow
- Optional: Query parameters and custom M code modifications
Example 5: Execute a Query and Save Results
Prompt: "Execute the SalesData query in this dataflow and save the results."
Agent response:
# Step 1: Identify the query to execute (must be a `shared` member of the dataflow's mashup)
QUERY_NAME="SalesData" # Replace with your query name
# Step 2: Build the request body. To execute the persisted query as-is, send ONLY
# QueryName — omit customMashupDocument. (Do NOT pass a self-referential
# `shared X = let Source = #shared[X] in Source` — that recurses on itself
# inside the candidate document.) For custom M, see Example 7.
jq -n --arg q "$QUERY_NAME" '{QueryName: $q}' > req.json
# Step 3: Execute the query (raw Apache Arrow IPC stream is written directly to disk)
az rest --method post \
--url "https://api.fabric.microsoft.com/v1/workspaces/${WS_ID}/dataflows/${DF_ID}/executeQuery" \
--resource "https://api.fabric.microsoft.com" \
--body "@req.json" \
--output-file "${QUERY_NAME}_results.arrow"
# Step 4: Check for embedded source errors (HTTP 200 does NOT mean success)
if grep -q '"Error":"' "${QUERY_NAME}_results.arrow"; then
  echo "Query execution failed (embedded in Arrow body):"
  python3 -c "import re,sys; raw=open(sys.argv[1],'rb').read().decode('utf-8','replace'); m=re.search(r'\\{\"Error\":\"[^\"]+\"\\}', raw); print(m.group(0) if m else '(marker present, JSON not parsed)')" "${QUERY_NAME}_results.arrow"
  exit 1
fi
echo "Query results saved to ${QUERY_NAME}_results.arrow"
Why `--output-file`? Without it, `az rest` dumps raw bytes to stdout and prints the warning `Not a json response, outputting to stdout. For binary data suggest use "--output-file" to write to a file.` Always pass `--output-file` for `executeQuery` so the Arrow IPC stream is captured cleanly. The `jq -r '.data' | base64 --decode` pattern from earlier docs is incorrect for `az rest`: there is no JSON envelope to extract.
Example 5b: Render Query Results as a Markdown Table
Prompt: "Show me the first 10 rows of those results."
Agent response — read the saved Arrow file and render head(N) as a markdown table in chat. Always do this after executeQuery (default N=10). The visual preview catches silent-success bugs that the embedded-error detector misses: filter dropped all rows, wrong column referenced, wrong cast, off-by-one filter, join key typo.
N=${N:-10}; ARROW_FILE="${QUERY_NAME}_results.arrow"
python3 - <<EOF
import json, sys, pyarrow as pa, pyarrow.ipc as ipc
with open("$ARROW_FILE", "rb") as f:
    try:
        reader = ipc.open_stream(f)
    except pa.lib.ArrowInvalid:
        # Not a stream: fall back to the Arrow IPC file format
        f.seek(0); reader = ipc.open_file(f)
    # Surface any embedded engine error before rendering (mirrors mashup-preview.md § Error handling - A)
    for v in (reader.schema.metadata or {}).values():
        s = v.decode("utf-8", errors="replace")
        if '"Error"' in s:
            try:
                msg = json.loads(s)["Error"]
            except (ValueError, KeyError):
                msg = s[:200]
            sys.exit(f"Preview failed: {msg}")
    table = reader.read_all()
rows, cols = table.num_rows, table.num_columns
print(f"**{rows} rows × {cols} columns** · {', '.join(table.schema.names)}\n")
if rows == 0:
    sys.exit(0)
# Slice at the Arrow level before converting, so only the preview rows reach pandas.
df = table.slice(0, $N).to_pandas().copy()
# Truncate long string cells so the chat table stays readable.
for c in df.select_dtypes(include=["object", "string"]).columns:
    df[c] = df[c].astype(str).str.slice(0, 50)
try:
    print(df.to_markdown(index=False))  # requires the tabulate package (no backticks here: the heredoc is unquoted)
except ImportError:
    print(df.to_string(index=False))  # fallback: fixed-width text
EOF
Optional dep: `pandas.DataFrame.to_markdown()` requires the `tabulate` package. Install it once via your environment's standard Python tooling, the same way you installed `pyarrow`/`pandas` for Example 6. If `tabulate` is absent, the snippet falls back to fixed-width `to_string()` so rendering still works.
When to skip rendering: render `head(N)` by default. Skip only when (a) the user explicitly asked for the raw Arrow file with no preview, or (b) the result is being streamed straight into another tool that will consume the Arrow file directly. When the result exceeds ~1000 rows, still render `head(N)` and add a one-line note about the total row count. In every other case, render: the saved file alone is not visible to the user in chat.
Example 6: Convert Arrow Results to CSV
Prompt: "Convert my Arrow query results to CSV so I can open them in Excel."
Agent response:
Prerequisite (one-time client-side setup): This example uses `pyarrow` to decode the Arrow IPC stream and `pandas` to write CSV. Install them once via your environment's standard Python tooling. These are local client dependencies, not part of the Fabric recipe.
python3 <<EOF
import pyarrow as pa
import pandas as pd
import sys
ARROW_FILE = "${QUERY_NAME}_results.arrow"
CSV_FILE = "${QUERY_NAME}_results.csv"
# Open the Arrow IPC stream directly (raw on the wire, no envelope)
with open(ARROW_FILE, "rb") as f:
    reader = pa.ipc.open_stream(f)
    # Defensive: surface any error embedded in the stream's PQ Arrow Metadata
    md = reader.schema.metadata or {}
    for k, v in md.items():
        s = v.decode("utf-8", errors="replace")
        if '"Error"' in s:
            print(f"Preview failed: {s}", file=sys.stderr)
            sys.exit(1)
    table = reader.read_all()
# Convert to pandas and export as CSV
df = table.to_pandas()
df.to_csv(CSV_FILE, index=False)
print(f"Converted {len(df)} rows to CSV")
print("Columns:", list(df.columns))
EOF
Example 7: Query with Custom M Code
Prompt: "Run a one-off ad-hoc M query against this dataflow without saving it."
Intent: ad-hoc read-only execution. The `customMashupDocument` is not persisted. If you intend to save the M via `updateDefinition`, use `dataflows-authoring-cli` § Workflow C instead; it adds bootstrap-bind, auto-wrap, and post-preview persistence rules.
Agent response:
# Execute a query with custom M code (e.g., filter, aggregate, transform).
# The customMashupDocument must be a complete `section` document; raw expressions are NOT auto-wrapped server-side.
CUSTOM_M='section Section1;
shared CustomQuery = let
Source = Table.FromRecords({[id=1, name="Alice"], [id=2, name="Bob"]}),
Filtered = Table.SelectRows(Source, each [id] > 0)
in
Filtered;'
jq -n --arg m "$CUSTOM_M" '{QueryName: "CustomQuery", customMashupDocument: $m}' > req.json
az rest --method post \
--url "https://api.fabric.microsoft.com/v1/workspaces/${WS_ID}/dataflows/${DF_ID}/executeQuery" \
--resource "https://api.fabric.microsoft.com" \
--body "@req.json" \
--output-file custom_results.arrow
# Always check for embedded errors before treating the file as a success
if grep -q '"Error":"' custom_results.arrow; then
  echo "Custom query failed; inspect custom_results.arrow for the embedded {\"Error\":...} block."
  exit 1
fi
Output Expectations
When this skill completes a task, the agent should return:
| Field | Convention |
|---|---|
| Verbosity | Concise summary (3–10 lines) for status; markdown table for list/inspect responses. |
| Default format | Markdown table for list-style queries; fenced JSON code block for single-resource responses; raw decoded mashup.pq in a fenced ```m block. For executeQuery: save the full Arrow stream to file and render head(N) (default N=10) as a markdown table in chat — see Example 5b. Suppress rendering only on explicit user request, when rows > 1000 (render head + total-count note), or when the result is being streamed into another tool. |
| Side-effect disclosure | This is a read-only skill — never imply mutation. |
| Verification | Include the source URL (e.g., the az rest --url value) in the response so the user can reproduce the call. |
| Error surfacing | If executeQuery returns Arrow with embedded {"Error":"..."}, surface the error verbatim and do not present partial results as success. |