google-agents-cli-workflow

作者: google

此技能應在使用者想要「開發代理」、「使用 ADK 建置代理」、「在本機執行代理」、「除錯代理程式碼」、「測試代理」、「部署代理」、「發布代理」、「監控代理」,或需要 ADK(代理開發套件)開發生命週期與編碼指南時使用。為建置 ADK 代理的進入點。始終啟用 — 提供完整工作流程(建立框架、建置、評估、部署、發布、監控)、程式碼保留規則、模型選擇指引,以及...

npx skills add https://github.com/google/agents-cli --skill google-agents-cli-workflow

ADK Development Workflow & Guidelines

STOP — Do NOT write code yet. If no project exists, scaffold first with agents-cli scaffold create <name>. If the user already has code, use agents-cli scaffold enhance . to add the agents-cli structure. Run agents-cli info to check if a project already exists. Skipping this leads to missing eval boilerplate, CI/CD config, and project conventions.

agents-cli is a CLI and skills toolkit for building, evaluating, and deploying agents on Google Cloud using the Agent Development Kit (ADK). It works with any coding agent — Gemini CLI, Claude Code, Codex, or others. Install with uvx google-agents-cli setup.

Requires: google-agents-cli = 0.5.0 If version is behind, run: uv tool install "google-agents-cli=0.5.0"

Check version: agents-cli info Install uv first if needed.

Session Continuity & Skill Cross-References

Re-read the relevant skill before each phase — not after you've already started and hit a problem. Context compaction may have dropped earlier skill content. If skills are not available, run uvx google-agents-cli setup to install them.

PhaseSkillWhen to load
0 — UnderstandNo skill needed — read .agents-cli-spec.md if present, else clarify goals with the user
1 — Study samplesCheck Notable Samples table below — clone and study matching samples before scaffolding
2 — Scaffold/google-agents-cli-scaffoldBefore creating or enhancing a project
3 — Build/google-agents-cli-adk-codeBefore writing agent code — API patterns, tools, callbacks, state
4 — Evaluate/google-agents-cli-evalBefore running any eval — dataset schema, metrics, eval-fix loop
5 — Deploy/google-agents-cli-deployBefore deploying — target selection, troubleshooting 403/timeouts
6 — Publish/google-agents-cli-publishAfter deploying, if registering with Gemini Enterprise (optional)
7 — Observe/google-agents-cli-observabilityAfter deploying — traces, logging, monitoring setup

Setup

If agents-cli is not installed:

uv tool install google-agents-cli

uv command not found

Install uv following the official installation guide.

Product name mapping

The platform formerly known as "Vertex AI" is now Gemini Enterprise Agent Platform (short: Agent Platform). Users may refer to products by different names. Map them to the correct CLI values:

User may sayCLI value
Agent Engine, Vertex AI Agent Engine, Agent Runtime--deployment-target agent_runtime
Vertex AI Search, Agent Search--datastore agent_platform_search
Vertex AI Vector Search, Vector Search--datastore agent_platform_vector_search
Agent Engine sessions, Agent Platform Sessions--session-type agent_platform_sessions

The vertexai Python SDK package name is unchanged.


Phase 0: Understand

Before writing or scaffolding anything, understand what you're building.

If .agents-cli-spec.md exists in the current directory, read it — it is your primary source of truth. Otherwise:

Do NOT proceed to planning, scaffolding, or coding. Ask the user the questions below and wait for their answers. You MUST have the user's answers before moving on. Do not assume, research, or fill in the blanks yourself. The user's intent drives everything — skipping this step leads to wasted work.

Always ask:

  1. What problem will the agent solve? — Core purpose and capabilities
  2. External APIs or data sources needed? — Tools, integrations, auth requirements
  3. Safety constraints? — What the agent must NOT do, guardrails
  4. Deployment preference? — Prototype first (recommended) or full deployment? If deploying: Agent Runtime, Cloud Run, or GKE?

Ask based on context:

  • If retrieval or search over data mentioned (RAG, semantic search, vector search, embeddings, similarity search, data ingestion) → Datastore? Options: agent_platform_vector_search (embeddings, similarity search) or agent_platform_search (document search, search engine).
  • If agent should be available to other agentsA2A protocol? Enables the agent as an A2A-compatible service.
  • If full deployment chosen → CI/CD runner? GitHub Actions (default) or Google Cloud Build?
  • If agent should remember user preferences or facts across sessionsMemory Bank? Long-term memory across conversations. See /google-agents-cli-adk-code.
  • If Cloud Run or GKE chosen → Session storage? In-memory (default), Cloud SQL (persistent), or Agent Platform Sessions (managed).
  • If deployment with CI/CD chosen → Git repository? Does one already exist, or should one be created? If creating, public or private?

Once you have the user's answers, write the spec to .agents-cli-spec.md in the current directory and get the user's approval. See /google-agents-cli-scaffold for how these choices map to CLI flags. At minimum include these sections — expand with more detail if the user wants a thorough spec:

# Agent Spec

## Overview
Describe the agent's purpose and how it works.

## Example Use Cases
Concrete examples with expected inputs and outputs.

## Tools Required
Each tool with its purpose, API details, and authentication needs.

## Constraints & Safety Rules
Specific rules — not just generic statements.

## Success Criteria
Measurable outcomes for evaluation.

## Reference Samples
Check the Notable Samples in Phase 1 — list any that match this use case.

Optional sections for more detailed specs: Edge Cases to Handle, Architecture & Sub-Agents, Data Sources & Auth, Non-Functional Requirements.

Once you have a clear understanding, proceed to Phase 1.

Phase 1: Study Reference Samples

Ask yourself: is there a sample that can help me design this and cut time? Scan the keywords below. Multiple samples can match — clone and study all that are relevant.

# Clone a sample to study — read the key files, understand the patterns, then apply
# them to your own scaffolded project. Do NOT use `adk@<sample>` scaffolding.
git clone --filter=tree:0 --sparse https://github.com/google/adk-samples /tmp/adk-samples 2>/dev/null; \
cd /tmp/adk-samples && git sparse-checkout add python/agents/<sample-name>
  • ambient-expense-agent — Agent that runs on a schedule or reacts to events, with no interactive user. Keywords: scheduled, cron, daily, pubsub, event-driven, alerts, email, ambient Key files: expense_agent/fast_api_app.py, expense_agent/agent.py, expense_agent/config.py, terraform/
  • adk-ae-oauth — Agent with OAuth 2.0 user consent, deployed to Agent Runtime with Gemini Enterprise. Keywords: OAuth, authentication, user consent, Google Drive, Agent Runtime, Gemini Enterprise Key files: README.md, adk_ae_oauth/tools.py, adk_ae_oauth/auths.py
  • genmedia-for-commerce — Full-stack agent with React UI, MCP tools, media/image handling, and Gemini Enterprise registration. Keywords: MCP, media, video generation, Veo, virtual try-on, retail, full-stack, React, Gemini Enterprise Key files: genmedia4commerce/agent.py, genmedia4commerce/agent_utils.py, genmedia4commerce/fast_api_app.py
  • deep-search — Research agent that iterates until quality is met, with source citations. Keywords: research, citations, iterative, grounding, multi-agent, human-in-the-loop, web search, report Key files: app/agent.py, app/config.py
  • safety-plugins — Reusable safety guardrails that plug into any agent runner. Keywords: safety, guardrails, model armor, filters Key files: safety_plugins/plugins/model_armor.py, safety_plugins/plugins/agent_as_a_judge.py, safety_plugins/main.py
  • data-science — Agent that executes code in a managed sandbox for data analysis. Keywords: SQL, BigQuery, code execution, sandbox Key files: data_science/sub_agents/analytics/agent.py
  • memory-bank — Conversational agent with cross-session memory via Memory Bank (Cloud Run and Agent Runtime). Keywords: memory, cross-session, recall, context, remember, Memory Bank Key files: app/agent.py, app/agent_runtime_app.py, app/fast_api_app.py

If no sample matches, proceed to Phase 2. But first — are you sure? Re-read the user's request and compare it against the keywords above. Skipping a matching sample means rebuilding patterns that already exist.

IMPORTANT — Exit criteria: After studying a sample, ask yourself: can I apply anything from this sample to help me deliver the design? Note what you'll reuse before moving on. Do NOT proceed until you've answered this.

This list is useful at any phase — revisit it when you hit deployment, publishing, or infrastructure questions. A sample's Terraform or registration pattern may be exactly what you need later.

Phase 2: Scaffold (if needed)

Use /google-agents-cli-scaffold to create a new project or import an existing one into the agents-cli format (adding deployment, CI/CD, infrastructure). It covers architecture choices (deployment target, agent type, session storage) and project creation or enhancement.

Skip this phase if the project was already created or enhanced by agents-cli — run agents-cli info from the project root to check.

Phase 3: Build and Implement

Implement the agent logic:

  1. Write/modify code in the agent directory (check GEMINI.md / CLAUDE.md for directory name)
  2. Quick smoke test: Use agents-cli run "your prompt" to verify the agent works after changes — this is the fastest way to check behavior without leaving the terminal
  3. Iterate on the implementation based on user feedback

If the user asks for interactive testing, suggest agents-cli playground — it opens a web-based playground for manual conversation with the agent.

For ADK API patterns and code examples, use /google-agents-cli-adk-code.

NEVER write pytest tests that assert on LLM output content (e.g., checking for keywords in responses, verifying persona, validating tone). LLM outputs are non-deterministic — these tests are flaky by nature and belong in eval, not pytest. Use agents-cli run for quick checks and agents-cli eval generate followed by agents-cli eval grade for systematic validation.

Phase 3.5: Provision Datastore (RAG projects only)

For agentic_rag projects, provision the datastore before testing: agents-cli infra datastore, then agents-cli data-ingestion. Use infra datastorenot infra single-project (same datastore provisioning but faster, skips unrelated Terraform).

Phase 4: Evaluate

This is the most important phase. Evaluation validates agent behavior end-to-end.

MANDATORY: Activate /google-agents-cli-eval before running evaluation. It contains the dataset schema, config format, and critical gotchas. Do NOT skip this.

Do NOT skip this phase. After building the agent, you MUST proceed to evaluation. Do NOT write pytest tests to validate agent behavior — that is what eval is for.

uv run pytest vs agents-cli eval — know the difference:

  • uv run pytest — Tests code correctness: imports work, functions return expected types, API contracts hold. Does NOT test whether the agent behaves well.
  • agents-cli eval — Tests agent behavior: response quality, tool usage, persona consistency, safety compliance. This is what validates your agent actually works.
  • agents-cli run "prompt" — Quick one-off smoke test during development. If testing multiple prompts use the --start-server option to persist the local server, which reduces overhead for repeated calls and allows resuming local sessions via --session-id. Use this for fast iteration, not pytest.

NEVER write pytest tests that check LLM response content (e.g., asserting pirate keywords appear, checking if the agent mentions allergies). LLM outputs are non-deterministic. Use eval with LLM-as-judge criteria instead.

  1. Start small: Begin with 1-2 sample eval cases, not a full suite
  2. Run evaluations: agents-cli eval run (chains generate + grade). For debugging or custom trace locations, use the two-step form: agents-cli eval generate then agents-cli eval grade.
  3. Discuss results with the user
  4. Fix issues and iterate on the core cases first
  5. Only after core cases pass, add edge cases and new scenarios
  6. Repeat until quality thresholds are met

Expect 5-10+ iterations here.

Phase 5: Deploy

Once evaluation thresholds are met:

  1. Check if the project has a deployment target configured — run agents-cli info to see current config
  2. If the project is a prototype (no deployment target), add deployment support first:
    agents-cli scaffold enhance . --deployment-target <target>
    
    See /google-agents-cli-deploy for the deployment target decision matrix (Agent Runtime vs Cloud Run vs GKE).
  3. Deploy when ready: agents-cli deploy

IMPORTANT: Never deploy without explicit human approval.

Phase 6: Publish (optional)

Not all agents require this — currently supporting Gemini Enterprise. See /google-agents-cli-publish for registration modes, flags, and troubleshooting.

Phase 7: Observe

After deploying, use observability tools to monitor agent behavior in production. See /google-agents-cli-observability for Cloud Trace, prompt-response logging, BigQuery Analytics, and third-party integrations.


Operational Guidelines for Coding Agents

Common Shortcuts to Resist

Agents routinely skip steps with plausible-sounding excuses. Recognize these and push back:

ShortcutWhy it fails
"The user's request is clear enough, no need to clarify"You're guessing at requirements. Phase 0 exists to confirm intent before scaffolding — even one question can prevent a full rework.
"The agent responded correctly in agents-cli run, so eval isn't needed"One prompt is not a test suite. Eval catches regressions, edge cases, and tool trajectory issues that a single run never will.
"I'll use a newer/better model"The scaffolded model was chosen deliberately. Changing it without being asked violates code preservation (Principle 1) and often breaks things — wrong location, deprecated version, or 404. Your training data is likely out of date — rely on the skills and the model listing command, not your knowledge of model names.
"I can skip the scaffold and set up manually"Manual setup misses eval boilerplate, CI/CD config, and project configuration manifest conventions. Use agents-cli create even for quick experiments.

Principle 1: Code Preservation & Isolation

Code modifications require surgical precision — alter only the code segments directly targeted by the user's request and strictly preserve all surrounding and unrelated code.

Mandatory Pre-Execution Verification:

Before finalizing any code replacement, verify the following:

  1. Target Identification: Clearly define the exact lines or expressions to change, based solely on the user's explicit instructions.
  2. Preservation Check: Confirm that all code, configuration values (e.g., model, version, api_key), comments, and formatting outside the identified target remain identical.

Example:

  • User Request: "Change the agent's instruction to be a recipe suggester."
  • Incorrect (VIOLATION):
    root_agent = Agent(
        name="recipe_suggester",
        model="gemini-1.5-flash",  # UNINTENDED - model was not requested to change
        instruction="You are a recipe suggester."
    )
    
  • Correct (COMPLIANT):
    root_agent = Agent(
        name="recipe_suggester",  # OK, related to new purpose
        model="gemini-flash-latest",  # PRESERVED
        instruction="You are a recipe suggester."  # OK, the direct target
    )
    

Principle 2: Execution Best Practices

  • Model Selection — CRITICAL:

    • NEVER change the model unless explicitly asked.
    • When creating NEW agents (not modifying existing), use the latest Gemini model. List available models to pick the newest one:
      # Use 'global' or any supported region (e.g. 'us-east1')
      uv run --with google-genai python -c "
      from google import genai
      client = genai.Client(vertexai=True, location='global')
      for m in client.models.list(): print(m.name)
      "
      
    • Do NOT use older models unless explicitly requested. For model docs, fetch https://adk.dev/agents/models/google-gemini/index.md. See also stable model versions.
  • Running Python Commands:

    • Always use uv to execute Python commands (e.g., uv run python script.py)
    • Run uv sync before executing scripts
  • Breaking Infinite Loops:

    • Stop immediately if you see the same error 3+ times in a row
    • RED FLAGS: Lock IDs incrementing, names appending v5→v6→v7, "I'll try one more time" repeatedly
    • State conflicts (Error 409): Use terraform import instead of retrying creation
    • When stuck: Run underlying commands directly (e.g., terraform CLI)
  • Troubleshooting:

    • Check /google-agents-cli-adk-code first — it covers most common patterns
    • Use WebFetch on URLs from the ADK docs index (curl https://adk.dev/llms.txt) for deep dives
    • When encountering persistent errors, a targeted web search often finds solutions faster
    • CLI command failures: run agents-cli <command> --help — the output ends with a Source: line pointing to the exact source file implementing that command. Read it to understand the logic and diagnose failures. Use agents-cli info to get the full CLI install path if you need to browse across multiple files.

Systematic Debugging

When something breaks, follow this sequence — don't skip steps or shotgun fixes:

  1. Reproduce — Run the exact command that failed. Save the full error output. If you can't reproduce it, you can't fix it.
  2. Localize — Narrow the cause: is it the agent code, a tool, the config, or the environment? Use agents-cli run "prompt" to isolate agent behavior from deployment issues. Add -v (--verbose) to print the full JSON event payloads — useful for inspecting tool calls, intermediate steps, and silent failures.
  3. Fix one thing — Change one variable at a time. If you change the instruction AND the tool AND the config simultaneously, you won't know what fixed it (or what broke something else).
  4. Verify — Rerun the exact reproduction command. Don't assume the fix worked.
  5. Guard — If it was a non-obvious bug, add an eval case to catch regressions.

Stop-the-line rule: If a change breaks something that was working, stop feature work and fix the regression first. Don't push forward hoping to circle back — regressions compound.

  • Environment Variables:
    • .env files and env var assignments (e.g., GOOGLE_CLOUD_PROJECT, GOOGLE_CLOUD_LOCATION) are typically required for the agent to function — never remove or modify them unless the user explicitly asks
    • If a .env file exists in the project root, treat it as essential configuration
    • For secrets and API keys, prefer GCP Secret Manager over plain .env entries — see /google-agents-cli-deploy for secret management guidance

Using a Temporary Scaffold as Reference

When you need specific infrastructure files (Terraform, CI/CD, Dockerfile) but don't want to modify the current project, use /google-agents-cli-scaffold to create a temporary project in /tmp/ and copy over what you need.


Reference Files

FileContents
references/internals.mdUnderlying tools and commands that agents-cli wraps (adk, pytest, ruff, uvicorn)

Development Commands

Setup & Skills

CommandPurpose
agents-cli setupInstall skills to coding agents
agents-cli setup --skip-authInstall skills, skip authentication step
agents-cli setup --dry-runPreview what setup would do without executing
agents-cli updateReinstall/update skills to latest version

Scaffolding

CommandPurpose
agents-cli scaffold create <name>Create a new project
agents-cli scaffold enhance .Add deployment / CI-CD to project
agents-cli scaffold upgradeUpgrade project to newer agents-cli version

Development

CommandPurpose
agents-cli playgroundInteractive local testing (ADK web playground)
agents-cli run "prompt"Run agent with a single prompt (non-interactive). Add -v for full JSON event payloads.
agents-cli lintCheck code quality
agents-cli lint --fixAuto-fix linting issues
agents-cli lint --mypyAlso run mypy type checking
agents-cli installInstall project dependencies (uv sync)

Evaluation

CommandPurpose
agents-cli eval dataset synthesizeSynthesize multi-turn eval scenarios for your agent (cold-start a dataset)
agents-cli eval generateRun agent inference over the default dataset, produce traces
agents-cli eval generate --dataset PATHRun inference for a specific dataset
agents-cli eval gradeGrade traces with the metrics in eval_config.yaml
agents-cli eval grade --metrics METRICGrade with a specific metric (overrides eval_config.yaml)
agents-cli eval metric listList built-in metrics available in the SDK
agents-cli eval compare BASE CANDCompare two grade-results files (regression check)
agents-cli eval analyze --eval-result RESULTSCluster failure modes from a grade-results file
agents-cli eval optimizeAuto-tune agent prompts using eval data
agents-cli eval submit --dataset D --dest gs://BUCKETSubmit a managed cloud-side eval run on the Vertex AI Eval Service
agents-cli eval results --run-id IDFetch status/results of a submitted cloud eval run

Deployment & Infrastructure

CommandPurpose
agents-cli deployDeploy to dev (requires human approval)
agents-cli infra single-projectProvision single-project GCP infrastructure without CI/CD (Terraform, optional)
agents-cli infra cicdSet up CI/CD pipeline + staging/prod infrastructure
agents-cli publish gemini-enterpriseRegister agent with Gemini Enterprise

Project Info

CommandPurpose
agents-cli infoShow CLI install path, skills location, and project config

Use agents-cli info to discover the CLI install path — this is where the CLI source code lives. Read files under that path to understand CLI internals, command implementations, or template logic. The command only shows project details when run inside a generated agent project (i.e., one with agents-cli-manifest.yaml in the project root directory).

Authentication

CommandPurpose
agents-cli login --interactiveAuthenticate with Google for ADK services (-i / --interactive is required for interactive browser-based authentication)
agents-cli login --statusShow authentication status

[!NOTE] When using an API key to authenticate, the login command does not persist them automatically, it just aids in retrieving them and providing instructions on how they can be persisted.


Skills Version

Troubleshooting hint: If skills seem outdated or incomplete, reinstall:

agents-cli setup --skip-auth

Only do this when you suspect stale skills are causing problems.


Related Skills

  • /google-agents-cli-scaffold — Project creation, requirements gathering, and enhancement
  • /google-agents-cli-adk-code — ADK Python API quick reference and production sample agents
  • /google-agents-cli-eval — Evaluation methodology, dataset schema, and the eval-fix loop
  • /google-agents-cli-deploy — Deployment targets, CI/CD pipelines, and production workflows
  • /google-agents-cli-publish — Gemini Enterprise registration
  • /google-agents-cli-observability — Cloud Trace, logging, BigQuery Analytics, and third-party integrations

來自 google 的更多技能

google-agents-cli-adk-code
google
當使用者想要「撰寫代理程式碼」、「使用 ADK 建置代理程式」、「新增工具」、「建立回呼」、「定義代理程式」、「使用狀態管理」,或需要 ADK(代理程式開發套件)Python API 模式與程式碼範例時,應使用此技能。屬於 Google ADK 技能套件的一部分。提供代理程式類型、工具定義、編排模式、回呼與狀態管理的快速參考。請勿用於建立新專案(請使用 google-agents-cli-scaffold)或部署...
developmentapicode-review
google-agents-cli-eval
google
當使用者想要「執行評估」、「評估我的 ADK 代理」、「撰寫評估資料集」、「分析評估失敗原因」、「比較評估結果」、「優化代理」,或需要 Agent Platform 評估方法論與品質飛輪的指引時,應使用此技能。涵蓋評估指標、資料集結構、LLM 作為評審的評分方式,以及常見的失敗原因。請勿用於 API 程式碼模式(請使用 google-agents-cli-adk-code)、部署(請使用 google-agents-cli-deploy)或專案架構生成(請使用...
developmenttestingdata-analysis
google-agents-cli-deploy
google
當使用者想要「部署代理程式」、「部署我的ADK代理程式」、「設定CI/CD」、「設定機密」、「疑難排解部署」,或需要關於Agent Runtime、Cloud Run或GKE部署目標的指引時,應使用此技能。涵蓋部署工作流程、服務帳戶、回滾及生產基礎架構。屬於Google ADK(代理程式開發套件)技能套件的一部分。請勿用於API程式碼模式(請使用google-agents-cli-adk-code)、評估(請使用google-agents-cli-eval)或...
developmentdevops
google-agents-cli-scaffold
google
This skill should be used when the user wants to "create an agent project", "start a new ADK project", "build me a new agent", "add CI/CD to my project", "add deployment", "enhance my project", or "upgrade my project". Part of the Google ADK (Agent Development Kit) skills suite. Covers `agents-cli scaffold create`, `scaffold enhance`, and `scaffold upgrade` commands, template options, deployment targets, and the prototype-first workflow. Do NOT use for writing agent code (use...
developmentdevops
google-agents-cli-observability
google
此技能應在使用者想要「設定追蹤」、「監控我的 ADK 代理程式」、「設定記錄」、「加入可觀測性」、「偵錯正式環境流量」,或需要關於監控已部署 ADK(代理程式開發套件)代理程式的指引時使用。涵蓋 Cloud Trace、提示-回應記錄、BigQuery 代理程式分析、第三方整合(AgentOps、Phoenix、MLflow 等)以及疑難排解。屬於 Google ADK(代理程式開發套件)技能套件的一部分。請勿用於部署設定(請使用...
developmentdevopsapi
google-agents-cli-publish
google
當使用者想要「發布代理程式」、「發布我的ADK代理程式」、「向Gemini Enterprise註冊代理程式」、「發布到Gemini Enterprise」,或需要關於agents-cli publish gemini-enterprise指令的指引時,應使用此技能。涵蓋ADK與A2A註冊模式、程式化與互動式使用方式、旗標參考、從部署中繼資料自動偵測,以及疑難排解。屬於Google ADK(代理程式開發套件)技能套件的一部分。請勿用於部署(請使用...
developmentdevopsapi