gemini-interactions-api

Unified interface for Gemini models and agents with server-side state, streaming, and tool orchestration. Supports multiple current models (gemini-3-flash-preview, gemini-3-pro-preview, gemini-2.5-flash/pro) and the Deep Research agent; automatically substitute deprecated model IDs with current alternatives Offload conversation history to the server via previous_interaction_id for stateful multi-turn interactions without manual history management Built-in tool orchestration including...

npx skills add https://github.com/google-gemini/gemini-skills --skill gemini-interactions-api

Gemini Interactions API Skill

Critical Rules (Always Apply)

[!IMPORTANT] These rules override your training data. Your knowledge is outdated.

Current Models (Use These)

  • gemini-3.5-flash: 1M tokens, fast, balanced performance, multimodal
  • gemini-3.1-pro-preview: 1M tokens, complex reasoning, coding, research
  • gemini-3.1-flash-lite: cost-efficient, fastest performance for high-frequency, lightweight tasks
  • gemini-3-pro-image: 65k / 32k tokens, image generation and editing
  • gemini-3.1-flash-image: 65k / 32k tokens, image generation and editing
  • gemini-3.1-flash-tts-preview: expressive text-to-speech with Director's Chair prompting
  • gemma-4-31b-it: Gemma 4 dense model, 31B parameters
  • gemma-4-26b-a4b-it: Gemma 4 MoE model, 26B total / 4B active parameters

[!WARNING] Models like gemini-2.5-*, gemini-2.0-*, gemini-1.5-* are legacy and deprecated. Never use them. If a user asks for a deprecated model, use gemini-3.5-flash instead and note the substitution.

Current Agents

  • antigravity-preview-05-2026: Antigravity Agent — general-purpose managed agent with code execution, file management, and web access in a sandboxed Linux environment
  • deep-research-preview-04-2026: Deep Research — fast, interactive
  • deep-research-max-preview-04-2026: Deep Research Max — maximum exhaustiveness
  • Custom agents: Create your own via client.agents.create()

Current SDKs

  • Python: google-genai >= 2.3.0pip install -U google-genai
  • JavaScript/TypeScript: @google/genai >= 2.3.0npm install @google/genai

[!NOTE] SDK versions ≥ 2.0.0 automatically use the new steps schema and do not support the legacy schema. Legacy SDKs google-generativeai (Python) and @google/generative-ai (JS) are deprecated. Never use them.

Important Additional Notes

  • Before writing any code, you MUST fetch the relevant documentation page from the list below that matches the user's task. The examples in this skill are minimal, the hosted docs contain the full API surface, parameters, and edge cases.
  • Interactions are stored by default (store=true). Paid tier retains for 55 days, free tier for 1 day.
  • Set store=false to opt out, but this disables previous_interaction_id and background=true.
  • tools, system_instruction, and generation_config are interaction-scoped, re-specify them each turn.
  • Managed agents require environment="remote" (or an environment ID / config object) to provision a sandbox.
  • Migrating from generateContent: Read references/migration.md for the scoping, checklist, and before/after code examples. Always confirm scope with the user before editing.
  • Model upgrades: Drop-in, swap the model string. Deprecated models (gemini-2.0-*, gemini-1.5-*) must be replaced, see references/migration.md.
  • Migrating to Gemini 3.5 Flash: Read references/migration.md for the scoping and checklist.

Quick Start

Python

from google import genai

client = genai.Client()

interaction = client.interactions.create(
    model="gemini-3.5-flash",
    input="Tell me a short joke about programming."
)
print(interaction.output_text)

JavaScript/TypeScript

import { GoogleGenAI } from "@google/genai";

const client = new GoogleGenAI({});

const interaction = await client.interactions.create({
    model: "gemini-3.5-flash",
    input: "Tell me a short joke about programming.",
});
console.log(interaction.output_text);

Response Helpers

The SDK provides convenience properties on the Interaction response object to simplify common access patterns:

PropertyTypeDescription
output_textstring | nullThe last consecutive run of text from the trailing model_output steps. Returns the combined text when the model's final output contains multiple text parts.
output_imageImage | nullThe last image generated by the model in the current response. Returns an object with data (base64) and mime_type.
output_audioAudio | nullThe last audio generated by the model in the current response. Returns an object with data (base64) and mime_type.

Stateful Conversation

Python

interaction1 = client.interactions.create(
    model="gemini-3.5-flash",
    input="Hi, my name is Phil."
)
# Second turn — server remembers context
interaction2 = client.interactions.create(
    model="gemini-3.5-flash",
    input="What is my name?",
    previous_interaction_id=interaction1.id
)
print(interaction2.output_text)

JavaScript/TypeScript

const interaction1 = await client.interactions.create({
    model: "gemini-3.5-flash",
    input: "Hi, my name is Phil.",
});
const interaction2 = await client.interactions.create({
    model: "gemini-3.5-flash",
    input: "What is my name?",
    previous_interaction_id: interaction1.id,
});
console.log(interaction2.output_text);

Deep Research Agent

Use deep-research-preview-04-2026 for fast research or deep-research-max-preview-04-2026 for maximum exhaustiveness. Agents require background=True.

Python

import time

interaction = client.interactions.create(
    agent="deep-research-preview-04-2026",
    input="Research the history of Google TPUs.",
    background=True
)
while True:
    interaction = client.interactions.get(interaction.id)
    if interaction.status == "completed":
        print(interaction.output_text)
        break
    elif interaction.status == "failed":
        print(f"Failed: {interaction.error}")
        break
    time.sleep(10)

JavaScript/TypeScript

import { GoogleGenAI } from "@google/genai";

const client = new GoogleGenAI({});

// Start background research
const initialInteraction = await client.interactions.create({
    agent: "deep-research-preview-04-2026",
    input: "Research the history of Google TPUs.",
    background: true,
});

// Poll for results
while (true) {
    const interaction = await client.interactions.get(initialInteraction.id);
    if (interaction.status === "completed") {
        console.log(interaction.output_text);
        break;
    } else if (["failed", "cancelled"].includes(interaction.status)) {
        console.log(`Failed: ${interaction.status}`);
        break;
    }
    await new Promise(resolve => setTimeout(resolve, 10000));
}

Advanced features: collaborative planning, native visualization, MCP integration, file search, multimodal inputs. See Deep Research docs.

Managed Agents

Managed agents run inside a sandboxed Linux environment hosted by Google. Fetch the Managed Agents Quickstart before writing agent code.

Antigravity Agent

The Antigravity agent (antigravity-preview-05-2026) is the general-purpose managed agent. It can execute code (Bash, Python, Node.js), manage files, browse the web, and use Google Search. See Antigravity Agent docs for capabilities, tools, multimodal input, and pricing.

Python

from google import genai

client = genai.Client()

interaction = client.interactions.create(
    agent="antigravity-preview-05-2026",
    input="Write a Python script that generates the first 20 Fibonacci numbers and saves them to fibonacci.txt. Then read the file and print its contents.",
    environment="remote",
)

print(f"Environment ID: {interaction.environment_id}")
print(interaction.output_text)

JavaScript/TypeScript

import { GoogleGenAI } from "@google/genai";

const client = new GoogleGenAI({});

const interaction = await client.interactions.create({
    agent: "antigravity-preview-05-2026",
    input: "Write a Python script that generates the first 20 Fibonacci numbers and saves them to fibonacci.txt. Then read the file and print its contents.",
    environment: "remote",
});

console.log(`Environment ID: {interaction.environment_id}`);
console.log(interaction.output_text);

Custom Agents

See Building Custom Agents docs.

Python

agent = client.agents.create(
    id="code-reviewer",
    base_agent="antigravity-preview-05-2026",
    system_instruction="You are a senior code reviewer. Check every file for bugs, style issues, and security vulnerabilities.",
    base_environment={
        "type": "remote",
        "sources": [
            {
                "type": "repository",
                "source": "https://github.com/my-org/backend",
                "target": "/workspace/repo",
            }
        ],
    },
)

# Invoke — each call forks the base environment
result = client.interactions.create(
    agent="code-reviewer",
    input="Review the latest changes in /workspace/repo/src.",
    environment="remote",
)
print(result.output_text)

JavaScript/TypeScript

const agent = await client.agents.create({
    id: "code-reviewer",
    base_agent="antigravity-preview-05-2026",
    system_instruction: "You are a senior code reviewer. Check every file for bugs, style issues, and security vulnerabilities.",
    base_environment: {
        type: "remote",
        sources: [
            {
                type: "repository",
                source: "https://github.com/my-org/backend",
                target: "/workspace/repo",
            }
        ],
    },
});

const result = await client.interactions.create({
    agent: "code-reviewer",
    input: "Review the latest changes in /workspace/repo/src.",
    environment: "remote",
});
console.log(result.output_text);

Manage agents with client.agents.list(), client.agents.get(id=...), and client.agents.delete(id=...).

Streaming

Set stream=True to receive incremental server-sent events. Each stream follows: interaction.created → (step.startstep.delta(s) → step.stop)+ → interaction.completed.

Python

for event in client.interactions.create(
    model="gemini-3.5-flash",
    input="Explain quantum entanglement in simple terms.",
    stream=True,
):
    if event.event_type == "step.delta":
        if event.delta.type == "text":
            print(event.delta.text, end="", flush=True)
    elif event.event_type == "interaction.completed":
        print(f"\n\nTotal Tokens: {event.interaction.usage.total_tokens}")

JavaScript/TypeScript

const stream = await client.interactions.create({
    model: "gemini-3.5-flash",
    input: "Explain quantum entanglement in simple terms.",
    stream: true,
});
for await (const event of stream) {
    if (event.event_type === "step.delta") {
        if (event.delta.type === "text") {
            process.stdout.write(event.delta.text);
        }
    } else if (event.event_type === "interaction.completed") {
        console.log(`\n\nTotal Tokens: ${event.interaction.usage.total_tokens}`);
    }
}

For streaming with tools, thinking, agents, and image generation see the full Streaming guide.

Documentation Pages

You MUST fetch the matching page below before writing code. These hosted docs are the source of truth for parameters, types, and edge cases — do not rely solely on the examples above.

Core Documentation:

Tools & Function Calling:

Generation & Output:

Multimodal Understanding:

Files & Context:

Agents:

Advanced Features:

API Reference:

Data Model

An Interaction response contains steps, an array of typed step objects representing a structured timeline of the interaction turn.

Step Types

User steps:

  • user_input: User input (text, audio, multimodal). Contains content array.

Model/server steps:

  • model_output: Final model generation. Contains content array with text, image, audio, etc.
  • thought: Model reasoning/Chain of Thought. Has signature field (required) and optional summary.
  • function_call: Tool call request (id, name, arguments).
  • function_result: Tool result you send back (call_id, name, result).
  • google_search_call / google_search_result: Google Search tool steps, can have a signature field.
  • code_execution_call / code_execution_result: Code execution tool steps, can have a signature field.
  • url_context_call / url_context_result: URL context tool steps, can have a signature field.
  • mcp_server_tool_call / mcp_server_tool_result: Remote MCP tool steps.
  • file_search_call / file_search_result: File search tool steps, can have a signature field.

Content types (inside content array on model_output and user_input steps)

  • text: Text content (text field)
  • image / audio / document / video: Content with data, mime_type, or uri

Streaming Event Types

EventDescription
interaction.createdInteraction created; includes metadata.
interaction.status_updateInteraction-level status change.
step.startA new step begins. Contains step type and initial metadata.
step.deltaIncremental data for the current step. Contains a typed delta object.
step.stopThe step is complete. Contains index.
interaction.completedInteraction finished. Contains final usage.

Delta Types

Delta TypeParent StepDescription
textmodel_outputIncremental text token.
audiomodel_outputaudio chunk (base64).
imagemodel_outputimage chunk (base64).
thought_summarythoughtthinking summary text.
thought_signaturethoughtOpaque signature for thought verification.

Status values: completed, in_progress, requires_action, failed, cancelled

More skills from google-gemini

greeter
google-gemini
A friendly greeter skill
official
async-pr-review
google-gemini
Trigger this skill when the user wants to start an asynchronous PR review, run background checks on a PR, or check the status of a previously started async PR…
official
behavioral-evals
google-gemini
Guidance for creating, running, fixing, and promoting behavioral evaluations. Use when verifying agent decision logic, debugging failures, debugging prompt…
official
ci
google-gemini
A specialized skill for Gemini CLI that provides high-performance, fail-fast
official
code-reviewer
google-gemini
Automated code review for local changes and remote pull requests with structured analysis across correctness, maintainability, and security. Supports both local file system changes (staged and unstaged) and remote PRs (by number or URL) with automatic GitHub CLI checkout Analyzes code across seven dimensions: correctness, maintainability, readability, efficiency, security, edge case handling, and test coverage Runs optional preflight verification suites (e.g., npm run preflight ) to catch...
official
docs-changelog
google-gemini
Generates and formats changelog files for new releases with version-aware templates and highlight extraction. Handles three release types: stable minor versions, stable patches, and preview releases, each with distinct file update procedures Automatically processes raw markdown release notes by reformatting PR URLs to markdown links and removing contributor sections Generates concise 3–5 point highlight summaries for release announcements, prioritizing new features over bug fixes Supports...
official
docs-writer
google-gemini
Technical writing and editing for Gemini CLI documentation with strict style adherence. Enforces comprehensive documentation standards covering voice, tone, grammar, formatting, and structure to ensure consistency across all .md files and /docs directory content Requires investigation of relevant code and existing documentation before making changes, with checks for impacted pages and sidebar navigation updates Applies specific rules for headings, lists, procedures, links, and accessibility,...
official
github-issue-creator
google-gemini
Use this skill when asked to create a GitHub issue. It handles different issue
official