pixserp

Cited live-web search for AI agents — web, news, places, shopping, flights, hotels, YouTube — one MCP tool.

Use these docs in your AI agent

Copy the URL and paste it into ChatGPT, Claude, Cursor, or any LLM tool with web access. The full reference lives at one stable URL — your agent fetches it directly, you don't have to paste kilobytes of markdown.

pixserp.com/docs.md

Quickstart

Pixserp is an OpenAI-compatible AI search API. Drop-in for the official openai SDK in any language — set base_url, pick a pixserp-* model, ship. Or use plain HTTP if you prefer curl.

A first call in under a minute. Point the SDK at https://pixserp.com/api/v1, send a user message, get back an answer with inline [1] citations and a structured message.citations array.

  1. Create an API key from your dashboard.
  2. Install the OpenAI SDK for your language (or skip and curl).
  3. Run the snippet below.

Python

```python
from openai import OpenAI

client = OpenAI(
    api_key="pxs_…",
    base_url="https://pixserp.com/api/v1",
)

r = client.chat.completions.create(
    model="pixserp-fast",
    messages=[{"role": "user", "content": "NYC congestion pricing 2026 update"}],
)

print(r.choices[0].message.content)
print(r.choices[0].message.citations)
```

Heads up: new accounts start with $2.50 of free credit. The first call charges $0.0025 against that balance — no card required to start.

Authentication

Pass your key as Authorization: Bearer <key> — the standard OpenAI header. The legacy X-API-KEY header is also accepted for backwards compatibility.

Authorization: Bearer pxs_xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx

  • Keys are 40-hex-char secrets, displayed once at creation. Store them in env vars or a secrets manager — never commit to git or ship to client-side code.
  • Rotate or revoke from your dashboard any time.
  • We store only a SHA-256 hash on our end. If you lose the secret, generate a new one.
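The same Bearer header works with any raw HTTP client, not just the OpenAI SDK. A minimal sketch using Python's standard library (the key is a placeholder; nothing is sent — the request object is only constructed):

```python
import urllib.request

# Placeholder key — real keys are 40 hex chars after the pxs_ prefix.
API_KEY = "pxs_xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"

req = urllib.request.Request(
    "https://pixserp.com/api/v1/models",
    headers={"Authorization": f"Bearer {API_KEY}"},
)

# On the wire this is exactly: Authorization: Bearer <key>
print(req.get_header("Authorization"))
```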

Models

Four logical models with different price/effort trade-offs. Pick via the standard model field.

| Model | Best for | Price |
|---|---|---|
| pixserp-fast | Quick lookups, minimal latency, 1 search | $0.0015 |
| pixserp-standard | Default — balanced research, key facts verified | $0.0025 |
| pixserp-deep | Thorough cross-referenced research, multi-angle | $0.0035 |
| pixserp-agent | Multi-step research agent — runs deep rounds on different angles, decides when it has enough | $0.0035 / step |

Fast/standard/deep are flat per-request. pixserp-agent bills per step actually run — default 50 steps, configurable up to 100 via extra_body={"max_steps": N}. The model decides when to stop early. The list of available models is also exposed at GET https://pixserp.com/api/v1/models.
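As a back-of-envelope check on the pricing above, a tiny hypothetical helper (not part of any SDK) that estimates a call's cost:

```python
# Prices from the table above: flat per request, except the agent,
# which bills per step actually executed.
PRICES = {
    "pixserp-fast": 0.0015,
    "pixserp-standard": 0.0025,
    "pixserp-deep": 0.0035,
    "pixserp-agent": 0.0035,  # per step
}

def estimate_cost(model: str, steps_run: int = 1) -> float:
    if model == "pixserp-agent":
        return PRICES[model] * steps_run
    return PRICES[model]

print(estimate_cost("pixserp-standard"))   # flat per request
print(estimate_cost("pixserp-agent", 30))  # 30 executed steps
```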

Using the agent

The agent runs deep research rounds in a loop. Each round explores a different angle of the question; an internal orchestrator decides whether to continue or synthesize. Pass max_steps to cap the loop (default 50, hard cap 100). You only pay for steps actually executed — the model often stops well before the cap.

Python

```python
r = client.chat.completions.create(
    model="pixserp-agent",
    messages=[{"role": "user", "content": "What's driving NYC office vacancy in 2026 and which neighborhoods are bouncing back?"}],
    extra_body={"max_steps": 30},
)
```

Chat Completions

POST https://pixserp.com/api/v1/chat/completions — the OpenAI Chat Completions API.

Request body

| Field | Type | Description |
|---|---|---|
| model | string | pixserp-fast / pixserp-standard / pixserp-deep / pixserp-agent |
| messages | array | OpenAI message array. We use the last user message as the search query. |
| stream | boolean (default false) | Server-sent event chunks when true. |
| response_format | object | json_object or json_schema for structured output. |
| depth | string (optional) | Override the model tier without changing the model id (pixserp extension). |
| max_steps | number (optional) | Agent only — caps the loop at N steps. Default 50, max 100. |

Response

Standard OpenAI shape, with message.citations as a pixserp extension carrying the structured cards behind the inline [n] markers.

```json
{
  "id": "chatcmpl-…",
  "object": "chat.completion",
  "created": 1746576000,
  "model": "pixserp-fast",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "NYC congestion pricing took effect Jan 5, 2025 [1]…",
        "citations": [
          {"id": "1", "kind": "web", "url": "https://digital-strategy.ec.europa.eu/…", "title": "Regulatory framework for AI", "snippet": "…"},
          {"id": "3", "kind": "news", "url": "https://reuters.com/…", "title": "…"}
        ]
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {"prompt_tokens": 412, "completion_tokens": 187, "total_tokens": 599}
}
```

Useful response headers: x-cost-usd, x-pixserp-tool-calls, x-ratelimit-remaining, x-ratelimit-reset.
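A sketch of interpreting those headers client-side (with the openai Python SDK you can reach raw headers via client.chat.completions.with_raw_response; here the header names come from the list above and the sample values are made up):

```python
import time

# Illustrative header values — not real API output.
headers = {
    "x-cost-usd": "0.0025",
    "x-pixserp-tool-calls": "3",
    "x-ratelimit-remaining": "12",
    "x-ratelimit-reset": "1746576042",  # unix epoch seconds
}

cost = float(headers["x-cost-usd"])
remaining = int(headers["x-ratelimit-remaining"])
# Seconds until the rate-limit window resets (never negative).
wait_s = max(0, int(headers["x-ratelimit-reset"]) - int(time.time()))
print(f"spent ${cost:.4f}, {remaining} requests left in this window")
```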

Responses API

POST https://pixserp.com/api/v1/responses — OpenAI's newer single-turn pattern. Pick this if your codebase has migrated to client.responses.create(); otherwise /chat/completions is the more familiar surface.

```python
r = client.responses.create(
    model="pixserp-fast",
    input="Summarize the latest CRISPR developments",
)

print(r.output_text)
```

Citations come back as Responses-API url_citation annotations:

```python
for ann in r.output[0].content[0].annotations:
    print(ann["url"], ann["title"])
```

Streaming

Set stream: true to receive answer tokens as they're generated. The SSE wire is OpenAI-standard: each data: line carries a chat.completion.chunk, terminated by data: [DONE].

```python
stream = client.chat.completions.create(
    model="pixserp-fast",
    messages=[{"role": "user", "content": "Top-rated ramen near East Village, NYC"}],
    stream=True,
)

for chunk in stream:
    delta = chunk.choices[0].delta
    if delta.content:
        print(delta.content, end="", flush=True)
    # Citations land on a final delta as a structured array
    if getattr(delta, "citations", None):
        for c in delta.citations:
            print(c["url"])
```

Citations arrive on the final delta chunk before the finish_reason: "stop" chunk — accumulate them as the stream completes.

Agent progress events

When streaming with pixserp-agent, set extra_body={"pixserp_emit_progress": true} to receive loop-progress events inline on delta.pixserp_event. Standard OpenAI clients ignore unknown delta fields, so this is safe to enable. Useful for rendering a live trace of the agent's reasoning.

| Event type | Fired | Payload |
|---|---|---|
| agent_step_start | Beginning of each step | { step, max, sub_query } |
| agent_orchestrator_decision | After each step (except the final one) | { cycle, decision, rationale, next_query } |
| agent_loop_done | Loop terminates, before synthesis | { reason, cycles_run } |

reason values: orchestrator_done (model decided), no_new_domains (anti-loop bail), step_cap (hit max_steps), no_next_query (orchestrator returned no follow-up).
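Those payloads map naturally onto a live trace. A sketch of a renderer (in a real stream you'd read each event from getattr(delta, "pixserp_event", None); the "type" key and sample values here are illustrative assumptions, not confirmed wire format):

```python
# Hypothetical event shapes, following the payload table above.
def render(event: dict) -> str:
    t = event.get("type")
    if t == "agent_step_start":
        return f"[{event['step']}/{event['max']}] searching: {event['sub_query']}"
    if t == "agent_orchestrator_decision":
        return f"  {event['decision']}: {event['rationale']} -> {event['next_query']}"
    if t == "agent_loop_done":
        return f"done after {event['cycles_run']} cycles ({event['reason']})"
    return ""

sample = [
    {"type": "agent_step_start", "step": 1, "max": 30, "sub_query": "NYC office vacancy rate 2026"},
    {"type": "agent_loop_done", "reason": "orchestrator_done", "cycles_run": 1},
]
for e in sample:
    print(render(e))
```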

Citations

Every fact in the answer is grounded to a result the agent fetched. Citations live in two complementary places:

  • Inline markers in the prose: [1], [2], etc. — Perplexity-style, placed immediately after the fact they support.
  • Structured array on the message:
    • Chat Completions → message.citations
    • Responses API → output[0].content[0].annotations (each is a url_citation with start_index / end_index pinning it to the span in the text)

Each citation entry carries a kind (web, news, place, shopping, flight, hotel, video, transcript, image, webpage) plus per-kind structured fields — rating, price, hours, GPS, etc. — so renderers can show rich cards instead of bare links.

```json
// One element from message.citations
{
  "id": "1",
  "kind": "place",
  "title": "Ippudo NY",
  "rating": 4.5,
  "address": "65 4th Ave, New York, NY 10003",
  "url": "https://www.google.com/maps/place/…",
  "markdown": "Ippudo NY — 4.5★ · 65 4th Ave · New York"
}
```
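A renderer can branch on kind and fall back to the ready-made markdown field for kinds it doesn't know. A hypothetical sketch using a place citation like the one above:

```python
# Sample citation dict in the documented shape.
citation = {
    "id": "1",
    "kind": "place",
    "title": "Ippudo NY",
    "rating": 4.5,
    "address": "65 4th Ave, New York, NY 10003",
    "url": "https://www.google.com/maps/place/…",
    "markdown": "Ippudo NY — 4.5★ · 65 4th Ave · New York",
}

def render(c: dict) -> str:
    # Rich card for places; generic fallback for every other kind.
    if c["kind"] == "place":
        return f"{c['title']} ({c['rating']}★) · {c['address']}"
    return c.get("markdown") or f"[{c['id']}] {c['title']}: {c['url']}"

print(render(citation))
```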

Structured outputs

Pass response_format with a JSON schema and the agent fills it with web-grounded values. Drop straight into typed code without parsing or validation gymnastics.

```python
import json

r = client.chat.completions.create(
    model="pixserp-fast",
    messages=[{"role": "user", "content": "Top 3 aerospace companies, CEO, founded year"}],
    response_format={
        "type": "json_schema",
        "json_schema": {
            "name": "companies",
            "schema": {
                "type": "object",
                "properties": {
                    "companies": {
                        "type": "array",
                        "items": {
                            "type": "object",
                            "properties": {
                                "name": {"type": "string"},
                                "ceo": {"type": "string"},
                                "founded_year": {"type": "integer"},
                            },
                            "required": ["name", "ceo", "founded_year"],
                        },
                    },
                },
                "required": ["companies"],
            },
        },
    },
)

data = json.loads(r.choices[0].message.content)
for c in data["companies"]:
    print(c["name"], "-", c["ceo"], "-", c["founded_year"])
```

When response_format is set, the answer comes back as JSON only — no prose, no markdown fences. The agent searches the web first, then formats its findings into your schema.

Errors

Errors come back in OpenAI shape:

```json
{
  "error": {
    "message": "Invalid or missing API key. Pass it as Authorization: Bearer <key>.",
    "type": "authentication_error",
    "code": "invalid_api_key",
    "param": null
  }
}
```

| Status | type | When |
|---|---|---|
| 400 | invalid_request_error | Missing or malformed body / no user message. |
| 401 | authentication_error | Missing, invalid, or revoked API key. |
| 402 | insufficient_quota | Balance below the request's flat cost. Top up at /billing. |
| 429 | rate_limit_error | RPS cap for your tier exceeded. Retry after x-ratelimit-reset. |
| 502 | api_error | Upstream failure (LLM / search provider). Safe to retry. |
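A generic retry sketch for the two retryable statuses (429 and 502). It assumes only that the raised exception carries a status_code attribute, as the openai SDK's APIStatusError does:

```python
import time

def with_retries(call, attempts=3, base_delay=1.0):
    for i in range(attempts):
        try:
            return call()
        except Exception as e:
            status = getattr(e, "status_code", None)
            # Re-raise immediately for non-retryable errors or on the last attempt.
            if status not in (429, 502) or i == attempts - 1:
                raise
            time.sleep(base_delay * 2 ** i)  # exponential backoff

# Demo with a stand-in for a flaky upstream: fails twice with 502, then succeeds.
class Upstream(Exception):
    status_code = 502

state = {"calls": 0}

def flaky():
    state["calls"] += 1
    if state["calls"] < 3:
        raise Upstream("bad gateway")
    return "ok"

print(with_retries(flaky, base_delay=0.0))  # -> ok
```

In production you would also honor x-ratelimit-reset on 429s instead of a blind backoff.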

Rate limits

Per-second cap, scaled by trailing 30-day spend. New accounts start at the lowest tier (Tier 1, 5 RPS) and step up automatically as payments land.

| Tier | RPS | Trailing 30d paid |
|---|---|---|
| Tier 1 | 5 | $0+ |
| Tier 2 | 15 | $300+ |
| Tier 3 | 50 | $1,000+ |
| Tier 4 | 300 | $5,000+ |
| Tier 5 | 1,000 | $20,000+ |

Every response carries the live state in headers:

```
x-ratelimit-tier: Tier 2
x-ratelimit-limit: 15
x-ratelimit-remaining: 12
x-ratelimit-reset: 1746576042
```

Need a higher cap fast? Email [email protected] with your use case.

MCP server

POST https://pixserp.com/api/v1/mcp — Model Context Protocol endpoint over Streamable HTTP. Adds pixserp as a tool to any MCP-compatible client: Claude Desktop, Cursor, Zed, Claude Code, Cline, Continue. Your AI assistant calls the search tool whenever it needs live web results with citations.

The same pipeline that powers /chat/completions — same answers, same citations, same billing. No new endpoint to learn if you already use pixserp via the OpenAI SDK; just a different transport for clients that speak MCP instead of REST.

Install

Paste the snippet for your client, replace the API key, restart. pixserp appears as a tool named search.

Claude Desktop

```json
// ~/Library/Application Support/Claude/claude_desktop_config.json (macOS)
// or %APPDATA%\Claude\claude_desktop_config.json (Windows)
{
  "mcpServers": {
    "pixserp": {
      "command": "npx",
      "args": [
        "-y", "mcp-remote",
        "https://pixserp.com/api/v1/mcp",
        "--header", "Authorization: Bearer pxs_…"
      ]
    }
  }
}
```

Tool reference

Single tool exposed today. Schema is advertised via tools/list — clients pick it up automatically.

| Field | Type | Description |
|---|---|---|
| query | string (required) | What to search for. Natural-language question or topic. |
| model | enum (optional) | pixserp-fast (default) / pixserp-standard / pixserp-deep / pixserp-agent. |
| max_steps | integer (optional) | Only honored for pixserp-agent. Caps the research loop. Default 50, max 100. |

Tool result shape (returned in the tools/call response):

```json
{
  "content": [
    {
      "type": "text",
      "text": "NYC congestion pricing took effect Jan 5, 2025 [1]…\n\nSources:\n[1] MTA — https://new.mta.info/…"
    }
  ],
  "structuredContent": {
    "answer": "NYC congestion pricing took effect …",
    "citations": [
      { "id": "1", "kind": "news", "url": "https://reuters.com/…", "title": "…" }
    ],
    "model": "pixserp-fast",
    "cost_usd": 0.0015
  },
  "isError": false
}
```
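For clients speaking JSON-RPC directly, a tools/call request body might look like the following sketch (the method and params shapes follow the standard MCP specification; POST it to /api/v1/mcp with the same Authorization header):

```json
{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/call",
  "params": {
    "name": "search",
    "arguments": {
      "query": "NYC congestion pricing 2026 update",
      "model": "pixserp-fast"
    }
  }
}
```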

Auth, billing & rate limits

  • Same API key as the REST endpoints — Authorization: Bearer pxs_….
  • Each tools/call bills as a chat.completions request at the chosen model's price. MCP is transport, not a separate billable surface.
  • Rate-limit tiers apply identically — your RPS cap is shared across REST and MCP traffic.
  • Auth errors surface as JSON-RPC errors with custom codes: -32001 (auth), -32002 (rate-limited), -32003 (insufficient quota). Search failures come back as a tools/call result with isError: true so the calling LLM can reason about them.

Protocol version: 2025-06-18. Stateless transport (no session id) — every request is independent. notifications/initialized and notifications/cancelled are acknowledged silently.

Framework integrations

Anything that speaks OpenAI Chat Completions speaks pixserp — point its base_url / apiBase at us, set the model id, done. No wrapper SDK to install, no per-framework adapter to maintain.

LangChain

```python
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(
    model="pixserp-fast",
    api_key="pxs_…",
    base_url="https://pixserp.com/api/v1",
)

answer = llm.invoke("NYC congestion pricing 2026 update")
print(answer.content)
```

Other tools that work the same way: n8n & Make (use the OpenAI node, swap base URL), the Vercel AI SDK streamText helper, Mastra, Inkeep, Langfuse for tracing, etc.
