pixserp

Cited live-web search for AI agents — web, news, places, shopping, flights, hotels, YouTube — one MCP tool.

Use these docs in your AI agent

Copy the URL and paste it into ChatGPT, Claude, Cursor, or any LLM tool with web access. The full reference lives at one stable URL — your agent fetches it directly, you don't have to paste kilobytes of markdown.

pixserp.com/docs.md

Quickstart

Pixserp is an OpenAI-compatible AI search API. Drop-in for the official openai SDK in any language — set base_url, pick a pixserp-* model, ship. Or use plain HTTP if you prefer curl.

A first call in under a minute. Point the SDK at https://pixserp.com/api/v1, send a user message, get back an answer with inline [1] citations and a structured message.citations array.

  1. Create an API key from your dashboard.
  2. Install the OpenAI SDK for your language (or skip and curl).
  3. Run the snippet below.

Python

```python
from openai import OpenAI

client = OpenAI(
    api_key="pxs_…",
    base_url="https://pixserp.com/api/v1",
)

r = client.chat.completions.create(
    model="pixserp-fast",
    messages=[{"role": "user", "content": "NYC congestion pricing 2026 update"}],
)

print(r.choices[0].message.content)
print(r.choices[0].message.citations)
```

Heads up: new accounts start with $2.50 of free credit. The first call charges $0.0025 against that balance — no card required to start.

Authentication

Pass your key as Authorization: Bearer <key> — the standard OpenAI header. The legacy X-API-KEY header is also accepted for backwards compatibility.

Authorization: Bearer pxs_xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx

  • Keys are 40-hex-char secrets, displayed once at creation. Store them in env vars or a secrets manager — never commit to git or ship to client-side code.
  • Rotate or revoke from your dashboard any time.
  • We store only a SHA-256 hash on our end. If you lose the secret, generate a new one.
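The same Bearer header works with any raw HTTP client, not just the OpenAI SDK. A minimal sketch using Python's standard library (the key is a placeholder; nothing is sent — the request object is only constructed):

```python
import urllib.request

# Placeholder key — real keys are 40 hex chars after the pxs_ prefix.
API_KEY = "pxs_xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"

req = urllib.request.Request(
    "https://pixserp.com/api/v1/models",
    headers={"Authorization": f"Bearer {API_KEY}"},
)

# On the wire this is exactly: Authorization: Bearer <key>
print(req.get_header("Authorization"))
```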

Models

Four logical models with different price/effort trade-offs. Pick via the standard model field.

| Model | Best for | Price |
|---|---|---|
| pixserp-fast | Quick lookups, minimal latency, 1 search | $0.0015 |
| pixserp-standard | Default — balanced research, key facts verified | $0.0025 |
| pixserp-deep | Thorough cross-referenced research, multi-angle | $0.0035 |
| pixserp-agent | Multi-step research agent — runs deep rounds on different angles, decides when it has enough | $0.0035 / step |

Fast/standard/deep are flat per-request. pixserp-agent bills per step actually run — default 50 steps, configurable up to 100 via extra_body={"max_steps": N}. The model decides when to stop early. The list of available models is also exposed at GET https://pixserp.com/api/v1/models.
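As a back-of-envelope check on the pricing above, a tiny hypothetical helper (not part of any SDK) that estimates a call's cost:

```python
# Prices from the table above: flat per request, except the agent,
# which bills per step actually executed.
PRICES = {
    "pixserp-fast": 0.0015,
    "pixserp-standard": 0.0025,
    "pixserp-deep": 0.0035,
    "pixserp-agent": 0.0035,  # per step
}

def estimate_cost(model: str, steps_run: int = 1) -> float:
    if model == "pixserp-agent":
        return PRICES[model] * steps_run
    return PRICES[model]

print(estimate_cost("pixserp-standard"))   # flat per request
print(estimate_cost("pixserp-agent", 30))  # 30 executed steps
```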

Using the agent

The agent runs deep research rounds in a loop. Each round explores a different angle of the question; an internal orchestrator decides whether to continue or synthesize. Pass max_steps to cap the loop (default 50, hard cap 100). You only pay for steps actually executed — the model often stops well before the cap.

Python

```python
r = client.chat.completions.create(
    model="pixserp-agent",
    messages=[{"role": "user", "content": "What's driving NYC office vacancy in 2026 and which neighborhoods are bouncing back?"}],
    extra_body={"max_steps": 30},
)
```

Chat Completions

POST https://pixserp.com/api/v1/chat/completions — the OpenAI Chat Completions API.

Request body

| Field | Type | Description |
|---|---|---|
| model | string | pixserp-fast / pixserp-standard / pixserp-deep / pixserp-agent |
| messages | array | OpenAI message array. We use the last user message as the search query. |
| stream | boolean (default false) | Server-sent event chunks when true. |
| response_format | object | json_object or json_schema for structured output. |
| depth | string (optional) | Override the model tier without changing the model id (pixserp extension). |
| max_steps | number (optional) | Agent only — caps the loop at N steps. Default 50, max 100. |

Response

Standard OpenAI shape, with message.citations as a pixserp extension carrying the structured cards behind the inline [n] markers.

```json
{
  "id": "chatcmpl-…",
  "object": "chat.completion",
  "created": 1746576000,
  "model": "pixserp-fast",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "NYC congestion pricing took effect Jan 5, 2025 [1]…",
        "citations": [
          {"id": "1", "kind": "web", "url": "https://digital-strategy.ec.europa.eu/…", "title": "Regulatory framework for AI", "snippet": "…"},
          {"id": "3", "kind": "news", "url": "https://reuters.com/…", "title": "…"}
        ]
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {"prompt_tokens": 412, "completion_tokens": 187, "total_tokens": 599}
}
```

Useful response headers: x-cost-usd, x-pixserp-tool-calls, x-ratelimit-remaining, x-ratelimit-reset.
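A sketch of interpreting those headers client-side (with the openai Python SDK you can reach raw headers via client.chat.completions.with_raw_response; here the header names come from the list above and the sample values are made up):

```python
import time

# Illustrative header values — not real API output.
headers = {
    "x-cost-usd": "0.0025",
    "x-pixserp-tool-calls": "3",
    "x-ratelimit-remaining": "12",
    "x-ratelimit-reset": "1746576042",  # unix epoch seconds
}

cost = float(headers["x-cost-usd"])
remaining = int(headers["x-ratelimit-remaining"])
# Seconds until the rate-limit window resets (never negative).
wait_s = max(0, int(headers["x-ratelimit-reset"]) - int(time.time()))
print(f"spent ${cost:.4f}, {remaining} requests left in this window")
```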

Responses API

POST https://pixserp.com/api/v1/responses — OpenAI's newer single-turn pattern. Pick this if your codebase has migrated to client.responses.create(); otherwise /chat/completions is the more familiar surface.

```python
r = client.responses.create(
    model="pixserp-fast",
    input="Summarize the latest CRISPR developments",
)

print(r.output_text)
```

Citations come back as Responses-API url_citation annotations:

```python
for ann in r.output[0].content[0].annotations:
    print(ann["url"], ann["title"])
```

Streaming

Set stream: true to receive answer tokens as they're generated. The SSE wire is OpenAI-standard: each data: line carries a chat.completion.chunk, terminated by data: [DONE].

```python
stream = client.chat.completions.create(
    model="pixserp-fast",
    messages=[{"role": "user", "content": "Top-rated ramen near East Village, NYC"}],
    stream=True,
)

for chunk in stream:
    delta = chunk.choices[0].delta
    if delta.content:
        print(delta.content, end="", flush=True)
    # Citations land on a final delta as a structured array
    if getattr(delta, "citations", None):
        for c in delta.citations:
            print(c["url"])
```

Citations arrive on the final delta chunk before the finish_reason: "stop" chunk — accumulate them as the stream completes.

Agent progress events

When streaming with pixserp-agent, set extra_body={"pixserp_emit_progress": true} to receive loop-progress events inline on delta.pixserp_event. Standard OpenAI clients ignore unknown delta fields, so this is safe to enable. Useful for rendering a live trace of the agent's reasoning.

| Event type | Fired | Payload |
|---|---|---|
| agent_step_start | Beginning of each step | { step, max, sub_query } |
| agent_orchestrator_decision | After each step (except the final one) | { cycle, decision, rationale, next_query } |
| agent_loop_done | Loop terminates, before synthesis | { reason, cycles_run } |

reason values: orchestrator_done (model decided), no_new_domains (anti-loop bail), step_cap (hit max_steps), no_next_query (orchestrator returned no follow-up).
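Those payloads map naturally onto a live trace. A sketch of a renderer (in a real stream you'd read each event from getattr(delta, "pixserp_event", None); the "type" key and sample values here are illustrative assumptions, not confirmed wire format):

```python
# Hypothetical event shapes, following the payload table above.
def render(event: dict) -> str:
    t = event.get("type")
    if t == "agent_step_start":
        return f"[{event['step']}/{event['max']}] searching: {event['sub_query']}"
    if t == "agent_orchestrator_decision":
        return f"  {event['decision']}: {event['rationale']} -> {event['next_query']}"
    if t == "agent_loop_done":
        return f"done after {event['cycles_run']} cycles ({event['reason']})"
    return ""

sample = [
    {"type": "agent_step_start", "step": 1, "max": 30, "sub_query": "NYC office vacancy rate 2026"},
    {"type": "agent_loop_done", "reason": "orchestrator_done", "cycles_run": 1},
]
for e in sample:
    print(render(e))
```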

Citations

Every fact in the answer is grounded to a result the agent fetched. Citations live in two complementary places:

  • Inline markers in the prose: [1], [2], etc. — Perplexity-style, placed immediately after the fact they support.
  • Structured array on the message:
    • Chat Completions → message.citations
    • Responses API → output[0].content[0].annotations (each is a url_citation with start_index / end_index pinning it to the span in the text)

Each citation entry carries a kind (web, news, place, shopping, flight, hotel, video, transcript, image, webpage) plus per-kind structured fields — rating, price, hours, GPS, etc. — so renderers can show rich cards instead of bare links.

```json
// One element from message.citations
{
  "id": "1",
  "kind": "place",
  "title": "Ippudo NY",
  "rating": 4.5,
  "address": "65 4th Ave, New York, NY 10003",
  "url": "https://www.google.com/maps/place/…",
  "markdown": "Ippudo NY — 4.5★ · 65 4th Ave · New York"
}
```
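A renderer can branch on kind and fall back to the ready-made markdown field for kinds it doesn't know. A hypothetical sketch using a place citation like the one above:

```python
# Sample citation dict in the documented shape.
citation = {
    "id": "1",
    "kind": "place",
    "title": "Ippudo NY",
    "rating": 4.5,
    "address": "65 4th Ave, New York, NY 10003",
    "url": "https://www.google.com/maps/place/…",
    "markdown": "Ippudo NY — 4.5★ · 65 4th Ave · New York",
}

def render(c: dict) -> str:
    # Rich card for places; generic fallback for every other kind.
    if c["kind"] == "place":
        return f"{c['title']} ({c['rating']}★) · {c['address']}"
    return c.get("markdown") or f"[{c['id']}] {c['title']}: {c['url']}"

print(render(citation))
```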

Structured outputs

Pass response_format with a JSON schema and the agent fills it with web-grounded values. Drop straight into typed code without parsing or validation gymnastics.

```python
import json

r = client.chat.completions.create(
    model="pixserp-fast",
    messages=[{"role": "user", "content": "Top 3 aerospace companies, CEO, founded year"}],
    response_format={
        "type": "json_schema",
        "json_schema": {
            "name": "companies",
            "schema": {
                "type": "object",
                "properties": {
                    "companies": {
                        "type": "array",
                        "items": {
                            "type": "object",
                            "properties": {
                                "name": {"type": "string"},
                                "ceo": {"type": "string"},
                                "founded_year": {"type": "integer"},
                            },
                            "required": ["name", "ceo", "founded_year"],
                        },
                    },
                },
                "required": ["companies"],
            },
        },
    },
)

data = json.loads(r.choices[0].message.content)
for c in data["companies"]:
    print(c["name"], "-", c["ceo"], "-", c["founded_year"])
```

When response_format is set, the answer comes back as JSON only — no prose, no markdown fences. The agent searches the web first, then formats its findings into your schema.

Errors

Errors come back in OpenAI shape:

```json
{
  "error": {
    "message": "Invalid or missing API key. Pass it as Authorization: Bearer <key>.",
    "type": "authentication_error",
    "code": "invalid_api_key",
    "param": null
  }
}
```

| Status | type | When |
|---|---|---|
| 400 | invalid_request_error | Missing or malformed body / no user message. |
| 401 | authentication_error | Missing, invalid, or revoked API key. |
| 402 | insufficient_quota | Balance below the request's flat cost. Top up at /billing. |
| 429 | rate_limit_error | RPS cap for your tier exceeded. Retry after x-ratelimit-reset. |
| 502 | api_error | Upstream failure (LLM / search provider). Safe to retry. |
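A generic retry sketch for the two retryable statuses (429 and 502). It assumes only that the raised exception carries a status_code attribute, as the openai SDK's APIStatusError does:

```python
import time

def with_retries(call, attempts=3, base_delay=1.0):
    for i in range(attempts):
        try:
            return call()
        except Exception as e:
            status = getattr(e, "status_code", None)
            # Re-raise immediately for non-retryable errors or on the last attempt.
            if status not in (429, 502) or i == attempts - 1:
                raise
            time.sleep(base_delay * 2 ** i)  # exponential backoff

# Demo with a stand-in for a flaky upstream: fails twice with 502, then succeeds.
class Upstream(Exception):
    status_code = 502

state = {"calls": 0}

def flaky():
    state["calls"] += 1
    if state["calls"] < 3:
        raise Upstream("bad gateway")
    return "ok"

print(with_retries(flaky, base_delay=0.0))  # -> ok
```

In production you would also honor x-ratelimit-reset on 429s instead of a blind backoff.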

Rate limits

Per-second cap, scaled by trailing 30-day spend. New accounts start at the lowest tier (Tier 1, 5 RPS) and step up automatically as payments land.

| Tier | RPS | Trailing 30d paid |
|---|---|---|
| Tier 1 | 5 | $0+ |
| Tier 2 | 15 | $300+ |
| Tier 3 | 50 | $1,000+ |
| Tier 4 | 300 | $5,000+ |
| Tier 5 | 1,000 | $20,000+ |

Every response carries the live state in headers:

```
x-ratelimit-tier: Tier 2
x-ratelimit-limit: 15
x-ratelimit-remaining: 12
x-ratelimit-reset: 1746576042
```

Need a higher cap fast? Email [email protected] with your use case.

MCP server

POST https://pixserp.com/api/v1/mcp — Model Context Protocol endpoint over Streamable HTTP. Adds pixserp as a tool to any MCP-compatible client: Claude Desktop, Cursor, Zed, Claude Code, Cline, Continue. Your AI assistant calls the search tool whenever it needs live web results with citations.

The same pipeline that powers /chat/completions — same answers, same citations, same billing. No new endpoint to learn if you already use pixserp via the OpenAI SDK; just a different transport for clients that speak MCP instead of REST.

Install

Paste the snippet for your client, replace the API key, restart. pixserp appears as a tool named search.

Claude Desktop

```json
// ~/Library/Application Support/Claude/claude_desktop_config.json (macOS)
// or %APPDATA%\Claude\claude_desktop_config.json (Windows)
{
  "mcpServers": {
    "pixserp": {
      "command": "npx",
      "args": [
        "-y", "mcp-remote",
        "https://pixserp.com/api/v1/mcp",
        "--header", "Authorization: Bearer pxs_…"
      ]
    }
  }
}
```

Tool reference

Single tool exposed today. Schema is advertised via tools/list — clients pick it up automatically.

| Field | Type | Description |
|---|---|---|
| query | string (required) | What to search for. Natural-language question or topic. |
| model | enum (optional) | pixserp-fast (default) / pixserp-standard / pixserp-deep / pixserp-agent. |
| max_steps | integer (optional) | Only honored for pixserp-agent. Caps the research loop. Default 50, max 100. |

Tool result shape (returned in the tools/call response):

```json
{
  "content": [
    {
      "type": "text",
      "text": "NYC congestion pricing took effect Jan 5, 2025 [1]…\n\nSources:\n[1] MTA — https://new.mta.info/…"
    }
  ],
  "structuredContent": {
    "answer": "NYC congestion pricing took effect …",
    "citations": [
      { "id": "1", "kind": "news", "url": "https://reuters.com/…", "title": "…" }
    ],
    "model": "pixserp-fast",
    "cost_usd": 0.0015
  },
  "isError": false
}
```
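For clients speaking JSON-RPC directly, a tools/call request body might look like the following sketch (the method and params shapes follow the standard MCP specification; POST it to /api/v1/mcp with the same Authorization header):

```json
{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/call",
  "params": {
    "name": "search",
    "arguments": {
      "query": "NYC congestion pricing 2026 update",
      "model": "pixserp-fast"
    }
  }
}
```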

Auth, billing & rate limits

  • Same API key as the REST endpoints — Authorization: Bearer pxs_….
  • Each tools/call bills as a chat.completions request at the chosen model's price. MCP is transport, not a separate billable surface.
  • Rate-limit tiers apply identically — your RPS cap is shared across REST and MCP traffic.
  • Auth errors surface as JSON-RPC errors with custom codes: -32001 (auth), -32002 (rate-limited), -32003 (insufficient quota). Search failures come back as a tools/call result with isError: true so the calling LLM can reason about them.

Protocol version: 2025-06-18. Stateless transport (no session id) — every request is independent. notifications/initialized and notifications/cancelled are acknowledged silently.

Framework integrations

Anything that speaks OpenAI Chat Completions speaks pixserp — point its base_url / apiBase at us, set the model id, done. No wrapper SDK to install, no per-framework adapter to maintain.

LangChain

```python
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(
    model="pixserp-fast",
    api_key="pxs_…",
    base_url="https://pixserp.com/api/v1",
)

answer = llm.invoke("NYC congestion pricing 2026 update")
print(answer.content)
```

Other tools that work the same way: n8n & Make (use the OpenAI node, swap base URL), the Vercel AI SDK streamText helper, Mastra, Inkeep, Langfuse for tracing, etc.
