PinRAG
Cited RAG MCP: PDFs, GitHub, YouTube, Discord exports, local files, one shared index.
Overview
PinRAG is for when you want to learn about something and your materials are scattered—PDFs and ebooks, GitHub repos, YouTube videos, Discord discussions, and plain notes. You index those materials into one shared RAG index, then ask questions from Cursor, VS Code (GitHub Copilot), or any MCP-capable assistant and get answers with citations pointing back to pages, timestamps, files, or threads.
Under the hood it is Retrieval-Augmented Generation built with LangChain and exposed as an MCP (Model Context Protocol) server: add documents from the editor, query with natural language, list or remove what you indexed. Supported inputs include PDFs, local text files and directories, Discord exports, YouTube (transcript from URL, playlist, or ID), and GitHub repo URLs. For YouTube you can optionally add vision so on-screen code, diagrams, and UI text are merged with the transcript in the same chunks—see YouTube vision enrichment.
Features
- Multi-format indexing: PDF (.pdf), local files or directories, plain text (.txt), Discord export (.txt), YouTube (video or playlist URL, or video ID), GitHub repo (URL)
- Optional YouTube vision: Off by default. When enabled, runs a vision model (OpenAI, Anthropic, or OpenRouter native video) and merges structured on-screen context with the transcript so RAG chunks carry searchable code names, labels, and diagrams, not speech alone. OpenRouter mode avoids local ffmpeg/video download; openai/anthropic use scene keyframes and require pinrag[vision] + ffmpeg (see YouTube vision enrichment)
- RAG with citations: Answers cite source context: PDF page, YouTube timestamp, document name for plain text and Discord, file path for GitHub repos
- Document tags: Tag documents at index time (e.g. AMIGA, PI_PICO) for filtered search
- Metadata filtering: query_tool supports document_id, tag, document_type, PDF page_min / page_max, GitHub file_path, and response_style (thorough or concise)
- MCP tools: add_document_tool (files, dirs, or URLs), query_tool, list_documents_tool, remove_document_tool
- MCP resources: pinrag://documents (indexed documents) and pinrag://server-config (env vars and config); click in Cursor’s MCP panel to view
- MCP prompt: use_pinrag (parameter: request) for querying, indexing, listing, or removing documents
- Configurable LLM: OpenRouter (default, free openrouter/free router), OpenAI, Anthropic, or Cerebras Inference (OpenAI-compatible API); set via PINRAG_LLM_PROVIDER and PINRAG_LLM_MODEL in MCP env or your shell
- Local embeddings: Nomic (PINRAG_EMBEDDING_MODEL, default nomic-embed-text-v1.5); no API key; first run downloads model weights (~270 MB, cached)
- Retrieval & chunking options: Structure-aware chunking (on by default); optional FlashRank re-ranking, multi-query expansion, and parent-child chunks for supported document types (see Configuration)
- Observability: MCP tool notifications (ctx.log) plus optional LangSmith tracing
- Built with: LangChain, Chroma; optional OpenRouter, OpenAI, Anthropic, FlashRank
Installation
Add PinRAG as an MCP server in your editor. Install uv and ensure uvx is on your PATH—that runs PinRAG from PyPI without a prior pip install.
Cursor: add this under mcpServers in ~/.cursor/mcp.json:
{
  "mcpServers": {
    "pinrag": {
      "command": "uvx",
      "args": ["--refresh", "pinrag"],
      "env": {
        "OPENROUTER_API_KEY": "your-openrouter-api-key-here",
        "PINRAG_PERSIST_DIR": "/absolute/path/to/your/pinrag-data"
      }
    }
  }
}
VS Code (GitHub Copilot): run MCP: Open User Configuration from the Command Palette (or add .vscode/mcp.json in a workspace), then merge this shape—top-level key is servers:
{
  "servers": {
    "pinrag": {
      "command": "uvx",
      "args": ["--refresh", "pinrag"],
      "env": {
        "OPENROUTER_API_KEY": "your-openrouter-api-key-here",
        "PINRAG_PERSIST_DIR": "/absolute/path/to/your/pinrag-data"
      }
    }
  }
}
Quick Start
Configure MCP server
Put API keys and any PinRAG settings in the MCP entry’s env block. The server does not load .env files when the editor launches it.
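Because every setting has to live in the env block, it can help to generate the entry rather than hand-edit it. The following is a minimal sketch, not part of PinRAG: the helper names build_pinrag_entry and merge_entry are hypothetical, and the "other" server in the sample config is made up. It builds the entry from the Installation examples and merges it under either top-level key (mcpServers for Cursor, servers for VS Code) without touching existing servers.

```python
import json


def build_pinrag_entry(api_key: str, persist_dir: str) -> dict:
    """Return the server entry shown in the Installation examples."""
    return {
        "command": "uvx",
        "args": ["--refresh", "pinrag"],
        "env": {
            "OPENROUTER_API_KEY": api_key,
            "PINRAG_PERSIST_DIR": persist_dir,
        },
    }


def merge_entry(config: dict, top_level_key: str, entry: dict) -> dict:
    """Return a copy of config with the pinrag entry added; other servers are kept."""
    merged = dict(config)
    servers = dict(merged.get(top_level_key, {}))
    servers["pinrag"] = entry
    merged[top_level_key] = servers
    return merged


# Hypothetical existing Cursor config with one unrelated server.
existing = {"mcpServers": {"other": {"command": "other-server"}}}
entry = build_pinrag_entry(
    "your-openrouter-api-key-here", "/absolute/path/to/your/pinrag-data"
)

cursor_config = merge_entry(existing, "mcpServers", entry)  # Cursor shape
vscode_config = merge_entry({}, "servers", entry)           # VS Code shape
print(json.dumps(cursor_config, indent=2))
```

Writing the result back to ~/.cursor/mcp.json or the VS Code user configuration is left out on purpose; review the printed JSON before replacing a real config file.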
Use in chat
| Action | Tool |
|---|---|
| Index files, directories, or URLs | add_document_tool — required paths: list of local paths (PDFs, plain or DiscordChatExporter .txt, directories) or URLs (YouTube videos, playlist URLs, GitHub repos; bare YouTube video IDs allowed). Optional tags (one per path). For GitHub URLs only: branch, include_patterns, exclude_patterns. |
| List indexed documents | list_documents_tool — returns documents (IDs), total_chunks, and optional tag filter. document_details may include document_type, tags, page / message / segment counts, titles, GitHub files / file_count, and upload_timestamp when present in metadata. |
| Query with filters | query_tool — required query. Optional document_id, tag, document_type, page_min / page_max (PDF ranges), file_path (path inside a GitHub doc), response_style (thorough or concise; leave empty to use PINRAG_RESPONSE_STYLE). |
| Remove a document | remove_document_tool — required document_id (exact value from list_documents_tool). |
| View resources (read-only) | In the MCP panel, open Resources and choose pinrag://documents (indexed docs) or pinrag://server-config (effective config, including PINRAG_VERSION). |
Ask in chat: "Add /path/to/amiga-book.pdf with tag AMIGA", "Index https://youtu.be/xyz and ask what it says", or "Index https://github.com/owner/repo and ask about the codebase". The AI will invoke the tools for you. Citations show page numbers for PDFs, timestamps (e.g. t. 1:23) for YouTube, document names for plain text and Discord exports, and file paths for GitHub.
GitHub indexing
Index a repo with add_document_tool and a URL in paths, e.g. https://github.com/owner/repo, https://github.com/owner/repo/tree/branch, or github.com/owner/repo (scheme optional).
GitHub-only options: branch, include_patterns / exclude_patterns — defaults already favor common text and source files and skip bulky artifacts; use patterns when you need files outside that set. Files over PINRAG_GITHUB_MAX_FILE_BYTES (default 512 KiB) are skipped.
Auth: Set GITHUB_TOKEN in MCP env (or the shell) for private repos or fewer rate-limit hits on big indexes; small public runs often work without it. Use a classic or fine-grained PAT with repo read access; there is no OAuth in PinRAG.
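To make the accepted URL shapes concrete, here is a small sketch of how such URLs could be split into owner, repo, and optional branch. It is an illustration only: parse_github_url, RepoRef, and the regular expression are hypothetical and are not PinRAG's actual parser.

```python
import re
from typing import NamedTuple, Optional


class RepoRef(NamedTuple):
    owner: str
    repo: str
    branch: Optional[str]  # None when the URL has no /tree/<branch> part


# Scheme is optional; an optional /tree/<branch> suffix selects a branch.
_GITHUB_URL = re.compile(
    r"^(?:https?://)?github\.com/"
    r"(?P<owner>[^/]+)/(?P<repo>[^/]+?)(?:\.git)?"
    r"(?:/tree/(?P<branch>.+?))?/?$"
)


def parse_github_url(url: str) -> RepoRef:
    """Split a GitHub repo URL into owner, repo, and optional branch."""
    match = _GITHUB_URL.match(url.strip())
    if match is None:
        raise ValueError(f"not a GitHub repo URL: {url!r}")
    return RepoRef(match["owner"], match["repo"], match["branch"])


for candidate in (
    "https://github.com/owner/repo",
    "https://github.com/owner/repo/tree/branch",
    "github.com/owner/repo",
):
    print(parse_github_url(candidate))
```

When no branch appears in the URL and none is passed as branch, the Configuration table lists PINRAG_GITHUB_DEFAULT_BRANCH (default main) as the fallback.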
YouTube indexing and IP blocking
Transcript-heavy indexing—especially from cloud or high-volume IPs—may return errors like "YouTube is blocking requests from your IP". Point youtube-transcript-api at a proxy via MCP env (or your shell):
PINRAG_YT_PROXY_HTTP_URL=http://user:pass@proxy:80
PINRAG_YT_PROXY_HTTPS_URL=http://user:pass@proxy:80
PINRAG_YT_PROXY_* affects transcript fetches only; yt-dlp steps (titles, playlists) do not use it. Residential or rotating proxies usually fare better than raw datacenter IPs.
When some paths fail (e.g. a few videos in a playlist), add_document_tool includes fail_summary with counts keyed by blocked, disabled, missing_transcript, and other.
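The shape of fail_summary can be pictured as a counter over failure categories. The sketch below is an assumption-laden illustration: the four category names come from the text above, but classify_failure and its message substrings are hypothetical and do not reflect how PinRAG actually classifies errors.

```python
from collections import Counter

CATEGORIES = ("blocked", "disabled", "missing_transcript", "other")


def classify_failure(message: str) -> str:
    """Map an error message to a category (hypothetical substring rules)."""
    text = message.lower()
    if "blocking requests from your ip" in text:
        return "blocked"
    if "disabled" in text:
        return "disabled"
    if "no transcript" in text:
        return "missing_transcript"
    return "other"


def build_fail_summary(errors: list[str]) -> dict[str, int]:
    """Count failures per category, omitting categories with zero hits."""
    counts = Counter(classify_failure(error) for error in errors)
    return {name: counts[name] for name in CATEGORIES if counts[name]}


# Made-up error messages for three failed videos in a playlist.
errors = [
    "YouTube is blocking requests from your IP",
    "Transcripts are disabled for this video",
    "connection timed out",
]
summary = build_fail_summary(errors)
print(summary)
```

A high blocked count is the signal to configure the transcript proxy variables described above, while disabled and missing_transcript failures will not be fixed by a proxy.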
YouTube vision enrichment (optional)
Default indexing is transcript-only. Set PINRAG_YT_VISION_ENABLED=true to add vision captions for on-screen content, time-aligned with the transcript and chunked with metadata such as has_visual, frame_count, and visual_source.
PINRAG_YT_VISION_PROVIDER:
- openai (default) or anthropic: yt-dlp download → scene-based frames → one multimodal call per frame. Needs pinrag[vision], ffmpeg/ffprobe on PATH, and OPENAI_API_KEY or ANTHROPIC_API_KEY (install the extra in the same env as pinrag, e.g. uv sync --extra vision or pip install 'pinrag[vision]').
- openrouter: one OpenRouter request per video via video_url (default google/gemini-2.5-flash). OPENROUTER_API_KEY only; no download, ffmpeg, or pinrag[vision]. Choose a video-capable model if you override PINRAG_YT_VISION_MODEL.
Ops: Re-index after changing vision settings. For openai/anthropic, tune cost and timeouts with PINRAG_YT_VISION_MAX_FRAMES and optional PINRAG_YT_VISION_IMAGE_DETAIL=high (clearer small text, more tokens). MCP stdio: yt-dlp progress goes to stderr so stdout stays JSON-clean. Downloading video can breach YouTube ToS or local rules—your call. Docker: build with BUILD_WITH_VISION=1 for ffmpeg + pinrag[vision] (see Dockerfile).
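The provider rules above can be condensed into a small pre-flight check. This is a sketch under stated assumptions, not PinRAG code: vision_requirements is a hypothetical helper, the accepted truthy values are taken from the Configuration table, and case-insensitive matching is assumed.

```python
from typing import Mapping

TRUTHY = {"true", "1", "yes", "on"}  # values listed in the Configuration table
API_KEY_FOR = {
    "openai": "OPENAI_API_KEY",
    "anthropic": "ANTHROPIC_API_KEY",
    "openrouter": "OPENROUTER_API_KEY",
}


def vision_requirements(env: Mapping[str, str]) -> list[str]:
    """List what the chosen vision provider needs; empty when vision is off."""
    enabled = env.get("PINRAG_YT_VISION_ENABLED", "").strip().lower()
    if enabled not in TRUTHY:
        return []
    provider = env.get("PINRAG_YT_VISION_PROVIDER", "openai").strip().lower()
    if provider not in API_KEY_FOR:
        raise ValueError(f"unknown vision provider: {provider!r}")
    needs = [API_KEY_FOR[provider]]
    if provider != "openrouter":
        # The download + frame path needs the extra and local video tooling.
        needs += ["pinrag[vision]", "ffmpeg", "ffprobe"]
    return needs


print(vision_requirements({"PINRAG_YT_VISION_ENABLED": "true"}))
print(vision_requirements({
    "PINRAG_YT_VISION_ENABLED": "1",
    "PINRAG_YT_VISION_PROVIDER": "openrouter",
}))
```

Passing os.environ instead of a literal dict would check the current shell; remember that the MCP server only sees what is in its env block.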
Tips
- pinrag not found: MCP inherits your login PATH. After pipx or uv tool install, restart the editor and confirm which pinrag.
- PINRAG_PERSIST_DIR: Use a stable absolute path in MCP env (e.g. ~/.pinrag/chroma_db) so the vector store does not depend on the server process cwd.
- FlashRank: Install pinrag[rerank] in the same tool env (pipx install 'pinrag[rerank]' / uv tool install 'pinrag[rerank]'); tunables are in Configuration.
- YouTube vision: Follow YouTube vision enrichment for env and deps; re-index after changing vision settings.
- pinrag://server-config: MCP Resources → this URI for PINRAG_VERSION, effective LLM/embeddings/chunking, and API key set / not set status.
Configuration
The pinrag://server-config MCP resource prints PINRAG_VERSION (package version, not an env var you set) and effective values for the variables below, plus which API keys are set. Use the table as the full env reference.
Environment variables:
| Variable | Default | Description |
|---|---|---|
| LLM | | |
| Provider & model | | |
| PINRAG_LLM_PROVIDER | openrouter | openrouter, openai, anthropic, or cerebras |
| PINRAG_LLM_MODEL | (provider default) | When unset: OpenRouter openrouter/free, OpenAI gpt-4o-mini, Anthropic claude-haiku-4-5, Cerebras llama3.1-8b. Override with any model id (e.g. OpenRouter anthropic/claude-sonnet-4-6, Cerebras gpt-oss-120b). |
| OpenRouter | | |
| PINRAG_OPENROUTER_MODEL_FALLBACKS | (unset) | Comma-separated fallback model slugs sent as OpenRouter’s models list. The gateway tries the next slug when the primary (PINRAG_LLM_MODEL) fails (rate limits, downtime, etc.). Use extra free models here to stay zero-cost. Legacy alias: PINRAG_LLM_MODEL_FALLBACKS. |
| PINRAG_OPENROUTER_SORT | (unset) | Optional provider.sort: price, throughput, or latency. When unset, OpenRouter uses its default provider selection. Prefer leaving this unset if you set PINRAG_OPENROUTER_PROVIDER_ORDER to pin a specific backend (avoids conflicting routing signals). |
| PINRAG_OPENROUTER_PROVIDER_ORDER | (unset) | Comma-separated provider names for provider.order (tried in sequence). Example: Cerebras with PINRAG_LLM_MODEL=openai/gpt-oss-120b to prefer Cerebras-backed routing. Use exact labels from the model’s provider list on OpenRouter. |
| OPENROUTER_APP_URL | https://github.com/ndjordjevic/pinrag | App attribution (HTTP-Referer). Override with your site URL (see OpenRouter app attribution). PinRAG copies this into OPENROUTER_HTTP_REFERER for the OpenRouter Python SDK. |
| OPENROUTER_APP_TITLE | PinRAG | App title (X-Title). Override to label usage in the OpenRouter dashboard. PinRAG copies this into OPENROUTER_X_OPEN_ROUTER_TITLE for the SDK. |
| API keys | | |
| OPENROUTER_API_KEY | (required when using OpenRouter for LLM, evaluators, or YouTube vision) | Required when PINRAG_LLM_PROVIDER=openrouter, PINRAG_EVALUATOR_PROVIDER=openrouter, or YouTube vision with PINRAG_YT_VISION_PROVIDER=openrouter. |
| OPENAI_API_KEY | (required for OpenAI LLM or OpenAI YouTube vision) | Required when PINRAG_LLM_PROVIDER=openai, or when PINRAG_YT_VISION_ENABLED=true and PINRAG_YT_VISION_PROVIDER=openai. |
| OPENAI_BASE_URL | (optional) | Override OpenAI API base URL (e.g. https://openrouter.ai/api/v1 with OPENAI_API_KEY set to your OpenRouter key for vision or other OpenAI-compatible calls). |
| CEREBRAS_API_KEY | (required for Cerebras LLM) | Required when PINRAG_LLM_PROVIDER=cerebras. Get a key from the Cerebras cloud console. |
| PINRAG_CEREBRAS_BASE_URL | https://api.cerebras.ai/v1 | Override the OpenAI-compatible base URL for Cerebras (e.g. dedicated inference endpoints). |
| ANTHROPIC_API_KEY | (required for Anthropic LLM or Anthropic YouTube vision) | Required when PINRAG_LLM_PROVIDER=anthropic, PINRAG_EVALUATOR_PROVIDER=anthropic, or YouTube vision with PINRAG_YT_VISION_PROVIDER=anthropic. |
| Embeddings | | |
| PINRAG_EMBEDDING_MODEL | nomic-embed-text-v1.5 | Local Nomic model id (via langchain-nomic). First run downloads weights (~270 MB, cached). No API key. |
| Storage & chunking | | |
| PINRAG_PERSIST_DIR | chroma_db | Chroma vector store directory (default is relative to the server process cwd unless you set an absolute path; e.g. ~/.pinrag/chroma_db for a fixed location) |
| PINRAG_CHUNK_SIZE | 1000 | Text chunk size (chars) |
| PINRAG_CHUNK_OVERLAP | 200 | Chunk overlap (chars) |
| PINRAG_STRUCTURE_AWARE_CHUNKING | true | Apply structure-aware chunking heuristics for code/table boundaries |
| PINRAG_COLLECTION_NAME | pinrag | Chroma collection name. Single shared collection by default. |
| ANONYMIZED_TELEMETRY | False via setdefault when unset | Chroma telemetry flag. PinRAG’s MCP logging setup calls os.environ.setdefault("ANONYMIZED_TELEMETRY", "False") so empty/unset behaves like opt-out; set true in env if you want Chroma’s telemetry on. |
| Retrieval | | |
| PINRAG_RETRIEVE_K | 20 | Retrieval pool size when re-ranking is off. When re-ranking is on, PINRAG_RERANK_RETRIEVE_K falls back to this value if unset, then results are cut to PINRAG_RERANK_TOP_N. |
| Parent-child retrieval | | |
| PINRAG_USE_PARENT_CHILD | false | Set to true to embed small chunks and return larger parent chunks (supported for PDF, GitHub, YouTube, and Discord indexing, but not plain .txt). Requires re-indexing. |
| PINRAG_PARENT_CHUNK_SIZE | 2000 | Parent chunk size (chars) when PINRAG_USE_PARENT_CHILD=true. |
| PINRAG_CHILD_CHUNK_SIZE | 800 | Child chunk size (chars) when PINRAG_USE_PARENT_CHILD=true. |
| Re-ranking | | |
| PINRAG_USE_RERANK | false | Set to true to enable FlashRank re-ranking: fetch more chunks, re-score locally, pass top N to the LLM. Requires pip install pinrag[rerank] and no API key. |
| PINRAG_RERANK_RETRIEVE_K | (inherits PINRAG_RETRIEVE_K) | Chunks to fetch before FlashRank when PINRAG_USE_RERANK=true. If unset, equals PINRAG_RETRIEVE_K (not a separate hard-coded 20). |
| PINRAG_RERANK_TOP_N | 10 | Chunks passed to the LLM after re-ranking when PINRAG_USE_RERANK=true (capped by the pre-rerank fetch size). |
| Multi-query | | |
| PINRAG_USE_MULTI_QUERY | false | Set to true to generate alternative phrasings of the user query via LLM, retrieve per variant, and merge (unique union). Improves recall for terse or ambiguous queries. |
| PINRAG_MULTI_QUERY_COUNT | 4 | Number of alternative queries to generate (default 4, max 10). The original query is still included in retrieval when merging. |
| Response style | | |
| PINRAG_RESPONSE_STYLE | thorough | RAG answer style: thorough (detailed) or concise. Used by evaluation target and as default when MCP query omits response_style. |
| MCP notifications | | |
| PINRAG_VERBOSE_LOGGING | false | Set true to emit detailed per-phase MCP notifications for tool/resource execution (format detection, transcript load, vision path/steps, chunk upserts). Default keeps concise start/ok/error lifecycle logs. |
| GitHub indexing | | |
| GITHUB_TOKEN | (optional) | Personal access token for GitHub API. Required for private repos; increases rate limits for public repos. |
| PINRAG_GITHUB_MAX_FILE_BYTES | 524288 (512 KiB) | Skip files larger than this when indexing GitHub repos. |
| PINRAG_GITHUB_DEFAULT_BRANCH | main | Default branch when not specified in the GitHub URL. |
| Plain text indexing | | |
| PINRAG_PLAINTEXT_MAX_FILE_BYTES | 524288 (512 KiB) | Skip plain .txt files larger than this when indexing. |
| YouTube transcript proxy | | |
| PINRAG_YT_PROXY_HTTP_URL | (none) | HTTP proxy URL for transcript fetches (e.g. http://user:pass@proxy:80). Use when YouTube blocks your IP. |
| PINRAG_YT_PROXY_HTTPS_URL | (none) | HTTPS proxy URL for transcript fetches. Same as HTTP when using a generic proxy. |
| YouTube vision (optional) | | |
| PINRAG_YT_VISION_ENABLED | false | true / 1 / yes / on enables on-screen enrichment for YouTube. openai / anthropic: needs pinrag[vision], ffmpeg on PATH, and the matching API key. openrouter: needs OPENROUTER_API_KEY only (native video_url path; no local download). |
| PINRAG_YT_VISION_PROVIDER | openai | openai, anthropic, or openrouter. Independent of PINRAG_LLM_PROVIDER (RAG LLM and vision can use different providers). Legacy alias: PINRAG_VISION_PROVIDER. |
| PINRAG_YT_VISION_MODEL | (per provider) | If unset: OpenAI gpt-4o-mini, Anthropic claude-sonnet-4-6, OpenRouter google/gemini-2.5-flash. Use a vision-capable id. Legacy alias: PINRAG_VISION_MODEL. |
| PINRAG_YT_VISION_MAX_FRAMES | 8 | Download + frame path only (openai / anthropic): cap analyzed keyframes after scene detect. Ignored for openrouter (single video request). |
| PINRAG_YT_VISION_MIN_SCENE_SCORE | 27.0 | Download + frame path only: PySceneDetect AdaptiveDetector threshold (higher → fewer cuts). Ignored for openrouter. |
| PINRAG_YT_VISION_IMAGE_DETAIL | low | OpenAI frame path only: low, high, or auto for image_url.detail. Ignored for anthropic (full frames) and openrouter (video_url). |
| LangSmith (optional) | | |
| LANGSMITH_TRACING | (off) | Set true to send traces to LangSmith. Requires LANGSMITH_API_KEY. |
| LANGSMITH_API_KEY | (none) | API key from LangSmith Settings → API keys. |
| LANGSMITH_PROJECT | (LangChain default) | Project name for traces (e.g. pinrag). |
| LANGSMITH_ENDPOINT | US API (implicit) | EU workspaces: set https://eu.api.smith.langchain.com so traces land in your EU project. If your account uses eu.smith.langchain.com in the browser, you need this. US-region workspaces can omit it (default API host). |
| Evaluators (LLM-as-judge) | | |
| PINRAG_EVALUATOR_PROVIDER | openai | openai, anthropic, or openrouter: which LLM runs LLM-as-judge graders. Used only during evaluation runs (LangSmith experiments). |
| PINRAG_EVALUATOR_MODEL | (provider default) | Model for correctness grading (e.g. gpt-4o, claude-sonnet-4-6, openrouter/free when evaluator provider is OpenRouter). With OpenRouter, the default free router may rotate models, and graders use a strict JSON schema; set this to a specific free slug from openrouter.ai/models if you need stable structured output. The OpenRouter routing env vars above also apply to graders when PINRAG_EVALUATOR_PROVIDER=openrouter. |
| PINRAG_EVALUATOR_MODEL_CONTEXT | (provider default) | Model for groundedness grading (large retrieved context; e.g. gpt-4o-mini, claude-haiku-4-5, openrouter/free when evaluator provider is OpenRouter). Same OpenRouter note as PINRAG_EVALUATOR_MODEL. When the evaluator provider is OpenRouter, PINRAG_OPENROUTER_MODEL_FALLBACKS, PINRAG_OPENROUTER_SORT, and PINRAG_OPENROUTER_PROVIDER_ORDER apply to the grader client. |
Re-indexing when changing embedding model: Changing PINRAG_EMBEDDING_MODEL requires re-indexing; vector dimensions must match the model used at index time (including indexes created under an older default embedding).
Re-indexing when enabling parent-child: Setting PINRAG_USE_PARENT_CHILD=true requires re-indexing; the new structure (child chunks in Chroma, parent chunks in docstore) is created only during indexing for supported document types (not plain .txt).
Re-indexing when toggling YouTube vision: Turning PINRAG_YT_VISION_ENABLED on or off, changing PINRAG_YT_VISION_PROVIDER, or changing vision model / PINRAG_YT_VISION_IMAGE_DETAIL / frame limits requires re-indexing affected YouTube documents for chunks to reflect the new behavior.
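The retrieval and re-ranking rows in the table interact: PINRAG_RERANK_RETRIEVE_K inherits PINRAG_RETRIEVE_K when unset, and PINRAG_RERANK_TOP_N is capped by the fetch size. A minimal sketch of that resolution follows. resolve_retrieval is a hypothetical helper, not PinRAG's config code, and it assumes that with re-ranking off the whole retrieval pool is passed to the LLM.

```python
from typing import Mapping


def resolve_retrieval(env: Mapping[str, str]) -> dict:
    """Resolve how many chunks are fetched and how many reach the LLM."""
    retrieve_k = int(env.get("PINRAG_RETRIEVE_K", "20"))
    use_rerank = env.get("PINRAG_USE_RERANK", "false").strip().lower() == "true"
    if not use_rerank:
        # Assumption: without re-ranking, the whole pool goes to the LLM.
        return {"fetch": retrieve_k, "to_llm": retrieve_k}
    # Unset (or empty) PINRAG_RERANK_RETRIEVE_K inherits PINRAG_RETRIEVE_K.
    fetch = int(env.get("PINRAG_RERANK_RETRIEVE_K") or retrieve_k)
    # TOP_N cannot exceed the number of chunks fetched before re-ranking.
    top_n = min(int(env.get("PINRAG_RERANK_TOP_N", "10")), fetch)
    return {"fetch": fetch, "to_llm": top_n}


print(resolve_retrieval({}))
print(resolve_retrieval({"PINRAG_USE_RERANK": "true", "PINRAG_RETRIEVE_K": "30"}))
print(resolve_retrieval({"PINRAG_USE_RERANK": "true", "PINRAG_RERANK_RETRIEVE_K": "5"}))
```

The pinrag://server-config resource remains the authoritative view of the effective values; the sketch only illustrates the fallback order described in the table.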
Monitoring & Observability
For query performance metrics (latency, timing, token usage) and debugging, use LangSmith. Set LANGSMITH_TRACING=true and LANGSMITH_API_KEY in MCP env or your shell; optionally set LANGSMITH_PROJECT (see table above). If your LangSmith workspace is in the EU region (you use eu.smith.langchain.com in the browser), you must also set LANGSMITH_ENDPOINT=https://eu.api.smith.langchain.com; without it, traces may not show up in the EU deployment. US-region accounts use the default API host and do not need LANGSMITH_ENDPOINT. See notes/langsmith-setup.md for more detail.
For MCP-side introspection, set PINRAG_VERBOSE_LOGGING=true to surface detailed phase events in notifications/message (e.g., YouTube transcript load, whether vision runs, and chunk upsert milestones).
Multiple providers and collections
Vector dimension is fixed per Chroma collection and must match the PINRAG_EMBEDDING_MODEL used when chunks were written. The default id nomic-embed-text-v1.5 is a 768-d Nomic model; another PINRAG_EMBEDDING_MODEL value may imply a different size—check that model’s documentation.
- Default: PINRAG_COLLECTION_NAME defaults to pinrag. Do not change PINRAG_EMBEDDING_MODEL for an existing collection without re-indexing into a new collection (or wiping the old one); otherwise adds/queries can fail with embedding dimension errors.
- Per-model collections: Use a stable pair of PINRAG_EMBEDDING_MODEL + PINRAG_COLLECTION_NAME (+ PINRAG_PERSIST_DIR if you isolate stores) for each index. To query a collection, set the same env values you used when indexing it. You can index the same sources again under another pair (change env, restart MCP if needed, run add_document_tool).
- MCP tools: server.py wires every tool to config.get_persist_dir() and config.get_collection_name() only; there is no per-call collection or persist argument (tools.query / add_files accept optional collection in Python, but MCP does not expose that).
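Since the collection is selected purely by environment, a quick way to catch a mismatch is to compare the env used at index time with the env used at query time. The sketch below is illustrative only: profile_mismatches is a hypothetical helper, and "other-embedding-model" and "pinrag-other" are made-up values.

```python
from typing import Mapping

# Defaults taken from the Configuration table.
DEFAULTS = {
    "PINRAG_EMBEDDING_MODEL": "nomic-embed-text-v1.5",
    "PINRAG_COLLECTION_NAME": "pinrag",
    "PINRAG_PERSIST_DIR": "chroma_db",
}


def profile_mismatches(
    index_env: Mapping[str, str], query_env: Mapping[str, str]
) -> list[str]:
    """Return the variables whose effective values differ between two envs."""
    return [
        name
        for name, default in DEFAULTS.items()
        if index_env.get(name, default) != query_env.get(name, default)
    ]


indexed_with = {
    "PINRAG_EMBEDDING_MODEL": "other-embedding-model",  # made-up model id
    "PINRAG_COLLECTION_NAME": "pinrag-other",           # made-up collection
}
querying_with = {"PINRAG_COLLECTION_NAME": "pinrag-other"}

print(profile_mismatches(indexed_with, querying_with))
```

An empty list means both envs point at the same model, collection, and store; any name in the list is a variable to align before querying.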
MCP reference
Tools, prompt, and read-only resources from the pinrag MCP server (FastMCP("PinRAG")). Tool results are JSON objects that always include _server_version; with PINRAG_VERBOSE_LOGGING=true they may include _verbose_log.
add_document_tool returns indexed, failed, counts, persist_directory, collection_name, and fail_summary when any path failed. query_tool returns answer and sources (each entry: document_id, page—PDF page, often 0 for non-PDF—plus optional start in seconds for YouTube and file_path for GitHub).
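As an illustration of consuming a query_tool result, the sketch below renders each sources entry as a short citation. It is not PinRAG's own formatter: format_citation and timestamp_label are hypothetical, the sample result contains made-up values, and it assumes PDF pages are numbered from 1 so that page 0 means "no page".

```python
def timestamp_label(start_seconds: float) -> str:
    """Render seconds as a t. M:SS label (minutes are not split into hours)."""
    total = int(start_seconds)
    return f"t. {total // 60}:{total % 60:02d}"


def format_citation(source: dict) -> str:
    """Pick the most specific locator present in a sources entry."""
    doc = source["document_id"]
    if source.get("file_path"):          # GitHub: path inside the repo
        return f"{doc}: {source['file_path']}"
    if source.get("start") is not None:  # YouTube: offset in seconds
        return f"{doc}, {timestamp_label(source['start'])}"
    if source.get("page"):               # PDF: assumed 1-based page number
        return f"{doc}, p. {source['page']}"
    return doc                           # plain text / Discord: name only


# Made-up result in the documented shape (answer, sources, _server_version).
result = {
    "answer": "...",
    "sources": [
        {"document_id": "RP-008276-DS-1-getting-started-with-pico.pdf", "page": 16},
        {"document_id": "xyz", "page": 0, "start": 83},
        {"document_id": "owner/repo", "page": 0, "file_path": "src/main.py"},
        {"document_id": "notes.txt", "page": 0},
    ],
    "_server_version": "0.0.0",
}

citations = [format_citation(source) for source in result["sources"]]
print("\n".join(citations))
```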
query_tool
Natural-language question; optional filters narrow retrieval ("" / omit when unused):
| Parameter | Description |
|---|---|
| query | Question (required) |
| document_id | Limit to this document (id from list_documents_tool, e.g. PDF file name or YouTube video id) |
| page_min, page_max | Inclusive PDF page range (must pass both; one page: same value twice) |
| tag | Only chunks with this tag |
| document_type | pdf, youtube, discord, github, or plaintext |
| file_path | Path inside a doc (GitHub; see document_details / files from list_documents_tool) |
| response_style | thorough or concise. Empty (the schema default) or any other string → resolved via PINRAG_RESPONSE_STYLE (see server.py: only those two literals override env). |
Filters can be combined. The sources list uses page for PDFs, start (seconds) for YouTube, file_path for GitHub; answers may show t. M:SS labels derived from start.
Example: "What is OpenOCD? In the Pico doc, pages 16–17 only" → query_tool(query="What is OpenOCD?", document_id="RP-008276-DS-1-getting-started-with-pico.pdf", page_min=16, page_max=17).
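The parameter rules above (both page bounds or neither, a fixed set of document types, empty filters omitted) can be checked before a call is made. The following sketch assumes nothing about PinRAG internals: build_query_args is a hypothetical client-side helper that only assembles the argument dictionary.

```python
from typing import Optional

DOCUMENT_TYPES = {"pdf", "youtube", "discord", "github", "plaintext"}


def build_query_args(
    query: str,
    *,
    document_id: str = "",
    tag: str = "",
    document_type: str = "",
    page_min: Optional[int] = None,
    page_max: Optional[int] = None,
    file_path: str = "",
    response_style: str = "",
) -> dict:
    """Validate filters and return only the arguments that are actually set."""
    if not query.strip():
        raise ValueError("query is required")
    if (page_min is None) != (page_max is None):
        raise ValueError("page_min and page_max must be passed together")
    if document_type and document_type not in DOCUMENT_TYPES:
        raise ValueError(f"unknown document_type: {document_type!r}")
    optional = {
        "document_id": document_id,
        "tag": tag,
        "document_type": document_type,
        "page_min": page_min,
        "page_max": page_max,
        "file_path": file_path,
        "response_style": response_style,
    }
    args = {"query": query}
    # Drop unused filters: empty strings and None are treated as "not set".
    args.update({k: v for k, v in optional.items() if v not in ("", None)})
    return args


args = build_query_args(
    "What is OpenOCD?",
    document_id="RP-008276-DS-1-getting-started-with-pico.pdf",
    page_min=16,
    page_max=17,
)
print(args)
```

For a single page, pass the same value for page_min and page_max, as noted in the table.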
add_document_tool
Index locals (PDF, plain or Discord .txt, directories), YouTube (video URL, playlist URL, or bare id), or GitHub URLs (scheme optional). paths batches work items; one failed path does not roll back others. Persists to PINRAG_PERSIST_DIR / PINRAG_COLLECTION_NAME only (no MCP parameters for those).
| Parameter | Description |
|---|---|
| paths | Required list: files, dirs, URLs, or video ids |
| tags | Optional; one per paths entry, same order |
| branch | GitHub only: branch override |
| include_patterns | GitHub only: glob include list |
| exclude_patterns | GitHub only: glob exclude list |
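Because tags are positional (one per paths entry, same order), a length mismatch is an easy mistake. The sketch below pairs them up front. build_add_args is a hypothetical client-side helper, the tag VIDEO is made up, and whether the server itself rejects mismatched lengths is an assumption.

```python
from typing import Optional, Sequence


def build_add_args(
    paths: Sequence[str], tags: Optional[Sequence[str]] = None
) -> dict:
    """Assemble add_document_tool arguments, checking tags line up with paths."""
    if not paths:
        raise ValueError("paths must contain at least one entry")
    args: dict = {"paths": list(paths)}
    if tags is not None:
        if len(tags) != len(paths):
            raise ValueError(
                f"expected {len(paths)} tags (one per path), got {len(tags)}"
            )
        args["tags"] = list(tags)
    return args


add_args = build_add_args(
    ["/path/to/amiga-book.pdf", "https://youtu.be/xyz"],
    tags=["AMIGA", "VIDEO"],
)
# Show which tag applies to which path.
for path, tag in zip(add_args["paths"], add_args["tags"]):
    print(f"{tag}: {path}")
```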
list_documents_tool
Returns documents, total_chunks, persist_directory, collection_name, and document_details (tags, titles, counts, upload_timestamp when present, GitHub files / file_count, etc.). If tag is set, total_chunks counts only chunks with that tag (not the whole collection).
| Parameter | Description |
|---|---|
| tag | Optional: only docs that have this tag |
remove_document_tool
Deletes every chunk for document_id (exact id from list_documents_tool).
| Parameter | Description |
|---|---|
| document_id | Required |
MCP prompt: use_pinrag
Built-in routing blurb: request is interpolated as the first line; the rest lists when to use each tool and their parameters (matches use_pinrag in server.py). Listed wherever the client exposes MCP prompts (e.g. Cursor).
| Parameter | Description |
|---|---|
| request | Optional user goal (may be empty) |
MCP resources
| Resource | Description |
|---|---|
| pinrag://documents | Plain-text listing for the server’s configured collection (from format_documents_list) |
| pinrag://server-config | Printable dump of effective env/config (includes PINRAG_VERSION, key operational vars, API key presence) |
Running tests
From the repo root, install dev extras (e.g. uv sync --extra dev).
- Fast (no integration): uv run pytest tests/ -q -m "not integration". Skips anything marked integration in pyproject.toml (network, API keys, optional assets, MCP stdio). Any test that uses the sample_pdf_path fixture gets that marker automatically in tests/conftest.py, so the sample PDF under data/pdfs/ is only needed for the full run.
- Full suite: uv run pytest tests/ -q
  - Secrets: For MCP stdio tests, the subprocess env starts from your shell, then any missing OPENAI_API_KEY / ANTHROPIC_API_KEY are filled from tests/.mcp_stdio_integration.env (copy from tests/mcp_stdio_integration.env.example; only those keys are read from the file, and already-set env wins). Override the file with PINRAG_MCP_ITEST_ENV_FILE. After merge, test_mcp_stdio_repo.py requires OPENAI_API_KEY, and PINRAG_LLM_PROVIDER must pass its credential check (e.g. export OPENROUTER_API_KEY when using OpenRouter). test_mcp_stdio_pypi.py also requires a working OpenAI key.
  - PDF / stdio: Default PDF is data/pdfs/sample-text.pdf (not in git). Override with PINRAG_MCP_ITEST_PDF / PINRAG_MCP_ITEST_QUERY. Stdio tests need uv on PATH, or set PINRAG_TEST_UV to the binary path.
  - PyPI MCP test: Marked pypi_mcp; skip with -m "not pypi_mcp" or PINRAG_MCP_ITEST_SKIP_PYPI=1. Pin the install with PINRAG_MCP_ITEST_PYPI_SPEC (default pinrag = latest on PyPI).
  - Verbose: --log-cli-level=INFO.
The data/ directory is gitignored—create data/pdfs/ (and similar) locally; nothing under data/ is committed.
License
MIT License. Full text in LICENSE.