RTFM

Open-source multi-domain retrieval layer for AI agents — FTS5 + semantic search, 10 parsers, knowledge graph, Obsidian integration, MCP native.


Retrieve The Forgotten Memory

The open retrieval layer your AI agent was missing

Index everything in your project — code, docs, PDFs, legal texts, research, data — and your agent finds the right context instantly. No hallucinations. No cloud. No API costs.

Free · Local · Open Source · MIT



The problem

Your AI agent is flying blind.

It greps through thousands of files, misses the doc that answers the question, invents modules that don't exist, forgets what you decided last session. The bigger the project, the worse it gets. You've added a smarter model. It didn't help. Because the bottleneck isn't intelligence — it's retrieval.

Code indexers (Augment, Sourcegraph, Cursor) only see code. But your project isn't just code. It's specs, PRs, architecture decisions, research papers, PDFs, regulations, vault notes — the context your agent needs to stop guessing.

Why I built this

I was writing a French tax article (~50 pages of regulatory text, cross-references between code articles, case law, administrative doctrine). Claude Code kept grepping the same directories in loops, running out of context, and producing confidently wrong citations. I'd added more memory, better prompts, a smarter model. None of it worked, because the agent wasn't reasoning badly — it just couldn't find the right paragraph in a 2,000-file legal corpus. So I stopped trying to make the model smarter and built the layer it was missing. That's RTFM.

The solution

RTFM indexes everything. One command, one SQLite file, one retrieval layer your agent queries before grepping.

pip install rtfm-ai && cd your-project && rtfm init

30 seconds. Claude Code now searches your indexed knowledge base — code and docs and PDFs and whatever else you drop in — with full-text, semantic, or hybrid search. The agent sees 300 tokens of metadata first, then expands only what's relevant. Progressive disclosure instead of context dumps.

Free. Runs locally. No API keys. No cloud. Your data stays yours.

What it looks like

$ rtfm search "authentication flow" --limit 3
[1] src/auth/handlers.py > authenticate_user (p.2)    score 9.12
    src/auth/handlers.py:147  42 lines
[2] docs/architecture/auth.md > SSO flow (p.1)        score 7.84
    docs/architecture/auth.md:1   23 lines
[3] docs/ADR/0007-oauth.md > Decision (p.1)           score 6.90
    docs/ADR/0007-oauth.md:12  18 lines

Three results, ~300 tokens. The agent decides what to read next with rtfm_expand(source, target_section) — not a context dump, a conversation.
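The search → expand conversation can be sketched in a few lines. This is an illustrative toy, not RTFM's internals: `INDEX`, `search`, and `expand` are local stand-ins that only mimic the shape of the `rtfm_search` / `rtfm_expand` tools (metadata first, content on demand).

```python
# Toy stand-in for the index: source -> section -> full content.
INDEX = {
    "src/auth/handlers.py": {
        "authenticate_user": "def authenticate_user(request): ...",
    },
    "docs/architecture/auth.md": {
        "SSO flow": "1. Redirect to IdP\n2. Validate assertion ...",
    },
}

def search(query: str, limit: int = 3) -> list[dict]:
    """Step 1: return metadata only (source + section), never content."""
    hits = []
    for source, sections in INDEX.items():
        for section in sections:
            if any(w in (source + section).lower() for w in query.lower().split()):
                hits.append({"source": source, "section": section})
    return hits[:limit]

def expand(source: str, target_section: str) -> str:
    """Step 2: fetch full content for the one section the agent chose."""
    return INDEX[source][target_section]

hits = search("auth")                 # cheap metadata, a few hundred tokens
chosen = hits[0]                      # the agent picks what looks relevant
content = expand(chosen["source"], chosen["section"])  # expand only that
```

The point of the two-step shape: the expensive payload is only fetched for the result the agent actually wants.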


Quick start

Recommended — Claude Code plugin

In Claude Code (CLI or Desktop Code tab):

/plugin marketplace add roomi-fields/rtfm
/plugin install rtfm@rtfm

That's it. The plugin auto-initializes each project on first use:

  • Creates .rtfm/library.db (one SQLite file)
  • Injects search instructions into CLAUDE.md
  • Pre-grants permission for the MCP tools (no prompt every search)
  • Indexes the project on the first prompt, re-indexes incrementally on every prompt

No pip install required. Pure Python, runs on Linux / macOS / Windows / WSL with Python 3.10+ already on PATH. The plugin bundles its own MCP server (no mcp SDK dep) and resolves python3 / python / py automatically.

Then say to Claude: "Find the authentication flow" — it uses rtfm_search instead of grepping.

Optional extras (semantic search, PDF parsing)

The core plugin is dependency-free. Heavier optional extras (embedding model, PDF parsers) install on demand into an isolated venv inside the plugin's data directory — no pollution of your system Python, no PEP 668 conflicts:

/rtfm:install-embeddings    # FastEmbed ONNX (~85 MB), semantic + hybrid search
/rtfm:install-pdf           # pdftext only (~50 MB), fast text extraction
/rtfm:install-pdf-full      # + marker-pdf + CPU-only torch (~1.5 GB), complex layouts

The pdf-full install uses PyTorch's CPU-only index (no CUDA, no GPU needed) to stay around 1.5 GB instead of 5 GB.

Restart Claude Code after install for the extras to be picked up.

Manual install (Cursor, Codex, Claude Desktop chat, other MCP clients)

For clients without Claude Code's plugin system:

pip install rtfm-ai
cd /path/to/your-project
rtfm init

Then point your MCP client at rtfm-serve (the entry exposed by the pip package). Optional extras via pip install rtfm-ai[embeddings,pdf].
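For clients that read the common `mcpServers` JSON convention (Claude Desktop, Cursor), the entry could look like this — the config file location and exact schema depend on your client, so treat this as the shape, not gospel:

```json
{
  "mcpServers": {
    "rtfm": {
      "command": "rtfm-serve"
    }
  }
}
```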


How it compares

| | RTFM | Augment CE | Sourcegraph | Code-Index-MCP | MemPalace |
|---|---|---|---|---|---|
| Code indexing | ✅ (AST-aware) | Shallow (char-chunk) | | | |
| Docs, specs, markdown | ✅ (header-parsed) | Partial | Limited | | Verbatim chunks |
| Legal / regulatory | ✅ (XML, BOFiP) | | | | |
| Research (LaTeX, PDF) | ✅ | | | | |
| Custom parsers | ✅ (~50 lines) | | | | |
| Knowledge graph | ✅ (file/code links) | | Partial | | Entity graph (people) |
| File version history | ✅ (unlimited) | ❌ (purge-and-replace) | | | |
| MCP native | ✅ | | | ✅ | ✅ |
| Runs locally | ✅ | Cloud | Enterprise | ✅ | ✅ |
| Open source | MIT | | Partial | MIT | |
| Price | Free | $20–200/mo | $$$/mo | Free | Free |

RTFM is the only open-source option that indexes multi-domain content with structural parsing, a code-level knowledge graph, and unlimited per-file history. That's the niche.

Different from MemPalace specifically: MemPalace is an entity-level memory for conversations (who/project/decision triples in SQLite, plus verbatim chunks in ChromaDB). RTFM is a retrieval layer for artefacts — parsed by format, linked at the file level, versioned over time. The two are stackable, not competing.

For a deeper breakdown of the design choices behind any RAG (chunking, retrieval, augmentation, integration, freshness, storage), see RAG Fundamentals — the 6 axes →


Memory that survives sessions

Between sessions, most agents forget. RTFM indexes Claude Code's own memory files across every project on your machine, with full version history.

rtfm memory                    # Manual snapshot
rtfm memory --install-hook     # Auto-snapshot on every SessionEnd
  • Cross-project index — one DB at ~/.rtfm/memory.db sees every ~/.claude/projects/*/memory/ directory on your machine. Ask rtfm_search("OAuth auth decisions") and get hits from all 18 of your projects.
  • Unlimited version history — every change to a memory file is snapshotted (no prune). rtfm_history <slug> returns the full evolution.
  • Auto-snapshot on SessionEnd — one command installs a global Claude Code hook. Every session you close captures a new snapshot.
  • Curated, not verbatim — RTFM indexes the notes the agent already curated itself during the session (small, structured, signal-dense). Different philosophy from MemPalace, which indexes the full conversation transcripts in ChromaDB (large, noisy, needs aggressive semantic filtering).

Obsidian vault mode

RTFM is the retrieval layer for the Karpathy LLM Wiki pattern. Karpathy himself wrote: "at small scale the index file is enough, but as the wiki grows you want proper search." This is proper search.

cd /path/to/your-obsidian-vault
rtfm vault
  • Detects .obsidian/, proposes a folder → corpus mapping
  • Resolves [[wikilinks]] following Obsidian rules → stored as graph edges
  • Generates _rtfm/ with Obsidian-native navigation (index, graph with Mermaid, hubs, orphans, Dataview frontmatter)
  • Tested on a 1,700-note research vault
_rtfm/
├── index.md      # Hub: corpus list, top connected documents
├── graph.md      # Hub documents, orphans, broken links, Mermaid
├── recent.md     # Recently modified files
└── corpus/       # Per-corpus indexes
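The wikilink resolution above can be sketched with a regex — a deliberately simplified illustration, not RTFM's actual resolver (Obsidian's real rules also cover shortest-unique-path matching; this sketch only strips `#heading` anchors and `|alias` labels):

```python
import re

# [[target]], [[target#heading]], [[target|alias]] -> capture just `target`
WIKILINK = re.compile(r"\[\[([^\]|#]+)(?:#[^\]|]*)?(?:\|[^\]]*)?\]\]")

def extract_edges(source_note: str, text: str) -> list[tuple[str, str]]:
    """Return (source, target) graph edges for every [[wikilink]] in text."""
    return [(source_note, m.group(1).strip()) for m in WIKILINK.finditer(text)]

edges = extract_edges(
    "daily/2024-01-05",
    "See [[Projects/RTFM|the project]] and [[Ideas#Search]].",
)
# → [("daily/2024-01-05", "Projects/RTFM"), ("daily/2024-01-05", "Ideas")]
```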

The LLM still writes your wiki. RTFM handles the retrieval that index.md can't scale to.

Full Obsidian guide →


What I measured

I ran two kinds of benchmarks. The honest picture is nuanced — retrieval helps most on tasks that are actually solvable and where the agent is spending time looking for things.

Document-heavy task: French tax article generation (B10)

Writing a ~50-page regulated article from a corpus of legal code, case law, and administrative doctrine. Same agent (Claude Code + Sonnet 4), same prompt, eight configurations tested.

| Configuration | Duration | Cost | Tokens |
|---|---|---|---|
| Baseline (no RTFM) | 8m 16s | $22.61 | 8.21 M |
| With RTFM (FTS default) | 6m 58s | $11.14 | 3.22 M |

Δ: −51% cost, −61% tokens, −16% duration — with better factual accuracy. This is the use case RTFM was built for: navigating a large multi-domain corpus where grep misses the right paragraph.

Code task: FeatureBench (LiberCoders dataset)

11 tasks, 3 repos of varying size, 4 conditions (A = standard prompt with file paths; B = discovery, no paths; C = RTFM FTS; D = RTFM hybrid), 3 runs each.

| Repo | Size | Where RTFM helps |
|---|---|---|
| metaflow | 620 files | Everyone resolves — RTFM adds no measurable gain |
| astropy | 1,119 files | All conditions 25–30% F2P pass; none fully resolve |
| mlflow | 8,255 files | All conditions 0–5% F2P pass; none fully resolve |

On a single smaller-scope run (test_stub_generator on metaflow), RTFM cut agent time by −37 % vs the no-paths baseline. On the larger repos, the tasks themselves were too hard for Sonnet 4 to resolve inside a 20-minute timeout regardless of retrieval.

The honest caveats

  • Single model (Sonnet 4), single agent (Claude Code). Not statistically bullet-proof.
  • On small repos (< 1k files), grep is enough and RTFM adds overhead.
  • FeatureBench measures code modification, not information retrieval. It's the wrong benchmark for a retrieval tool — I'm running against it because it's what exists. Better-suited benchmarks (RepoQA, SWE-QA, LocAgent) are on the roadmap.

What this says

RTFM measurably wins when the bottleneck is "find the right paragraph in a 2,000-file corpus". It doesn't magically make unsolvable tasks solvable. The model still has to do the work — RTFM just makes sure it has the right context to do it with.


Who it's for

RTFM works anywhere your project isn't just code:

  • LegalTech — Code + tax law + regulatory specs. Ships with Legifrance XML and BOFiP parsers.
  • Research — Code + LaTeX papers + datasets. Ships with LaTeX and PDF parsers.
  • FinTech — Code + financial regulations + XBRL reports. Write an XBRL parser in 50 lines.
  • HealthTech — Code + medical records (HL7/FHIR) + clinical guidelines.
  • Solo devs with big projects — Stop watching your agent grep the same 8,000 files every session.
  • Obsidian / PKM users — Make your vault actually searchable by your AI.
  • Any regulated industry — If your project mixes code with domain documents, RTFM is for you.

Full feature list

Search & retrieval

  • FTS5 full-text search — instant, zero-config, works out of the box
  • Semantic search — optional embeddings (FastEmbed/ONNX, no GPU needed)
  • Hybrid mode — combine both, rank by relevance score
  • Metadata-first — results return file paths + scores (~300 tokens), not content dumps
  • Progressive disclosure — agent expands only the chunks it actually needs
  • Knowledge graph — wikilinks + Python imports resolved as graph edges, hub detection, centrality ranking
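One standard way to combine a full-text ranking with a semantic ranking — as hybrid mode does — is reciprocal rank fusion. RTFM's exact fusion formula isn't documented here, so read this as a generic sketch of the technique, not RTFM's implementation:

```python
def rrf(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Reciprocal rank fusion: each doc scores sum(1 / (k + rank)) over
    every ranked list it appears in; docs both rankers like float up."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc in enumerate(ranking, start=1):
            scores[doc] = scores.get(doc, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

fts = ["auth.md", "handlers.py", "sso.md"]             # FTS5 ranking
semantic = ["handlers.py", "oauth-adr.md", "auth.md"]  # embedding ranking
fused = rrf([fts, semantic])
# "handlers.py" wins: ranked highly by both lists
```

The constant `k` damps the influence of top ranks so one ranker can't dominate; 60 is the value from the original RRF paper, not an RTFM setting.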

Multi-format indexing

  • 10 parsers built-in — Markdown, Python (AST), LaTeX, YAML, JSON, Shell, PDF, XML, HTML, plain text
  • Extensible — add any format in ~50 lines of Python
  • Auto-sync hooks — index stays fresh every prompt, zero manual work
  • Incremental — only re-indexes what changed
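Incremental sync boils down to diffing the current tree against a stored snapshot. RTFM's actual sync logic isn't shown here (it may key off mtimes or other heuristics); this is a minimal hash-based sketch of the idea behind a `+3 ~1 -0` sync result:

```python
import hashlib
from pathlib import Path

def detect_changes(root: Path, seen: dict[str, str]) -> tuple[list[str], list[str], list[str]]:
    """Return (added, modified, removed) relative paths under `root`.
    `seen` maps relative path -> content hash from the previous sync."""
    current = {
        str(p.relative_to(root)): hashlib.sha256(p.read_bytes()).hexdigest()
        for p in root.rglob("*") if p.is_file()
    }
    added = [f for f in current if f not in seen]
    modified = [f for f in current if f in seen and current[f] != seen[f]]
    removed = [f for f in seen if f not in current]
    return added, modified, removed
```

Only the `added` and `modified` lists need re-parsing; everything else is untouched, which is why a warm sync is cheap.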

Integration

  • Native Claude Code plugin — /plugin install rtfm@roomi-fields/rtfm, auto-init per project
  • Pure-Python MCP server — 0 external deps, no mcp SDK / pydantic / native binaries
  • Cross-platform — Linux, macOS, Windows, WSL (only requires Python ≥ 3.10 on PATH)
  • 13 MCP tools — search, context, expand, graph, history, sync, tags, ...
  • Manual install fallback — pip install rtfm-ai for Cursor, Codex, Claude Desktop chat, any other MCP client
  • CLI + Python API — scriptable for pipelines
  • Non-invasive — doesn't touch your code, doesn't replace your editor

The parser architecture

Need to index a format nobody supports? Write a parser in ~50 lines.

from rtfm.parsers.base import BaseParser, ParserRegistry
from rtfm.core.models import Chunk
import json
from uuid import uuid4

@ParserRegistry.register
class FHIRParser(BaseParser):
    """Parse HL7 FHIR medical records."""
    extensions = ['.fhir.json']
    name = "fhir"

    def parse(self, path, metadata=None):
        data = json.loads(path.read_text())
        for entry in data.get('entry', []):
            resource = entry.get('resource', {})
            yield Chunk(
                id=resource.get('id', str(uuid4())),
                content=json.dumps(resource, indent=2),
                book_title=f"FHIR {resource.get('resourceType', 'Unknown')}",
                book_slug=resource.get('id', 'unknown'),
                page_start=1,
                page_end=1,
            )

Drop it in your project, restart Claude Code, your medical AI agent now understands FHIR records.
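The registry pattern behind this is simple enough to sketch standalone. This is illustrative only — a minimal dispatch-by-extension registry, not RTFM's actual ParserRegistry:

```python
class Registry:
    """Minimal sketch of extension-based parser dispatch (illustrative)."""
    _parsers: dict[str, type] = {}

    @classmethod
    def register(cls, parser_cls):
        # Decorator: map each declared extension to the parser class.
        for ext in parser_cls.extensions:
            cls._parsers[ext] = parser_cls
        return parser_cls

    @classmethod
    def for_path(cls, path: str):
        # Longest-suffix match so '.fhir.json' beats plain '.json'.
        for ext in sorted(cls._parsers, key=len, reverse=True):
            if path.endswith(ext):
                return cls._parsers[ext]()
        return None

@Registry.register
class JSONParser:
    extensions = ['.json']

@Registry.register
class FHIRParser:
    extensions = ['.fhir.json']

parser = Registry.for_path("records/patient.fhir.json")  # FHIRParser wins
```

Longest-suffix matching is the detail that makes compound extensions like `.fhir.json` shadow the generic `.json` handler.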

Built-in parsers

| Parser | Extensions | Strategy |
|---|---|---|
| Markdown | .md | Split by headers, YAML frontmatter extraction |
| Python | .py | AST-based: each class/function = 1 chunk |
| LaTeX | .tex | Split by \section, \chapter, etc. |
| YAML | .yaml, .yml | Split by top-level keys |
| JSON | .json | Split by top-level keys or array elements |
| Shell | .sh, .bash, .zsh | Function-aware chunking |
| PDF | .pdf | Page-based (pip install rtfm-ai[pdf]) |
| Legifrance XML | .xml | French legal codes (LEGI format) |
| BOFiP HTML | .html | French tax doctrine |
| Plain text | .js, .ts, .rs, .go, ... | Line-boundary chunks (~500 chars) |
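The plain-text fallback strategy — ~500-character chunks that break only at line boundaries — can be sketched in a dozen lines. A simplified illustration, not RTFM's actual chunker:

```python
def chunk_lines(text: str, max_chars: int = 500) -> list[str]:
    """Split text into ~max_chars chunks, breaking only at line boundaries.
    A single line longer than max_chars becomes its own oversized chunk."""
    chunks, current, size = [], [], 0
    for line in text.splitlines(keepends=True):
        if current and size + len(line) > max_chars:
            chunks.append("".join(current))
            current, size = [], 0
        current.append(line)
        size += len(line)
    if current:
        chunks.append("".join(current))
    return chunks
```

Breaking at line boundaries keeps statements and log lines intact, which matters when the chunk is handed to an LLM that has to quote it back.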

MCP tools

| Tool | What it does |
|---|---|
| rtfm_search | Search the index (FTS, semantic, or hybrid) |
| rtfm_context | Get relevant context for a subject (metadata-only) |
| rtfm_expand | Show all chunks of a source with full content |
| rtfm_discover | Fast project structure scan (~1s, no indexing needed) |
| rtfm_books | List indexed documents |
| rtfm_stats | Library statistics |
| rtfm_sync | Sync a directory (incremental) |
| rtfm_ingest | Ingest a single file |
| rtfm_tags | List all tags |
| rtfm_tag_chunks | Add tags to specific chunks |
| rtfm_remove | Remove a file from the index |
| rtfm_graph | Show dependency graph for a source (imports, links) |
| rtfm_history | File version history and memory snapshots |

CLI reference

# Search
rtfm search "authentication flow"
rtfm search "article 39" --corpus cgi --limit 5

# Sync
rtfm sync                              # All registered sources
rtfm sync /path/to/docs --corpus docs  # Specific directory
rtfm sync . --force                    # Force re-index

# Source management
rtfm add /path/to/docs --corpus docs --extensions md,pdf
rtfm sources

# Obsidian vault
rtfm vault                             # Initialize for cwd vault
rtfm vault /path/to/vault              # Specific vault
rtfm vault --regenerate                # Regenerate _rtfm/ files

# Cross-project Claude memory
rtfm memory                            # Manual snapshot
rtfm memory --install-hook             # Auto-snapshot on SessionEnd

# Status & info
rtfm status
rtfm books
rtfm tags
rtfm history path/to/file.md           # Memory version history

# Semantic search
rtfm embed                             # Generate embeddings (one-time)
rtfm semantic-search "tax deductions" --hybrid

# MCP server
rtfm serve

Python API

from rtfm import Library

lib = Library("my_library.db")

# Index
stats = lib.ingest("documents/article.md", corpus="docs")
result = lib.sync(".", corpus="my-project")  # SyncResult(+3 ~1 -0 =42)

# Search
results = lib.search("depreciation", limit=10, corpus="cgi")
results = lib.hybrid_search("amortissement fiscal", limit=10)

# Export for LLM
prompt_context = results.to_prompt(max_chars=8000)

lib.close()

Where RTFM fits

RTFM isn't a task manager. It's not an agent framework. It's the knowledge layer your agent needs underneath whatever you're already using.

┌─────────────────────────────────┐
│  GSD / Taskmaster / Claude Flow │  ← Orchestration
├─────────────────────────────────┤
│              RTFM               │  ← Knowledge (you are here)
├─────────────────────────────────┤
│          Claude Code            │  ← Execution
└─────────────────────────────────┘

Without RTFM, your orchestrator drives an agent that hallucinates. With RTFM, the agent knows what it's building on.


Contributing

Adding a parser is the easiest way to contribute — and the most impactful. See CONTRIBUTING.md.

Found a bug? Have an idea? Open an issue.

License

MIT — use it, fork it, extend it, ship it.

Author

Romain Peyrichou (@roomi-fields)


Code indexers see your code. RTFM sees everything.

⭐ Star on GitHub if RTFM saves your agent from hallucinating.
