second-brain-mcp
Self-maintaining knowledge vault: figure-level search, auto-wikilinks, and sleep-based memory compression.
second-brain MCP Server
A self-maintaining personal knowledge database, powered by MCP, DuckDB, and biological memory models.
For anyone who saves more papers, notes, and figures than they could ever re-read. second-brain turns everything you capture into a database that maintains itself: it auto-links related notes, compresses what you stop reading, and keeps every figure searchable by its content. What you saved a year ago is still one query away, at a fraction of the token cost.
Why Does This Exist?
| Problem | Solution |
|---|---|
| You save dozens of papers but can never find the right figure | search_figures("UMAP melanocyte") returns the exact panel, across every paper you've saved |
| arXiv gives you the abstract; you need the full paper | Auto-upgrades /abs/ → /html/ and fetches the complete paper with all sections, not just the abstract |
| Notes pile up; older ones never get cleaned up | Vault Sleep: low-access notes compress automatically every Sunday while you sleep (60-90% token reduction) |
| New notes stay isolated; you forget what's connected | Auto-wikilinks: every saved note is automatically linked to semantically related notes already in your vault |
| Semantic search needs a cloud API or Docker stack | Self-hosted nomic-embed-text via llama-server; BM25 fallback when offline |
| Every AI memory tool locks you into their format | Pure Markdown vault: sync with Google Drive, iCloud, or git; switch agents anytime |
| Figure context is lost when you read a paper | Every figure is downloaded, OCR'd by Claude Vision, and stored in DuckDB, searchable by gene name, p-value, axis label |
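The /abs/ to /html/ upgrade mentioned in the table can be pictured as a small URL rewrite. The sketch below is an illustration only: `upgrade_arxiv_url` is a hypothetical helper name, not a function confirmed to exist in this project, and it assumes anything that is not an arXiv abstract URL should pass through unchanged. How the server behaves when no HTML version exists is not described here, so that case is left out.

```python
from urllib.parse import urlsplit, urlunsplit


def upgrade_arxiv_url(url: str) -> str:
    """Rewrite an arXiv abstract URL to the matching /html/ URL.

    Hypothetical helper (assumption): non-arXiv hosts and paths that do not
    start with /abs/ are returned untouched.
    """
    parts = urlsplit(url)
    if parts.netloc.lower() not in {"arxiv.org", "www.arxiv.org"}:
        return url
    if not parts.path.startswith("/abs/"):
        return url
    new_path = "/html/" + parts.path[len("/abs/"):]
    return urlunsplit(
        (parts.scheme, parts.netloc, new_path, parts.query, parts.fragment)
    )


print(upgrade_arxiv_url("https://arxiv.org/abs/2405.01234"))
# https://arxiv.org/html/2405.01234
```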
The One-Command Demo
save_article("https://arxiv.org/abs/2405.01234")
↓
• /abs/ auto-upgraded to /html/: full paper, not just abstract
• Full text converted to Markdown
• All figures downloaded + OCR'd by Claude Vision
• Semantic embeddings computed
• Auto-linked to related notes already in your vault (auto-wikilinks)
• Stored in 30-resources/, queryable immediately
search_figures("UMAP cluster batch correction")
↓
• Returns the exact figure from the exact paper
• Works across your entire saved literature library
What Makes It Different
flowchart LR
subgraph input["Any Content Source"]
A1["arXiv / PubMed paper"]
A2["Web article / blog"]
A3["Local PDF / DOCX"]
A4["Personal note"]
end
subgraph core["second-brain-mcp"]
B1["Markdown note<br/>30-resources/"]
B2["Figure OCR<br/>+ VLM description"]
B3["Semantic embedding<br/>+ auto-wikilinks"]
B4["Ebbinghaus score<br/>ranking"]
B5["PNG snapshots<br/>60-90% token reduction"]
end
subgraph query["Queryable Knowledge"]
C1["search_figures<br/>'UMAP melanocyte'"]
C2["search_notes<br/>'batch correction scRNA'"]
C3["get_context<br/>top-20 relevant notes"]
end
input --> core
B1 --> B2
B1 --> B3
B3 --> B4
B4 --> B5
B2 --> C1
B3 --> C2
B4 --> C3
Eight things most self-hosted memory tools can't do, combined in one:
| Most memory tools… | second-brain |
|---|---|
| Save a link or PDF, then leave you to read and tag it | One command builds the database: save_article fetches any URL/PDF, converts to Markdown, downloads and OCRs every figure with Claude Vision, then semantic-indexes it |
| Store the arXiv abstract you pasted | Full text, not abstracts: /abs/ URLs auto-upgrade to /html/ for the complete paper (methods, results, discussion) |
| Leave new notes isolated until you tag them | The knowledge graph builds itself: every note is auto-linked to semantically related notes already in your vault |
| Cost the same whether a note is read daily or never | Memory that forgets like a brain: Ebbinghaus score ranks by recency × frequency; stale notes compress while you sleep |
| Search documents, not what's inside the figures | Figure-level search across your whole library: search_figures("p < 0.001") returns the exact panel from the exact paper |
| Forget your project decisions between sessions | The AI learns your rules: hot notes auto-extract constraints into memory/rules.md, injected at every session start |
| Grow more expensive as the vault grows | Token cost shrinks with age: PNG snapshots replace old text at 60-90% compression; frequently-read papers stay full-fidelity |
| Lock you into their database format | Zero lock-in: pure Markdown, any MCP agent, sync via any cloud drive or git |
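One way to picture the auto-linking row above: compare the new note's embedding with the embeddings already in the vault and link the closest ones. This is a minimal sketch under stated assumptions, not the project's actual linker. The function names, the `top_k` and `min_sim` parameters, and the toy 3-dimensional vectors are all hypothetical; real nomic-embed-text vectors are much longer.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def suggest_wikilinks(new_vec, vault, top_k=3, min_sim=0.6):
    """Return [[wikilinks]] for the most similar notes.

    top_k and min_sim are illustrative thresholds (assumptions).
    """
    scored = [(title, cosine(new_vec, vec)) for title, vec in vault.items()]
    scored = [(title, sim) for title, sim in scored if sim >= min_sim]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return [f"[[{title}]]" for title, _ in scored[:top_k]]


# Toy vault with made-up embeddings.
vault = {
    "harmony-notes": [0.9, 0.1, 0.0],
    "pca-sweep": [0.7, 0.7, 0.0],
    "grocery-list": [0.0, 0.0, 1.0],
}
print(suggest_wikilinks([1.0, 0.0, 0.0], vault))
# ['[[harmony-notes]]', '[[pca-sweep]]']
```

A similarity floor keeps unrelated notes (here, the grocery list) from being linked just because the vault is small.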
Cross-Session Continuity: Pick Up Where You Left Off
Every project you work on can be resumed in a new session with full context: no re-explaining, no lost progress.
flowchart LR
A["Session Start<br/>get_context()"] --> B["AI receives:<br/>• goals.md: current priorities<br/>• Top-20 recent notes<br/>• Extracted rules"]
B --> C["Work on project<br/>new_note / search / read"]
C --> D["Before ending session<br/>update_goals(...)"]
D --> E["New session<br/>get_context() again"]
E --> B
How It Works in Practice
At the end of a session, tell the agent to save state:
Update goals: currently working on the scRNA batch correction pipeline.
Completed: harmony integration. Blocked on: choosing n_components for PCA.
Next session: start from the PCA parameter sweep in 20-areas/research/harmony-notes.md
The agent calls update_goals() and optionally new_note("project", ...) for detailed progress.
At the start of the next session, just say:
Get context and continue where we left off.
The agent calls get_context() and immediately sees:
- goals.md with the state you saved
- The harmony-notes.md surfaced at the top (recently accessed, high Ebbinghaus score)
- Rules auto-extracted from that note, e.g.:
RULE: use n_components=30 for this dataset; tested 20/30/50, 30 minimises batch effect without losing resolution
RULE: exclude sample CRC_04; library size outlier confirmed by QC
These rules live in memory/rules.md and are injected at every get_context() call, so the AI carries your hard-won decisions forward automatically, without you having to repeat them.
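To make the rule injection concrete, here is a small sketch of how lines in the `RULE: ...` style shown above could be pulled out of a rules file. The exact layout of memory/rules.md is not specified here, so the parser assumes one rule per line, optionally inside a Markdown bullet; `parse_rules` is a hypothetical name.

```python
def parse_rules(text: str) -> list[str]:
    """Collect the text of every line that starts with 'RULE:'.

    Assumption: one rule per line, optionally prefixed by a '-' or '*' bullet.
    """
    rules = []
    for line in text.splitlines():
        stripped = line.strip().lstrip("-* ").strip()
        if stripped.startswith("RULE:"):
            rules.append(stripped[len("RULE:"):].strip())
    return rules


sample = """\
# Extracted rules
- RULE: use n_components=30 for this dataset
- RULE: exclude sample CRC_04
Some unrelated line
"""
for rule in parse_rules(sample):
    print(rule)
# use n_components=30 for this dataset
# exclude sample CRC_04
```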
What Gets Persisted
| What | Where | Always in context? |
|---|---|---|
| Current priorities / blocked items | memory/goals.md | Yes, every session |
| Project progress notes | 10-projects/ or 20-areas/ | Yes, if recently accessed |
| Decisions and rationale | decisions/ | via get_decisions() |
| Extracted rules from notes | memory/rules.md | Yes, every session |
| Saved papers and figures | 30-resources/ | via search_notes/figures |
This works across any project: bioinformatics analysis, coding, writing, research. Save state with one sentence at the end of a session; resume instantly at the start of the next.
Example Queries
# Resume a project from last session
get_context() # → goals + recent notes + rules loaded automatically
# Find a specific figure panel across all saved papers
search_figures("p < 0.001 UMAP cluster")
# Semantic search across all notes
search_notes("single cell integration batch correction")
# Decision records for a specific project
get_decisions("MyProject")
Memory Architecture: Biological Analogy
| Biological Brain | This System |
|---|---|
| Hippocampal consolidation during sleep | Vault Sleep: weekly LLM-compression of old low-access notes |
| Ebbinghaus forgetting curve | Score-based ranking: access_count / ln(age_days + 1) |
| Visual long-term memory | PNG snapshots: resolution degrades gracefully with age |
| Associative recall | Semantic search + auto-generated [[wikilinks]] |
| Sleep-dependent consolidation | launchd cron, runs Sunday 02:00 while you sleep |
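The score can be sketched in a few lines, using the `access_count / ln(age_days + 1)` form given under Cognitive Science Foundations. One detail is an assumption of this sketch and not something the README states: a note created today has `ln(0 + 1) = 0`, so the age is clamped to at least one day to avoid dividing by zero.

```python
import math


def ebbinghaus_score(access_count: int, age_days: float) -> float:
    """access_count / ln(age_days + 1), with age clamped to >= 1 day.

    The clamp is an assumption to keep the denominator positive for
    brand-new notes; the project's own handling may differ.
    """
    effective_age = max(age_days, 1.0)
    return access_count / math.log(effective_age + 1.0)


# A note read 5 times over 120 days vs. a note read once in a year.
print(round(ebbinghaus_score(5, 120), 3))
print(round(ebbinghaus_score(1, 365), 3))
```

With equal access counts the score falls as the note ages, and each additional read raises it, which is the recency × frequency behaviour the ranking relies on.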
Token Efficiency
Memory that gets cheaper over time, unlike flat-file systems where old notes cost the same forever.
| Note age | Token cost | Reduction vs. fresh |
|---|---|---|
| Fresh (0-3 months) | ~1,000 tokens | none |
| 3-6 months | ~400 tokens | 60% |
| 6-12 months | ~256 tokens | 74% |
| 1 year+ | ~100 tokens | 90% |
Tier assigned by score × age (adaptive). Frequently-accessed notes stay full-text regardless of age.
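The reduction percentages follow directly from the approximate token counts quoted for each tier. A quick check, taking ~1,000 tokens as the full-text baseline (the tier names `large`, `base`, and `small` are the ones used in the Vault Sleep flow):

```python
FULL_TEXT_TOKENS = 1000  # approximate cost of a fresh note, from the figures above
TIER_TOKENS = {"large": 400, "base": 256, "small": 100}


def reduction_pct(full: int, compressed: int) -> float:
    """Percentage of tokens saved relative to the full-text note."""
    return 100 * (full - compressed) / full


for tier, tokens in TIER_TOKENS.items():
    print(f"{tier}: {reduction_pct(FULL_TEXT_TOKENS, tokens):.1f}% fewer tokens")
# large: 60.0% fewer tokens
# base: 74.4% fewer tokens
# small: 90.0% fewer tokens
```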
Search Performance
Measured on Apple Silicon MacBook (20-rep average, BM25-only mode).
| Vault Size | BM25 p50 | Hybrid p50 | Recall@1 | Recall@5 | MRR |
|---|---|---|---|---|---|
| 10 notes | 21 ms | 37 ms | 30% | 60% | 0.42 |
| 50 notes | 25 ms | 39 ms | 70% | 90% | 0.78 |
| 100 notes | 27 ms | 45 ms | 70% | 80% | 0.73 |
Hybrid mode adds roughly 14 to 18 ms for embedding lookup. Both modes scale sub-linearly with vault size.
Recall figures at this scale (10 to 100 notes) carry high sample variance: a single ambiguous query shifts Recall@1 by 10 percentage points. Treat them as directional, not as benchmarks against large corpora; the takeaway is that hybrid consistently beats BM25-only on relevance for a fixed query set.
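For readers unfamiliar with the metrics in the table, the sketch below shows how Recall@k and MRR are typically computed when each query has exactly one relevant note. The rank list is invented for illustration and is not the benchmark's data; `None` marks a query whose relevant note was not retrieved at all.

```python
from typing import Optional, Sequence


def recall_at_k(ranks: Sequence[Optional[int]], k: int) -> float:
    """Fraction of queries whose relevant note appears in the top k results."""
    hits = sum(1 for rank in ranks if rank is not None and rank <= k)
    return hits / len(ranks)


def mean_reciprocal_rank(ranks: Sequence[Optional[int]]) -> float:
    """Average of 1/rank over all queries; a miss contributes 0."""
    return sum(1.0 / rank for rank in ranks if rank is not None) / len(ranks)


# Hypothetical 1-based ranks for 10 queries (made-up data).
ranks = [1, 1, 2, 1, None, 3, 1, 1, 6, 1]
print(recall_at_k(ranks, 1))   # 0.6
print(recall_at_k(ranks, 5))   # 0.8
print(round(mean_reciprocal_rank(ranks), 2))  # 0.7
```

With only 10 queries, moving a single result in or out of first place changes Recall@1 by 0.1, which is why small-vault numbers swing so much.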
System Architecture
┌─────────────────────────────────────────────────────┐
│                   AI Agent Layer                    │
│         Claude Code · Gemini CLI · Any MCP          │
└──────────────────────┬──────────────────────────────┘
                       │ MCP Protocol (19 tools)
┌──────────────────────▼──────────────────────────────┐
│                 Layer 2: MCP Server                 │
│                      server.py                      │
│    get_context · search_notes · save_article · …    │
└──────┬───────────────┬───────────────┬──────────────┘
       │               │               │
┌──────▼──────┐ ┌──────▼──────┐ ┌──────▼──────┐
│ vault_sleep │ │  vault_db   │ │   figures   │
│  compress   │ │ DuckDB FTS  │ │  PNG snap   │
│  Phase 3-9  │ │ + semantic  │ │  OCR · VLM  │
└──────┬──────┘ └──────┬──────┘ └─────────────┘
       │               │
┌──────▼───────────────▼──────────────────────────────┐
│               Layer 0: Markdown Vault               │
│  00-inbox · 10-projects · 20-areas · 30-resources   │
│     40-archive · decisions · memory · templates     │
│       (syncs via Google Drive / iCloud / git)       │
└─────────────────────────────────────────────────────┘
Vault Sleep: Auto-compression Flow
Every Sunday 02:00 (launchd, no interaction needed)
                 │
                 ▼
sync_index + embeddings
                 │
                 ▼  age > 90d AND Ebbinghaus score ≤ 0.5
┌──────────────────────────────────────┐
│       Adaptive Tier Selection        │
│ score > 1.5 → text  (keep full)      │ ← frequently-read: never compressed
│ score > 0.8 → large ~400 tokens      │
│ score > 0.3 → base  ~256 tokens      │
│ otherwise   → small ~100 tokens      │
└────────────────┬─────────────────────┘
                 │
Gemini CLI → Claude CLI → naive (auto-fallback, no LLM required)
                 │
compressed → vault / original → 40-archive/ / snapshot → .png
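The candidate filter and the tier thresholds in the flow above translate into two small functions. This is a sketch that follows the diagram literally. The README also says tiers are assigned by score × age, and it does not spell out how the candidate filter and the tier score interact, so the two checks are kept separate here. Function names are hypothetical.

```python
def is_sleep_candidate(age_days: float, score: float) -> bool:
    """Candidate rule from the flow: older than 90 days and score <= 0.5."""
    return age_days > 90 and score <= 0.5


def select_tier(score: float) -> str:
    """Map a score to a snapshot tier using the thresholds in the diagram."""
    if score > 1.5:
        return "text"   # keep full text, never compressed
    if score > 0.8:
        return "large"  # ~400 tokens
    if score > 0.3:
        return "base"   # ~256 tokens
    return "small"      # ~100 tokens


for score in (2.0, 1.0, 0.4, 0.1):
    print(score, "->", select_tier(score))
# 2.0 -> text
# 1.0 -> large
# 0.4 -> base
# 0.1 -> small
```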
MCP Tools (19 total)
| Tool | Description |
|---|---|
| get_context | Session start: goals + top-20 Ebbinghaus-ranked notes + auto-rules |
| save_article | Fetch URL/PDF → Markdown + auto-extract figures |
| search_notes | Hybrid BM25 + semantic search across all notes |
| search_figures | Search figure OCR text / VLM descriptions |
| extract_figures_for | Manually trigger figure extraction for a saved article |
| read_note | Read note + record access (updates Ebbinghaus score) |
| read_note_as_image | Return PNG snapshot for token-efficient reading |
| new_note | Create note with correct template and folder by type |
| get_decisions | List ADR decision records, optionally filtered by project |
| update_goals | Update memory/goals.md |
| sync_index | Rebuild DuckDB index from vault files |
| index_stats | Show note counts by type |
| vault_sleep | Compress old low-activity notes (dry_run=True by default) |
| sleep_status | Show compression candidates without acting |
| snapshot_note_tool | Render note to PNG at chosen resolution tier |
| extract_rules_tool | Extract L3 rules from frequently-accessed notes |
| consolidate_tool | Merge semantically similar notes into one abstract note |
| update_links_tool | Refresh auto-generated [[wikilinks]] |
| prune_archive_tool | Delete archived originals that have a PNG snapshot |
Test Results
tests/test_figures.py 19 passed (OCR, snapshots, VLM)
tests/test_server.py 13 passed (MCP tools, path safety)
tests/test_vault_db.py 39 passed (FTS, semantic search, embeddings)
tests/test_vault_sleep.py 44 passed (compression, consolidation, rules, prune)
────────────────────────────────────────
115 passed in 3.37s
Installation
Prerequisites
| Dependency | Required | Notes |
|---|---|---|
| Python 3.11+ | Yes | |
| uv | Yes | Package manager |
| Playwright | Yes | PNG snapshot rendering |
| llama-server | Optional | Semantic search; BM25 fallback if absent |
| nomic-embed-text-v1.5.Q8_0.gguf | Optional | ~300 MB embedding model |
| Gemini CLI or ANTHROPIC_API_KEY | Optional | Better compression quality; naive fallback if absent |
Quick Start (PyPI, recommended)
Step 1: Install
pip install mcp-second-brain
playwright install chromium
Step 2: Create your vault
mkdir -p ~/second-brain/{00-inbox,10-projects,20-areas,30-resources,40-archive,decisions,memory,templates}
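The `mkdir` line relies on shell brace expansion, which not every shell supports. As a portable alternative, the same eight folders can be created from Python. This is a convenience sketch, not part of the package; `create_vault` and `VAULT_FOLDERS` are hypothetical names.

```python
from pathlib import Path

# Folder names taken from the mkdir command above.
VAULT_FOLDERS = [
    "00-inbox", "10-projects", "20-areas", "30-resources",
    "40-archive", "decisions", "memory", "templates",
]


def create_vault(root: Path) -> Path:
    """Create the vault root and its top-level folders (safe to re-run)."""
    for name in VAULT_FOLDERS:
        (root / name).mkdir(parents=True, exist_ok=True)
    return root
```

Calling `create_vault(Path.home() / "second-brain")` produces the same layout as the shell command.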
Step 3: Register with your AI agent
Option A: Claude Code (CLI)
claude mcp add --scope user second-brain \
--env SECOND_BRAIN_PATH="$HOME/second-brain" \
-- python -m mcp_second_brain
Option B: Claude Desktop. Add to ~/Library/Application Support/Claude/claude_desktop_config.json:
{
"mcpServers": {
"second-brain": {
"command": "python",
"args": ["-m", "mcp_second_brain"],
"env": { "SECOND_BRAIN_PATH": "/path/to/your/vault" }
}
}
}
Step 4: Index your vault
In Claude Code or Claude Desktop, tell the agent:
Run sync_index to build the initial index.
Development Install (clone)
git clone https://github.com/ddmanyes/second-brain-mcp
cd second-brain-mcp
uv sync
uv run playwright install chromium
Then register with Claude Code:
claude mcp add --scope user second-brain \
--env SECOND_BRAIN_PATH="$HOME/second-brain" \
-- uv run --project /path/to/second-brain-mcp python server.py
Environment Variables
| Variable | Default | Description |
|---|---|---|
| SECOND_BRAIN_PATH | ~/second-brain | Path to your vault directory |
| EMBED_URL | http://localhost:11435/v1/embeddings | Embedding server endpoint |
| EMBED_MODEL | nomic-embed-text | Embedding model name |
| EMBED_PORT | 11435 | llama-server port |
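The variables and defaults in the table can be read with a small settings loader like the one below. This is an illustrative sketch, not the server's actual configuration code: `Settings`, `load_settings`, and `embedding_request_body` are hypothetical names, and the request body assumes the endpoint accepts the OpenAI-style `{"model": ..., "input": ...}` JSON that llama-server's /v1/embeddings route is designed to be compatible with.

```python
import json
import os
from dataclasses import dataclass
from pathlib import Path
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    vault_path: Path
    embed_url: str
    embed_model: str
    embed_port: int


def load_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    """Read the documented variables, falling back to the documented defaults."""
    env = os.environ if env is None else env
    return Settings(
        vault_path=Path(env.get("SECOND_BRAIN_PATH", "~/second-brain")).expanduser(),
        embed_url=env.get("EMBED_URL", "http://localhost:11435/v1/embeddings"),
        embed_model=env.get("EMBED_MODEL", "nomic-embed-text"),
        embed_port=int(env.get("EMBED_PORT", "11435")),
    )


def embedding_request_body(settings: Settings, text: str) -> bytes:
    """JSON payload for an OpenAI-style embeddings call (assumed format)."""
    return json.dumps({"model": settings.embed_model, "input": text}).encode("utf-8")


settings = load_settings({})  # empty mapping: every value comes from the defaults
print(settings.embed_url, settings.embed_model, settings.embed_port)
```

Expanding `~` inside the loader means the vault path resolves correctly even when the shell or the MCP client passes the tilde through literally.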
Auto-start (macOS, optional)
# Embedding server: always on, restarts on crash
cp examples/launchd/com.yourname.llama-embed.plist ~/Library/LaunchAgents/
# Edit paths inside the file, then:
launchctl load ~/Library/LaunchAgents/com.yourname.llama-embed.plist
# Weekly vault maintenance: every Sunday 02:00
cp examples/launchd/com.yourname.vault-sleep.plist ~/Library/LaunchAgents/
launchctl load ~/Library/LaunchAgents/com.yourname.vault-sleep.plist
Troubleshooting
| Symptom | Likely cause | Fix |
|---|---|---|
| Semantic search silently falls back to BM25 | llama-server not running on EMBED_PORT | Start the embedding server (see Auto-start); verify with curl localhost:11435/v1/embeddings |
| read_note_as_image / snapshots fail | Playwright chromium not installed | uv run playwright install chromium |
| vault_sleep never compresses anything | Still in dry-run mode (dry_run=True is the default), or no eligible notes; without Gemini CLI / ANTHROPIC_API_KEY the naive fallback is used | Call vault_sleep with dry_run=False to apply changes; install Gemini CLI or export ANTHROPIC_API_KEY for better compression quality; remember only notes >90 days old with Ebbinghaus score ≤ 0.5 are candidates (sleep_status shows them) |
| Agent sees no notes / empty results | Index not built | Run sync_index once after install (and after bulk file changes) |
| Notes land in the wrong place | SECOND_BRAIN_PATH unset or wrong | Set it in your MCP config env block; defaults to ~/second-brain |
| Tools unavailable when working in other project folders | Installed as local config instead of user scope | Re-register with --scope user: claude mcp remove second-brain -s local && claude mcp add --scope user second-brain ... |
Vault Structure
vault/
├── 00-inbox/ # Unprocessed captures; clear daily
├── 10-projects/ # Active projects
├── 20-areas/
│   ├── research/ # Ongoing research domains
│   ├── coding/ # Dev tools and workflows
│   └── consolidated/ # Auto-merged similar notes (Phase 8)
├── 30-resources/ # ← Papers and articles (save_article writes here)
├── 40-archive/ # Compressed originals (auto-managed by vault_sleep)
├── decisions/ # Architecture Decision Records (ADR format)
├── memory/
│   ├── goals.md # Current priorities, injected at every session start
│   ├── index.md # Vault map
│   └── rules.md # Auto-extracted L3 rules, injected at every session start
└── templates/ # Note templates (note, decision, project, research)
Running Tests
uv run pytest tests/ -v
uv run python benchmark.py --quick --markdown # search latency + accuracy report
References & Acknowledgements
Papers That Directly Inspired This Project
| Paper | Where Used |
|---|---|
| Do Language Models Need Sleep? Offline Recurrence for Improved Online Inference (2026) | Phase 3 Vault Sleep: hippocampal replay as batch memory consolidation |
| Experience Compression Spectrum: Unifying Memory, Skills, and Rules in LLM Agents (2026) | Phase 9 adaptive tier: score × age dual-axis; addresses the "missing diagonal" in existing systems |
| DeepSeek-OCR: Contexts Optical Compression (2025) | Phase 4 PNG tiers: image as compressed medium, 10× compression at 97% fidelity |
| MemOCR: Layout-Aware Visual Memory for Efficient Long-Horizon Reasoning (2026) | Phase 4 vision API: Playwright render → VLM reading pipeline |
| Active Context Compression: Autonomous Memory Management in LLM Agents (2026) | Phase 3 design comparison: session-level vs. nightly batch consolidation |
| SimpleMem: Efficient Lifelong Memory for LLM Agents (2026) | Phase 8 consolidation: 3-stage semantic compression, 30× token reduction |
| Memory for Autonomous LLM Agents: Mechanisms, Evaluation, and Emerging Frontiers (2026) | Architecture positioning: mechanisms, evaluation, and frontiers |
Cognitive Science Foundations
- Ebbinghaus, H. (1885). Über das Gedächtnis. Forgetting curve; basis for access_count / ln(age_days + 1)
- Stickgold, R. (2005). Nature, 437, 1272-1278. Sleep-dependent memory consolidation
Built With
MarkItDown · DuckDB · llama.cpp · nomic-embed-text · FastMCP · Playwright · Anthropic Claude API
Contributing
PRs and Issues welcome. Please open an issue first to discuss significant changes.
License
MIT License. © 2026 Chan Chi Ru. See LICENSE.