Open Archives

MCP server for the Open Archives genealogical search engine.

Open Archives MCP Server

Production-grade hybrid MCP + HTTP + SSE server generated from the Open Archives OpenAPI specification. Covers genealogical records (births, deaths, marriages, censuses), archive statistics, historical weather, and full-text page transcriptions of historical documents.

OpenAPI source used to generate tools:

../api/openapi.yaml   (local)
https://api.openarchieven.nl/openapi.yaml   (remote)

Overview

A schema-aware server that automatically converts the OpenAPI specification into callable tools and exposes them through multiple transports:

  • MCP Remote (JSON-RPC over StreamableHTTP)
  • HTTP JSON API
  • SSE streaming with auto-pagination
  • Chunked HTTP streaming with auto-pagination
  • Redis caching (optional)
  • Health checks

Use with Claude

Add as a custom connector

A hosted endpoint is available — no installation required.

In claude.ai or Claude Desktop:

  1. Open Settings → Connectors.
  2. Click Add custom connector.
  3. Enter the URL: https://mcp.openarchieven.nl/
  4. Save and approve when prompted.

No authentication is required — Open Archives is a public dataset.

Example queries

Once the connector is added you can ask Claude, for example:

  • "Who are the ancestors of Johannes Gregorius Marinus Coret? Give me an overview, including source citations in markdown format with the links to the original archives if possible, otherwise provide the Open Archieven links and provide a tree in SVG."
  • "Did Johannes Coret and Antonia Uphus have descendants? Give me an overview, including source citations in markdown format with the links to the original archives if possible, otherwise provide the Open Archieven links, include thumbnails from scans from archival sources if available and provide a tree in SVG."
  • "Provide me with a list of sourcetypes per archive(name) where I can find information about the Coret family. Show the result in a markdown document including links to the search pages on Open Archieven. Instead of the archive code use the ISIL if available."
  • "What was the weather in Amsterdam on 1953-02-01?"
  • "What does the 1850 census say about Utrecht?"

Claude will call the matching tool (search_records, show_record, get_marriages, get_historical_weather, get_census_data, …) and return links to the corresponding record pages on https://www.openarchieven.nl.

Self-hosted (stdio)

If you prefer running the server locally as a stdio MCP server:

npx -y @coret/openarchieven-mcp-server

Core Features

OpenAPI Auto-Generation

Every API operation becomes a tool automatically via generate.ts.

All 21 operations:

Tool Name | Description
search_records | Search genealogical records
show_record | Show a single genealogical record
match_record | Match a person to birth and death records
get_births_years_ago | List births from N years ago
get_births | Find birth records
get_deaths | Find death records
get_marriages | Find marriage records
get_archives | List all archives with statistics
get_record_stats | Record count per archive
get_source_type_stats | Record count per source type
get_event_type_stats | Record count per event type
get_comment_stats | Comment count statistics
get_family_name_stats | Family name frequency
get_first_name_stats | First name frequency
get_profession_stats | Profession frequency
get_breakdown | Cross-tabulation grouped by archive, source type, event type, place or year
get_historical_weather | Historical weather from KNMI
get_census_data | Dutch census data 1795–1899
search_transcriptions | Full-text search across page transcriptions of historical documents
browse_transcriptions | Hierarchically browse transcriptions by source archive, archive number or inventory
show_transcription | Retrieve a single page transcription by id

Note: The callback (JSONP) parameter present in the upstream API is excluded from all tools — it is irrelevant in an MCP/JSON-RPC context.
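The conversion from an OpenAPI operation to a tool can be pictured roughly as follows. This is a minimal sketch, not the contents of generate.ts: the interfaces, the operationToTool function, the use of operationId as the tool name, and the example parameters are all assumptions made for illustration. Only the exclusion of the callback parameter comes from the note above.

```typescript
// Hypothetical, simplified shapes; the real generator may use different ones.
interface OpenApiParameter {
  name: string;
  required?: boolean;
  description?: string;
  schema?: Record<string, unknown>;
}

interface OpenApiOperation {
  operationId: string;
  summary?: string;
  parameters?: OpenApiParameter[];
}

interface ToolDefinition {
  name: string;
  description: string;
  inputSchema: {
    type: "object";
    properties: Record<string, Record<string, unknown>>;
    required: string[];
  };
}

// JSONP-only parameters that are dropped from every tool.
const EXCLUDED_PARAMETERS = new Set<string>(["callback"]);

function operationToTool(op: OpenApiOperation): ToolDefinition {
  const properties: Record<string, Record<string, unknown>> = {};
  const required: string[] = [];

  for (const param of op.parameters ?? []) {
    if (EXCLUDED_PARAMETERS.has(param.name)) continue;
    // Copy the parameter schema so validation later sees the same constraints.
    const schema: Record<string, unknown> = { ...(param.schema ?? {}) };
    if (param.description) schema.description = param.description;
    properties[param.name] = schema;
    if (param.required) required.push(param.name);
  }

  return {
    name: op.operationId,
    description: op.summary ?? op.operationId,
    inputSchema: { type: "object", properties, required },
  };
}

// Illustrative operation; the parameter details are invented for the example.
const searchTool = operationToTool({
  operationId: "search_records",
  summary: "Search genealogical records",
  parameters: [
    { name: "name", required: true, schema: { type: "string" } },
    { name: "number_show", schema: { type: "integer", minimum: 1 } },
    { name: "callback", schema: { type: "string" } },
  ],
});

console.log(Object.keys(searchTool.inputSchema.properties));
```

Keeping the parameter schema intact, rather than flattening it to a type name, is what allows the same definition to drive validation.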


Schema-Perfect Validation

Uses actual OpenAPI parameter schemas. Validates:

  • required parameters
  • integer fields
  • number fields
  • enum values
  • minimum / maximum constraints
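The checks listed above could look something like the sketch below. The ParamSchema and ToolSchema shapes, the validateArguments function, the error wording, and the example limits (maximum 100, the lang enum) are hypothetical; the server's actual validator is not shown in this README.

```typescript
// Hypothetical subset of an OpenAPI parameter schema.
interface ParamSchema {
  type?: "string" | "integer" | "number" | "boolean";
  enum?: readonly (string | number)[];
  minimum?: number;
  maximum?: number;
}

interface ToolSchema {
  properties: Record<string, ParamSchema>;
  required: string[];
}

// Returns a list of human-readable problems; an empty list means the arguments pass.
function validateArguments(schema: ToolSchema, args: Record<string, unknown>): string[] {
  const errors: string[] = [];

  for (const name of schema.required) {
    if (args[name] === undefined || args[name] === null) {
      errors.push(`${name}: required parameter is missing`);
    }
  }

  for (const [name, value] of Object.entries(args)) {
    const param = schema.properties[name];
    // Unknown parameters and absent values are ignored in this sketch.
    if (!param || value === undefined || value === null) continue;

    if (param.type === "integer" || param.type === "number") {
      if (typeof value !== "number" || !Number.isFinite(value)) {
        errors.push(`${name}: expected a ${param.type}`);
        continue;
      }
      if (param.type === "integer" && !Number.isInteger(value)) {
        errors.push(`${name}: expected an integer`);
        continue;
      }
      if (param.minimum !== undefined && value < param.minimum) {
        errors.push(`${name}: must be >= ${param.minimum}`);
      }
      if (param.maximum !== undefined && value > param.maximum) {
        errors.push(`${name}: must be <= ${param.maximum}`);
      }
    }

    if (param.enum && !param.enum.includes(value as string | number)) {
      errors.push(`${name}: must be one of ${param.enum.join(", ")}`);
    }
  }

  return errors;
}

// Example schema with invented constraints.
const searchSchema: ToolSchema = {
  properties: {
    name: { type: "string" },
    number_show: { type: "integer", minimum: 1, maximum: 100 },
    lang: { type: "string", enum: ["nl", "en"] },
  },
  required: ["name"],
};

console.log(validateArguments(searchSchema, { number_show: 2.5 }));
```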

Multiple Interfaces

MCP Remote (StreamableHTTP)

POST /       ← canonical public endpoint (mcp.openarchieven.nl)
POST /mcp    ← local / legacy alias

Stateless JSON-RPC transport — a new MCP server instance is created per request.

Origin validation: Browser requests must come from claude.ai, claude.com, or any domain listed in ALLOWED_ORIGINS. Requests with no Origin header (native MCP clients, curl, server-to-server) are accepted. Unknown origins receive HTTP 403.
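The origin rule described above can be expressed as a small pure function. The sketch below is an illustration under assumptions: the function names are hypothetical, origins are compared as exact strings including the https:// scheme, and whether the real server also accepts subdomains is not stated in this README.

```typescript
// Origins that are always accepted, per the description above.
const BUILT_IN_ORIGINS: readonly string[] = ["https://claude.ai", "https://claude.com"];

// Parse a comma-separated ALLOWED_ORIGINS value into a clean list.
function parseAllowedOrigins(envValue: string | undefined): string[] {
  return (envValue ?? "")
    .split(",")
    .map((entry) => entry.trim())
    .filter((entry) => entry.length > 0);
}

// True when the request may proceed; false corresponds to an HTTP 403 response.
function isOriginAllowed(origin: string | undefined, extraOrigins: readonly string[]): boolean {
  // Native MCP clients, curl and server-to-server calls send no Origin header.
  if (origin === undefined || origin === "") return true;
  return BUILT_IN_ORIGINS.includes(origin) || extraOrigins.includes(origin);
}

const extras = parseAllowedOrigins("https://a.example, https://b.example");
console.log(isOriginAllowed("https://b.example", extras));
console.log(isOriginAllowed("https://evil.example", extras));
```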

HTTP JSON

GET  /tools
POST /tools/:name

SSE Streaming (auto-paginating)

GET /events/:name

Chunked HTTP Streaming (auto-paginating)

POST /stream/:name

Discovery Metadata (/.well-known)

Static, hand-editable JSON files served verbatim from well-known/:

GET /.well-known/mcp/server-card.json   ← SEP-1649 MCP Server Card
GET /.well-known/mcp.json               ← alias of the server card
GET /.well-known/agent-card.json        ← A2A v0.3 Agent Card
GET /.well-known/agent.json             ← alias of the agent card

Edit well-known/mcp-server-card.json and well-known/agent-card.json directly — no restart required (files are read on each request). Responses are sent with Content-Type: application/json; charset=utf-8 and Cache-Control: public, max-age=3600.


Pagination

Streaming endpoints (/events/:name, /stream/:name) automatically paginate through results for endpoints that support a start offset:

  • Increments start by number_show per page
  • Stops when results are exhausted or after 20 pages (safety cap)
  • SSE sends a : heartbeat comment every 10 seconds to keep connections alive
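The pagination rules above can be sketched as a loop around an injected page fetcher. The response shape (response.number_found, response.docs) follows the streaming example in this README; the function names, the default page size of 10, and the decision to skip empty pages are assumptions.

```typescript
// Response shape as shown in the streaming examples; other names are hypothetical.
interface SearchPage {
  response: { number_found: number; docs: unknown[] };
}

type FetchPage = (args: Record<string, unknown>) => Promise<SearchPage>;

const MAX_PAGES = 20; // safety cap

// Decide where the next page starts, or return null when pagination should stop.
function nextStart(start: number, pageSize: number, pagesFetched: number, page: SearchPage): number | null {
  if (page.response.docs.length === 0) return null; // results exhausted
  if (pagesFetched >= MAX_PAGES) return null; // safety cap reached
  const next = start + pageSize;
  return next < page.response.number_found ? next : null;
}

// Yields one page at a time, incrementing start by number_show after each page.
async function* paginate(
  fetchPage: FetchPage,
  args: Record<string, unknown>,
  pageSize: number = 10,
): AsyncGenerator<SearchPage> {
  let start: number | null = 0;
  let pagesFetched = 0;
  while (start !== null) {
    const page: SearchPage = await fetchPage({ ...args, start, number_show: pageSize });
    pagesFetched += 1;
    if (page.response.docs.length > 0) yield page;
    start = nextStart(start, pageSize, pagesFetched, page);
  }
}
```

A streaming handler would consume this with for await (const page of paginate(fetchPage, args)) and write each page to the SSE or chunked response as it arrives.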

Redis Cache

Optional Redis support.

If Redis is running, upstream responses are cached with a per-tool TTL tuned to the data's volatility:

Category | TTL | Tools
Immutable historical | never expires | get_historical_weather, get_census_data
Slowly-changing metadata | 7 days | get_archives
Stats aggregations | 1 day | get_record_stats, get_source_type_stats, get_event_type_stats, get_comment_stats, get_family_name_stats, get_first_name_stats, get_profession_stats, get_breakdown
Individual record lookups | 1 day | show_record, show_transcription
Search-style | 6 hours | search_records, match_record, get_births, get_deaths, get_marriages, search_transcriptions, browse_transcriptions
Date-bound | next UTC midnight | get_births_years_ago

CACHE_TTL (default 3600) is the fallback for any tool not in the map above.
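One way to model the TTL strategy is a lookup from tool name to a rule, with the fallback applied to anything unlisted. The sketch below is illustrative and is not the contents of cache-ttl.ts: the TtlRule type, the function names, and the use of null for "never expires" are assumptions.

```typescript
type TtlRule = number | "never" | "utc-midnight";

const HOUR = 3600;
const DAY = 24 * HOUR;

const TTL_RULES = new Map<string, TtlRule>();

function assign(rule: TtlRule, tools: string[]): void {
  for (const tool of tools) TTL_RULES.set(tool, rule);
}

assign("never", ["get_historical_weather", "get_census_data"]);
assign(7 * DAY, ["get_archives"]);
assign(DAY, [
  "get_record_stats", "get_source_type_stats", "get_event_type_stats",
  "get_comment_stats", "get_family_name_stats", "get_first_name_stats",
  "get_profession_stats", "get_breakdown", "show_record", "show_transcription",
]);
assign(6 * HOUR, [
  "search_records", "match_record", "get_births", "get_deaths",
  "get_marriages", "search_transcriptions", "browse_transcriptions",
]);
assign("utc-midnight", ["get_births_years_ago"]);

// Seconds from `now` until the next 00:00 UTC (at least 1).
function secondsUntilUtcMidnight(now: Date): number {
  const next = Date.UTC(now.getUTCFullYear(), now.getUTCMonth(), now.getUTCDate() + 1);
  return Math.max(1, Math.ceil((next - now.getTime()) / 1000));
}

// Returns the TTL in seconds, or null when the entry should never expire.
function ttlSeconds(tool: string, now: Date, fallback: number = 3600): number | null {
  const rule = TTL_RULES.get(tool) ?? fallback;
  if (rule === "never") return null;
  if (rule === "utc-midnight") return secondsUntilUtcMidnight(now);
  return rule;
}

console.log(ttlSeconds("search_records", new Date()));
```

Expiring get_births_years_ago at UTC midnight fits the tool's semantics: its answer depends on the current date, so a fixed duration could serve yesterday's result.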

If Redis is unavailable:

  • server still runs normally (degraded mode)

Rate Limiting

The upstream API enforces 4 requests per second per IP. The server queues all upstream calls through a token-bucket rate limiter (configurable via RATE_LIMIT_RPS).
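A token bucket of this kind can be sketched in a few lines. The class and method names below are hypothetical and the server's own limiter may differ, for example in burst capacity or in how queued calls are ordered. The clock is passed in explicitly so the behaviour is easy to reason about.

```typescript
class TokenBucket {
  private readonly ratePerSecond: number;
  private readonly capacity: number;
  private tokens: number;
  private lastRefill: number;

  constructor(ratePerSecond: number, capacity: number = ratePerSecond, now: number = Date.now()) {
    this.ratePerSecond = ratePerSecond;
    this.capacity = capacity;
    this.tokens = capacity; // start with a full bucket
    this.lastRefill = now;
  }

  // Returns 0 when a token was taken, otherwise the number of milliseconds to wait.
  tryTake(now: number = Date.now()): number {
    const elapsedSeconds = (now - this.lastRefill) / 1000;
    this.tokens = Math.min(this.capacity, this.tokens + elapsedSeconds * this.ratePerSecond);
    this.lastRefill = now;
    if (this.tokens >= 1) {
      this.tokens -= 1;
      return 0;
    }
    return Math.ceil(((1 - this.tokens) / this.ratePerSecond) * 1000);
  }
}

// Wait until the bucket grants a token; upstream calls would await this first.
async function acquire(bucket: TokenBucket): Promise<void> {
  for (;;) {
    const waitMs = bucket.tryTake();
    if (waitMs === 0) return;
    await new Promise<void>((resolve) => setTimeout(resolve, waitMs));
  }
}

const limiter = new TokenBucket(4); // matches the default RATE_LIMIT_RPS of 4
console.log(limiter.tryTake());
```

Note that a bucket like this lives in one process. With several server instances behind the same IP, each instance would have its own bucket, which is why a lower per-instance rate or a shared queue is needed in that setup.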


Health Checks

GET /health

Project Files

generate.ts
server.ts
tsconfig.json
package.json
.env.example
generated/
  tools.json
  spec.json

Requirements

  • Node.js 18+
  • npm
  • optional Redis server

Configuration

Copy .env.example to .env and adjust:

cp .env.example .env

Variable | Default | Description
PORT | 3001 | HTTP port
OPENAPI_PATH | ../api/openapi.yaml | Path or URL to OpenAPI spec
UPSTREAM_BASE | https://api.openarchieven.nl/1.1 | Upstream API base URL
RATE_LIMIT_RPS | 4 | Upstream requests per second
REDIS_URL | redis://localhost:6379/5 | Redis connection URL (db 5)
CACHE_TTL | 3600 | Fallback cache TTL in seconds (used for tools not in the per-tool map; see Redis Cache)
LOG_LEVEL | info | trace debug info warn error fatal
NODE_ENV | (unset) | Set to production for JSON logs (default: pretty-printed)
ALLOWED_ORIGINS | (empty) | Extra Origin headers allowed on the MCP endpoint (comma-separated). Claude domains and requests without an Origin header are always allowed.

Install

npm install

Generate Tools from OpenAPI YAML

Run from local spec:

npx tsx generate.ts

Or from remote URL:

npx tsx generate.ts https://api.openarchieven.nl/openapi.yaml

Expected result:

Generated 21 tools
Output: generated/tools.json, generated/spec.json

Creates:

generated/tools.json
generated/spec.json

Start Server

npx tsx server.ts

Expected startup (development — pretty-printed):

[12:00:00] INFO: Open Archieven MCP server started
    port: 3001
    tools: 21
    upstream: "https://api.openarchieven.nl/1.1"
    rateLimit: "4 req/s"
    redis: "redis://localhost:6379/5"
    env: "development"

In production (NODE_ENV=production) each log line is a single JSON object.

Server binds to:

http://0.0.0.0:3001

Test All Features


1. Health Check

curl http://localhost:3001/health

Expected:

{
  "ok": true,
  "tools": 21,
  "redis": false,
  "uptime": 1.23
}

2. List Tools

curl http://localhost:3001/tools

Expected:

[
  "search_records",
  "show_record",
  "match_record",
  "get_births_years_ago",
  "get_births",
  "get_deaths",
  "get_marriages",
  "get_archives",
  "get_record_stats",
  "get_source_type_stats",
  "get_event_type_stats",
  "get_comment_stats",
  "get_family_name_stats",
  "get_first_name_stats",
  "get_profession_stats",
  "get_breakdown",
  "get_historical_weather",
  "get_census_data",
  "search_transcriptions",
  "browse_transcriptions",
  "show_transcription"
]

3. Tool Call

curl -X POST http://localhost:3001/tools/search_records \
-H "Content-Type: application/json" \
-d '{"name":"Coret"}'

4. Show a Single Record

curl -X POST http://localhost:3001/tools/show_record \
-H "Content-Type: application/json" \
-d '{"archive":"hua","identifier":"E13B9821-C0B0-4AED-B20B-8DE627ED99BD"}'

5. SSE Streaming

curl -N "http://localhost:3001/events/search_records?name=Coret"

Expected stream:

event: page
data: {...}

event: page
data: {...}

event: done
data: {}

6. Heartbeat Test

Leave SSE open for 15+ seconds — expect periodic keep-alive lines:

: heartbeat
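A client reading this stream has to separate named events from keep-alive comments. The following is a minimal sketch of a parser for text already received in full; parseSse and SseEvent are hypothetical names, and a real client would also need to buffer partial frames across network chunks (or use an existing EventSource implementation).

```typescript
interface SseEvent {
  event: string;
  data: string;
}

// Parse complete SSE text into events. Lines starting with ":" are comments
// (such as the heartbeat) and are skipped.
function parseSse(text: string): SseEvent[] {
  const events: SseEvent[] = [];
  for (const block of text.split(/\r?\n\r?\n/)) {
    let event = "message"; // default event name when no "event:" field is present
    const data: string[] = [];
    for (const line of block.split(/\r?\n/)) {
      if (line === "" || line.startsWith(":")) continue;
      const colon = line.indexOf(":");
      const field = colon === -1 ? line : line.slice(0, colon);
      let value = colon === -1 ? "" : line.slice(colon + 1);
      if (value.startsWith(" ")) value = value.slice(1);
      if (field === "event") event = value;
      else if (field === "data") data.push(value);
    }
    // Blocks without data (for example a lone heartbeat) produce no event.
    if (data.length > 0) events.push({ event, data: data.join("\n") });
  }
  return events;
}

const sample = 'event: page\ndata: {"page":1}\n\n: heartbeat\n\nevent: done\ndata: {}\n\n';
for (const e of parseSse(sample)) {
  console.log(e.event, JSON.parse(e.data));
}
```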

7. Chunked HTTP Streaming

curl -N -X POST http://localhost:3001/stream/search_records \
-H "Content-Type: application/json" \
-d '{"name":"Coret"}'

Expected (newline-delimited JSON):

{"query":{...},"response":{"number_found":...,"docs":[...]}}
{"query":{...},"response":{"number_found":...,"docs":[...]}}
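Because the body is newline-delimited JSON, a client can parse pages as they arrive instead of waiting for the whole response. A small sketch of an incremental decoder is shown below; NdjsonDecoder is a hypothetical helper, not part of the server.

```typescript
// Accumulates text chunks and emits one parsed value per complete line.
class NdjsonDecoder {
  private buffer = "";

  // Feed a chunk of text; returns every JSON value completed by this chunk.
  push(chunk: string): unknown[] {
    this.buffer += chunk;
    const lines = this.buffer.split("\n");
    // The last element is an incomplete line (or ""), so keep it for later.
    this.buffer = lines.pop() ?? "";
    return lines.filter((line) => line.trim() !== "").map((line) => JSON.parse(line));
  }

  // Call once the stream has ended to parse a final line without a trailing newline.
  flush(): unknown[] {
    const rest = this.buffer.trim();
    this.buffer = "";
    return rest === "" ? [] : [JSON.parse(rest)];
  }
}

const decoder = new NdjsonDecoder();
console.log(decoder.push('{"response":{"number_found":2,"docs":[1]}}\n{"respo'));
console.log(decoder.push('nse":{"number_found":2,"docs":[2]}}\n'));
```

The chunks would typically come from decoding the fetch response body with a TextDecoder in streaming mode.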

8. MCP Initialize

curl -X POST http://localhost:3001/ \
-H "Content-Type: application/json" \
-H "Accept: application/json, text/event-stream" \
-d '{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "initialize",
  "params": {
    "protocolVersion": "2025-03-26",
    "capabilities": {},
    "clientInfo": { "name": "test", "version": "1.0" }
  }
}'

9. MCP List Tools

curl -X POST http://localhost:3001/ \
-H "Content-Type: application/json" \
-H "Accept: application/json, text/event-stream" \
-d '{
  "jsonrpc": "2.0",
  "id": 2,
  "method": "tools/list"
}'

10. MCP Call Tool

curl -X POST http://localhost:3001/ \
-H "Content-Type: application/json" \
-H "Accept: application/json, text/event-stream" \
-d '{
  "jsonrpc": "2.0",
  "id": 3,
  "method": "tools/call",
  "params": {
    "name": "search_records",
    "arguments": { "name": "Coret" }
  }
}'
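The same call can be issued from TypeScript. The sketch below mirrors the curl request above using the global fetch available in Node.js 18+; buildToolCall and callTool are hypothetical helper names, and the response is returned as raw text because, depending on the Accept negotiation, the body may be plain JSON or an event stream.

```typescript
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

// Build the JSON-RPC envelope for an MCP tools/call request.
function buildToolCall(id: number, name: string, args: Record<string, unknown>): ToolCallRequest {
  return { jsonrpc: "2.0", id, method: "tools/call", params: { name, arguments: args } };
}

// Same headers as the curl examples above.
const MCP_HEADERS: Record<string, string> = {
  "Content-Type": "application/json",
  Accept: "application/json, text/event-stream",
};

// POST the request and return the raw response body.
async function callTool(baseUrl: string, name: string, args: Record<string, unknown>): Promise<string> {
  const res = await fetch(baseUrl, {
    method: "POST",
    headers: MCP_HEADERS,
    body: JSON.stringify(buildToolCall(1, name, args)),
  });
  if (!res.ok) throw new Error(`MCP call failed with HTTP ${res.status}`);
  return res.text();
}

console.log(JSON.stringify(buildToolCall(3, "search_records", { name: "Coret" })));
```

With a local server running, callTool("http://localhost:3001/", "search_records", { name: "Coret" }) sends the request shown in this step.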

Redis Testing

Start Redis

redis-server

Restart the MCP server. Expected in /health:

{ "redis": true }

Without Redis

Stop Redis and restart. Expected:

{ "redis": false }

Common Commands

Regenerate after API changes

npx tsx generate.ts

Restart server

npx tsx server.ts

Run tests

npm test

Tests cover the per-tool TTL strategy (cache-ttl.ts) using Node's built-in test runner — no extra dependencies. The coverage test asserts every tool emitted by generate.ts has an explicit TTL entry, so re-running npm run generate after an upstream OpenAPI change will surface any new tool that needs a TTL decision.


Troubleshooting

Generated files missing

npx tsx generate.ts

Port already in use

Linux / macOS:

lsof -i :3001
kill -9 <PID>

Windows:

netstat -ano | findstr :3001
taskkill /PID <PID> /F

Redis not connecting

Server runs normally without Redis. Check REDIS_URL in .env.

Rate limit errors (429)

The upstream API allows 4 req/s per IP. The built-in rate limiter queues requests automatically. If you are running multiple server instances, reduce RATE_LIMIT_RPS or use a shared queue.


Privacy Policy

This server is a thin proxy over the public Open Archives API. It does not require user authentication and does not collect personal data of its own.

The full privacy policies of the operators apply in addition to this section.

Data collection

  • Tool arguments (e.g. a search name, an archive code, a record identifier) are received from the MCP client.
  • HTTP request metadata — method, path, status code, latency, and the source IP — is observed by the reverse proxy in front of the hosted endpoint at mcp.openarchieven.nl.
  • No accounts, cookies, tokens, or session identifiers are collected. The server is anonymous-by-design.

Use and storage

  • Tool arguments are forwarded verbatim over HTTPS to https://api.openarchieven.nl/1.1 to fulfill the request, and the upstream response is returned to the caller.
  • Application logs (tool name, arguments, status, latency) are written to stdout via pino. On the hosted endpoint these logs are ephemeral: they are not written to disk and are lost on process restart. Set LOG_LEVEL=warn to suppress argument logging.
  • Cache (optional): when Redis is configured, upstream responses are cached under keys of the form mcp:<tool>:<sorted-params-json>. The cache contains response bodies only; no user identifiers are stored.
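A key of the form mcp:<tool>:<sorted-params-json> can be derived as in the sketch below. The cacheKey function is hypothetical, and details such as dropping undefined values are assumptions. Sorting the parameter names means two calls with the same arguments in a different order share one cache entry.

```typescript
// Build a deterministic cache key from a tool name and its arguments.
function cacheKey(tool: string, params: Record<string, unknown>): string {
  const sorted: Record<string, unknown> = {};
  for (const key of Object.keys(params).sort()) {
    // Skip undefined values so optional arguments left unset do not change the key.
    if (params[key] !== undefined) sorted[key] = params[key];
  }
  return `mcp:${tool}:${JSON.stringify(sorted)}`;
}

console.log(cacheKey("search_records", { number_show: 10, name: "Coret" }));
// prints: mcp:search_records:{"name":"Coret","number_show":10}
```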

Third-party sharing

No data is sent to any service other than the upstream Open Archives API listed above. There are no analytics, telemetry, advertising, or observability third parties involved.

Data retention

Data | Retention
Tool arguments and responses | Not persisted by the application
Application logs | Ephemeral (stdout, lost on restart)
Redis cache entries | Per-tool TTL (6 hours – 7 days; immutable historical lookups never expire). See Redis Cache.
Reverse-proxy access logs | Per the hosting provider's standard retention policy

Security

The MCP endpoint validates the Origin header on every request and rejects unknown browser origins (DNS-rebinding defense). All transport is over HTTPS.

External links surfaced to clients

Tool responses include URLs that point to record pages on https://www.openarchieven.nl. The submission declares the following allowed link URI so users are not prompted to confirm each link:

  • https://www.openarchieven.nl

Contact

For privacy questions or requests, contact:


Recommended Production Upgrades

  • HTTPS reverse proxy (nginx / caddy)
  • PM2 or systemd process manager
  • Structured JSON logging (pino / winston)
  • Request tracing (OpenTelemetry)
  • Auth middleware if server is public-facing
  • Shared Redis for multi-instance deployments

Version

v1.0

Schema-perfect OpenAPI-generated MCP server for Open Archives.
