LanceDB
Interact with on-disk documents using agentic RAG and hybrid search via LanceDB.
LanceDB MCP Server for LLMs
A Model Context Protocol (MCP) server that enables LLMs to interact directly with the documents they have on disk through agentic RAG and hybrid search in LanceDB. Ask LLMs questions about the dataset as a whole or about specific documents.
Features
- LanceDB-powered serverless vector index and document summary catalog.
- Efficient use of LLM tokens. The LLM itself looks up what it needs, when it needs it.
- Security. The index is stored locally, so no data is transferred to the cloud when using a local LLM.
Quick Start
To get started, create a local directory to store the index and add this configuration to your Claude Desktop config file:
macOS: ~/Library/Application\ Support/Claude/claude_desktop_config.json
Windows: %APPDATA%/Claude/claude_desktop_config.json
{
"mcpServers": {
"lancedb": {
"command": "npx",
"args": [
"lance-mcp",
"PATH_TO_LOCAL_INDEX_DIR"
]
}
}
}
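If the Claude Desktop config file already lists other servers, the new entry has to be merged in rather than pasted over them. The following is a minimal sketch of that merge, assuming the config is already loaded as a plain object. `addLanceServer` and the type names are hypothetical helpers for illustration and are not part of lance-mcp.

```typescript
// Sketch only: merge a "lancedb" entry into an existing Claude Desktop config.
// `addLanceServer`, `McpServerEntry` and `ClaudeDesktopConfig` are hypothetical names.

interface McpServerEntry {
  command: string;
  args: string[];
}

interface ClaudeDesktopConfig {
  mcpServers?: Record<string, McpServerEntry>;
  [key: string]: unknown;
}

function addLanceServer(
  config: ClaudeDesktopConfig,
  indexDir: string,
): ClaudeDesktopConfig {
  return {
    ...config,
    mcpServers: {
      // Keep any servers that are already configured.
      ...(config.mcpServers ?? {}),
      // Same command and args as in the Quick Start snippet.
      lancedb: { command: "npx", args: ["lance-mcp", indexDir] },
    },
  };
}

// Example: a config that already registers one other (made-up) server.
const existing: ClaudeDesktopConfig = {
  mcpServers: { other: { command: "node", args: ["other.js"] } },
};
const merged = addLanceServer(existing, "PATH_TO_LOCAL_INDEX_DIR");
console.log(JSON.stringify(merged, null, 2));
```

The function returns a new object instead of mutating its input, so the original config can be kept as a backup before the result is written to disk.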
Prerequisites
- Node.js 18+
- npx
- MCP Client (Claude Desktop App for example)
- Summarization and embedding models installed (see config.ts - by default we use Ollama models)
ollama pull snowflake-arctic-embed2
ollama pull llama3.1:8b
Local Development Mode:
{
"mcpServers": {
"lancedb": {
"command": "node",
"args": [
"PATH_TO_LANCE_MCP/dist/index.js",
"PATH_TO_LOCAL_INDEX_DIR"
]
}
}
}
Use npm run build to build the project.
Use npx @modelcontextprotocol/inspector dist/index.js PATH_TO_LOCAL_INDEX_DIR to run the MCP tool inspector.
Seed Data
The seed script creates two tables in LanceDB: one for the catalog of document summaries and another for the vectorized document chunks. To run the seed script, use the following command:
npm run seed -- --dbpath <PATH_TO_LOCAL_INDEX_DIR> --filesdir <PATH_TO_DOCS>
You can use sample data from the docs/ directory. Feel free to adjust the default summarization and embedding models in the config.ts file. If you need to recreate the index, simply rerun the seed script with the --overwrite option.
Catalog
- Document summary
- Metadata
Chunks
- Vectorized document chunk
- Metadata
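The two-table layout above can be pictured with the record shapes below. This is a sketch under stated assumptions: the field names (`source`, `summary`, `text`, `vector`) are hypothetical, and the fixed-size character window in `splitIntoChunks` is an illustrative stand-in, not necessarily the splitter the seed script uses.

```typescript
// Hypothetical record shapes; the real column names are defined by the seed script.
interface CatalogRecord {
  source: string;  // path of the original document (assumed metadata field)
  summary: string; // summary produced by the summarization model
}

interface ChunkRecord {
  source: string;   // links the chunk back to its catalog entry (assumed)
  text: string;     // raw chunk text
  vector: number[]; // embedding produced by the embedding model
}

// Illustrative chunker: fixed-size character windows with overlap.
function splitIntoChunks(text: string, size: number, overlap: number): string[] {
  if (size <= 0 || overlap < 0 || overlap >= size) {
    throw new Error("size must be positive and overlap must be in [0, size)");
  }
  const chunks: string[] = [];
  for (let start = 0; start < text.length; start += size - overlap) {
    chunks.push(text.slice(start, start + size));
    // Stop once the current window reaches the end of the text.
    if (start + size >= text.length) break;
  }
  return chunks;
}

const sampleChunks = splitIntoChunks("abcdefghij", 4, 1);
console.log(sampleChunks); // [ 'abcd', 'defg', 'ghij' ]
```

Overlapping windows keep a sentence that straddles a boundary retrievable from at least one chunk, at the cost of storing some text twice.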
Example Prompts
Try these prompts with Claude to explore the functionality:
"What documents do we have in the catalog?"
"Why is the US healthcare system so broken?"
Available Tools
The server provides these tools for interaction with the index:
Catalog Tools
- catalog_search: Search for relevant documents in the catalog
Chunks Tools
- chunks_search: Find relevant chunks based on a specific document from the catalog
- all_chunks_search: Find relevant chunks from all known documents
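An MCP client invokes these tools with JSON-RPC `tools/call` requests. The sketch below builds the two requests an agent might send in sequence: first a catalog lookup, then a chunk search scoped to one document. The tool names come from the list above, but the argument names (`text`, `source`) and the document path are assumptions made for illustration. Check the tool schemas in the MCP inspector for the real ones.

```typescript
// Shape of an MCP tools/call request (JSON-RPC 2.0).
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

let nextId = 1;

// Hypothetical helper that wraps a tool name and arguments in a request envelope.
function toolCall(name: string, args: Record<string, unknown>): ToolCallRequest {
  return {
    jsonrpc: "2.0",
    id: nextId++,
    method: "tools/call",
    params: { name, arguments: args },
  };
}

// Step 1: find candidate documents. The `text` argument name is an assumption.
const step1 = toolCall("catalog_search", { text: "US healthcare system" });

// Step 2: search chunks of one document returned by step 1.
// `source` and the path "docs/example.pdf" are hypothetical.
const step2 = toolCall("chunks_search", {
  text: "causes of high costs",
  source: "docs/example.pdf",
});

console.log(JSON.stringify([step1, step2], null, 2));
```

In practice the LLM decides when to issue each call. The two-step pattern simply mirrors how the catalog and chunk tools are meant to complement each other.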
License
This project is licensed under the MIT License - see the LICENSE file for details.
Related Servers
Database Updater
Update various databases (PostgreSQL, MySQL, MongoDB, SQLite) using data from CSV and Excel files.
Stellar MCP
Interact with the Stellar blockchain, manage accounts, and execute smart contracts on Stellar Classic and Soroban.
Bitable
Interact with Lark Bitable tables and data using the Model Context Protocol.
Postgres MCP Pro
An MCP server for PostgreSQL providing index tuning, explain plans, health checks, and safe SQL execution.
PostgreSQL MCP Server
An MCP server that provides tools to interact with PostgreSQL databases.
GigAPI Timeseries Lake
An MCP server for GigAPI Timeseries Lake, enabling seamless integration with MCP-compatible clients.
Airtable
Interact with Airtable's API to manage bases, tables, and records.
Memory-Plus
A lightweight, local RAG memory store to record, retrieve, update, delete, and visualize persistent "memories" across sessions. Perfect for developers working with multiple AI coders (like Windsurf, Cursor, or Copilot) or anyone who wants their AI to actually remember them.
FinanceMCP
Provides real-time financial data using the Tushare API.
Enhanced Medication Information MCP Server
Provides real-time access to FDA drug data, including shortages, labeling, and recalls, via the openFDA API.