QDrant Loader
A toolkit for loading data into the Qdrant vector database, supporting AI-powered development workflows.
📋 Changelog v0.8.0 - Latest improvements and bug fixes
A comprehensive toolkit for loading data into the Qdrant vector database, with advanced MCP server support for AI-powered development workflows.
🎯 What is QDrant Loader?
QDrant Loader is a data ingestion and retrieval system that collects content from multiple sources, processes and vectorizes it, then provides intelligent search capabilities through a Model Context Protocol (MCP) server for AI development tools.
Perfect for:
- 🤖 AI-powered development with Cursor, Windsurf, and other MCP-compatible tools
- 📚 Knowledge base creation from technical documentation
- 🔍 Intelligent code assistance with contextual information
- 🏢 Enterprise content integration from multiple data sources
📦 Packages
This monorepo contains three complementary packages:
🔄 QDrant Loader
Data ingestion and processing engine
Collects content from multiple sources, vectorizes it, and stores it in the Qdrant vector database.
Key Features:
- Multi-source connectors: Git, Confluence (Cloud & Data Center), JIRA (Cloud & Data Center), Public Docs, Local Files
- File conversion: PDF, Office docs (Word, Excel, PowerPoint), images, audio, EPUB, ZIP, and more using MarkItDown
- Smart chunking: Modular chunking strategies with intelligent document processing and hierarchical context
- Incremental updates: Change detection and efficient synchronization
- Multi-project support: Organize sources into projects with shared collections
- Provider-agnostic LLM: OpenAI, Azure OpenAI, Ollama, and custom endpoints with unified configuration
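The incremental-update feature listed above depends on knowing which documents changed since the last run. This README does not describe the mechanism, so the following is only a minimal sketch of one common approach, content hashing. The function names and the in-memory state dictionary are hypothetical and are not QDrant Loader's actual API.

```python
import hashlib


def content_hash(text: str) -> str:
    """Return a stable SHA-256 fingerprint of a document's text."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def detect_changes(previous: dict[str, str], current_docs: dict[str, str]):
    """Compare stored hashes against the current documents.

    previous:     doc id -> hash recorded during the last run (hypothetical state)
    current_docs: doc id -> current document text
    Returns (new, updated, deleted) lists of document ids, each sorted.
    """
    new, updated = [], []
    for doc_id, text in current_docs.items():
        if doc_id not in previous:
            new.append(doc_id)
        elif previous[doc_id] != content_hash(text):
            updated.append(doc_id)
    deleted = [doc_id for doc_id in previous if doc_id not in current_docs]
    return sorted(new), sorted(updated), sorted(deleted)


# State from an earlier run versus what the sources contain now.
state = {"a.md": content_hash("hello"), "b.md": content_hash("old text")}
docs = {"a.md": "hello", "b.md": "new text", "c.md": "brand new"}
print(detect_changes(state, docs))
```

Only documents in the new and updated lists would need to be re-chunked and re-embedded, which is what makes a repeated ingestion cheaper than a full reload.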
⚙️ QDrant Loader Core
Core library and LLM abstraction layer
Provides the foundational components and provider-agnostic LLM interface used by other packages.
Key Features:
- LLM Provider Abstraction: Unified interface for OpenAI, Azure OpenAI, Ollama, and custom endpoints
- Configuration Management: Centralized settings and validation for LLM providers
- Rate Limiting: Built-in rate limiting and request management
- Error Handling: Robust error handling and retry mechanisms
- Logging: Structured logging with configurable levels
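To illustrate what a retry mechanism of the kind listed above typically does, here is a minimal sketch of retrying with exponential backoff. It is an assumption about the general technique, not the core library's implementation; `retry_with_backoff` and `flaky_embedding_call` are hypothetical names.

```python
import time


def retry_with_backoff(func, max_attempts=3, base_delay=0.5, sleep=time.sleep):
    """Call func() until it succeeds or max_attempts is reached.

    The delay doubles after each failure: base_delay, 2 * base_delay, ...
    `sleep` is injectable so the example can run without real waiting.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return func()
        except Exception:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error
            sleep(base_delay * 2 ** (attempt - 1))


calls = {"count": 0}


def flaky_embedding_call():
    """Hypothetical provider call that fails twice, then succeeds."""
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("temporary failure")
    return [0.1, 0.2, 0.3]


delays = []  # record the requested delays instead of sleeping
result = retry_with_backoff(flaky_embedding_call, sleep=delays.append)
print(result, delays)
```

A production version would usually retry only on errors known to be transient (timeouts, rate-limit responses) and add random jitter to the delay.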
🔌 QDrant Loader MCP Server
AI development integration layer
Model Context Protocol server providing search capabilities to AI development tools.
Key Features:
- MCP Protocol 2025-06-18: Latest protocol compliance with dual transport support (stdio + HTTP)
- Advanced search tools: Semantic search, hierarchy-aware search, attachment discovery, and conflict detection
- Cross-document intelligence: Document similarity, clustering, relationship analysis, and knowledge graphs
- Streaming capabilities: Server-Sent Events (SSE) for real-time search results
- Production-ready: HTTP transport with security, session management, and health checks
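MCP clients talk to the server with JSON-RPC 2.0 messages, and tools are invoked through the `tools/call` method. The sketch below builds such a request. The tool name `search` and the argument names `query` and `limit` are placeholders chosen for illustration; the names this server actually exposes are not listed in this README.

```python
import json


def build_tool_call(request_id: int, tool_name: str, arguments: dict) -> str:
    """Serialize an MCP `tools/call` request as a JSON-RPC 2.0 message."""
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }
    return json.dumps(message)


# "search", "query" and "limit" are hypothetical names, used only as an example.
payload = build_tool_call(1, "search", {"query": "authentication", "limit": 5})
print(payload)
```

In normal use an MCP-compatible tool such as Cursor generates these messages itself, so a sketch like this is mainly useful when debugging the server by hand.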
🚀 Quick Start
Installation
# Install both packages
pip install qdrant-loader qdrant-loader-mcp-server
# Or install individually
pip install qdrant-loader # Data ingestion only
pip install qdrant-loader-mcp-server # MCP server only
5-Minute Setup
- Create a workspace
  mkdir my-workspace && cd my-workspace
- Initialize workspace with templates
  qdrant-loader init --workspace .
- Configure your environment (edit .env)
  # Qdrant connection
  QDRANT_URL=http://localhost:6333
  QDRANT_COLLECTION_NAME=my_docs
  # LLM provider (new unified configuration)
  OPENAI_API_KEY=your_openai_key
  LLM_PROVIDER=openai
  LLM_BASE_URL=https://api.openai.com/v1
  LLM_EMBEDDING_MODEL=text-embedding-3-small
  LLM_CHAT_MODEL=gpt-4o-mini
- Configure data sources (edit config.yaml)
  global:
    qdrant:
      url: "http://localhost:6333"
      collection_name: "my_docs"
    llm:
      provider: "openai"
      base_url: "https://api.openai.com/v1"
      api_key: "${OPENAI_API_KEY}"
      models:
        embeddings: "text-embedding-3-small"
        chat: "gpt-4o-mini"
      embeddings:
        vector_size: 1536
  projects:
    my-project:
      project_id: "my-project"
      sources:
        git:
          docs-repo:
            base_url: "https://github.com/your-org/your-repo.git"
            branch: "main"
            file_types: ["*.md", "*.rst"]
- Load your data
  qdrant-loader ingest --workspace .
- Start the MCP server
  mcp-qdrant-loader --env /path/to/your/.env
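The config.yaml in the setup steps refers to `${OPENAI_API_KEY}`, a value defined in .env. How the loader resolves such placeholders internally is not documented here, so the following is only a sketch of the general idea: read simple KEY=VALUE lines, then substitute `${NAME}` references. `parse_env` and `expand` are hypothetical helper names.

```python
import re

# Matches ${NAME} where NAME is a shell-style identifier.
PLACEHOLDER = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def parse_env(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    env = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        env[key.strip()] = value.strip()
    return env


def expand(text: str, env: dict[str, str]) -> str:
    """Replace each ${NAME} with env[NAME]; raise if a variable is missing."""
    def substitute(match):
        name = match.group(1)
        if name not in env:
            raise KeyError(f"missing environment variable: {name}")
        return env[name]
    return PLACEHOLDER.sub(substitute, text)


env = parse_env("# LLM provider\nOPENAI_API_KEY=your_openai_key\nLLM_PROVIDER=openai\n")
print(expand('api_key: "${OPENAI_API_KEY}"', env))
```

Failing loudly on a missing variable, as this sketch does, tends to be easier to debug than silently sending an empty API key to the provider.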
🔧 Integration with Cursor
Add to your Cursor settings (.cursor/mcp.json):
{
  "mcpServers": {
    "qdrant-loader": {
      "command": "/path/to/venv/bin/mcp-qdrant-loader",
      "env": {
        "QDRANT_URL": "http://localhost:6333",
        "QDRANT_COLLECTION_NAME": "my_docs",
        "OPENAI_API_KEY": "your_key"
      }
    }
  }
}
Alternative: Use configuration file (recommended for complex setups):
{
  "mcpServers": {
    "qdrant-loader": {
      "command": "/path/to/venv/bin/mcp-qdrant-loader",
      "args": [
        "--config",
        "/path/to/your/config.yaml",
        "--env",
        "/path/to/your/.env"
      ]
    }
  }
}
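If you manage several workspaces, the file-based configuration can be generated instead of edited by hand. The sketch below builds the same structure as the JSON above; `cursor_mcp_config` is a hypothetical helper, and the paths are the placeholders from this README, to be replaced with your own.

```python
import json


def cursor_mcp_config(command: str, config_path: str, env_path: str) -> dict:
    """Build the `mcpServers` structure for the file-based setup."""
    return {
        "mcpServers": {
            "qdrant-loader": {
                "command": command,
                "args": ["--config", config_path, "--env", env_path],
            }
        }
    }


config = cursor_mcp_config(
    "/path/to/venv/bin/mcp-qdrant-loader",
    "/path/to/your/config.yaml",
    "/path/to/your/.env",
)
print(json.dumps(config, indent=2))
```

Redirecting the printed output into .cursor/mcp.json yields a file equivalent to the hand-written one.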
Example queries in Cursor:
- "Find documentation about authentication in our API"
- "Show me examples of error handling patterns"
- "What are the deployment requirements for this service?"
- "Find all attachments related to database schema"
📚 Documentation
🚀 Getting Started
- Installation Guide - Complete setup instructions
- Quick Start - Step-by-step tutorial
- Core Concepts - Covered inline in Getting Started
👥 User Guides
- Configuration - Complete configuration reference
- Data Sources - Git, Confluence, JIRA setup
- File Conversion - File processing capabilities
- MCP Server - AI tool integration
⚠️ Migration Guide (v0.7.1+)
LLM Configuration Migration Required
- New unified configuration: global.llm.* replaces legacy global.embedding.* and file_conversion.markitdown.*
- Provider-agnostic: Now supports OpenAI, Azure OpenAI, Ollama, and custom endpoints
- Legacy support: Old configuration still works but shows deprecation warnings
- Action required: Update your config.yaml to use the new syntax (see examples above)
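Before migrating, it can help to check whether a configuration still contains the legacy sections. The sketch below walks an already-parsed config dictionary and reports them. The two paths are taken literally from the migration notes above (adjust them if `file_conversion` is nested elsewhere in your file), and `find_legacy_llm_keys` is a hypothetical helper, not part of the QDrant Loader CLI.

```python
def find_legacy_llm_keys(config: dict) -> list[str]:
    """Return dotted paths of legacy LLM sections present in a parsed config."""
    legacy_paths = [("global", "embedding"), ("file_conversion", "markitdown")]
    found = []
    for path in legacy_paths:
        node = config
        for key in path:
            if not isinstance(node, dict) or key not in node:
                break  # this legacy path is absent
            node = node[key]
        else:
            # Loop finished without break: every key on the path exists.
            found.append(".".join(path))
    return found


old_style = {"global": {"embedding": {"model": "text-embedding-3-small"}}}
new_style = {"global": {"llm": {"provider": "openai"}}}
print(find_legacy_llm_keys(old_style))
print(find_legacy_llm_keys(new_style))
```

An empty result suggests the file already uses the unified global.llm.* layout.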
Migration Resources
- Configuration File Reference - Complete new schema
- Environment Variables - Updated variable names
🛠️ Developer Resources
- Architecture - System design overview
- Testing - Testing guide and best practices
- Contributing - Development setup and guidelines
🤝 Contributing
We welcome contributions! See our Contributing Guide for:
- Development environment setup
- Code style and standards
- Pull request process
Quick Development Setup
# Clone and setup
git clone https://github.com/martin-papy/qdrant-loader.git
cd qdrant-loader
python -m venv venv
source venv/bin/activate
# Install packages in development mode
pip install -e ".[dev]"
pip install -e "packages/qdrant-loader-core[dev,openai,ollama]"
pip install -e "packages/qdrant-loader[dev]"
pip install -e "packages/qdrant-loader-mcp-server[dev]"
📄 License
This project is licensed under the GNU GPLv3 - see the LICENSE file for details.
Ready to get started? Check out our Quick Start Guide or browse the complete documentation.