Insights Knowledge Base
A free, plug-and-play knowledge base with over 10,000 built-in insight reports and support for parsing private documents.
Insights Knowledge Base (IKB) MCP Server
🍭 A free, plug-and-play knowledge base with 10,000+ built-in high-quality insight reports, packaged as an MCP Server with secure local data storage.
⚠️⚠️ All collected reports in this project come from free resources on official research report websites. ⚠️⚠️
Features
- 🍾 Zero configuration required, designed for plug-and-play usage.
- 🚀 Built-in Qwen3-Embedding-0.6B embedding model: related reports can be retrieved through vector search. 📢 Report details can also be searched via keyword retrieval.
- 🍥 Over 100 insight reports from well-known consulting firms such as McKinsey, PwC, and BAIN have been collected, comprising 6,000+ report pages and covering 70+ topics.
- 💎 Real-time online browsing of full reports in MCP Client.
- 🎉 Ultra-fast response: function calls typically return in under 1 second, and keyword-based queries in under 150 ms.
- 🎨 Paste private local documents into the library_files folder (create it manually if absent; name must match). Configure VLM models/parameters in .env (e.g., VLM_MODEL_NAME=qwen2.5-vl-72b-instruct) for local document extraction, parsing, and recognition.
- 🦉 Permanently free—no wasted effort collecting reports. Share reliable, copyright-compliant resources via issues.
- 🔔 Commit to weekly report updates; bug fixes depend on personal whim (I'm not an engineer 🤭).
Optimizations as of June 30
- Added 2000+ report pages.
Future Directions
- Continuous report updates.
- Prompt engineering optimization.
Newest Files Profile
{
"statistics": {
"total_files": 174,
"total_pages": 9320,
"unique_publishers": 9,
"unique_topics": 93,
"last_updated": "2025-06-30T10:08:35.928329"
},
"details": {
"publishers": [
"",
"Accenture",
"BAIN",
"BCG",
"CBS",
"Deloite",
"McKinsey",
"PWC",
"亿欧"
],
"topics": [
"",
"AI",
"AI Agent",
"Africa",
"Aftermarket",
"Asian American",
"Auto",
"Aviation",
"Beauty",
"Business",
"Chemical industry",
"Chemicals",
"Chinese banking",
"Chinese securities",
"Consumer Goods",
"Decarbonation",
"Decarbonization",
"Digital",
"ESG",
"Economy",
"Economy and Trade",
"Education",
"Electric two wheelers",
"Employment",
"Energy",
"Europe",
"FMCG",
"Fashion",
"Finance",
"Financial Technology",
"Financial service",
"Fintech",
"Food-meatless",
"Gen Z",
"Global banking",
"Global energy",
"Global insurance",
"Global macroeconomic",
"Global materials",
"Global private market",
"Global private markets",
"Global trade",
"Grocery",
"Grocery retail",
"Health",
"Healthcare",
"Human capital",
"Hydrogen",
"Insurance",
"Investing",
"Investment management",
"Labor market",
"Latinos",
"Low-altitude Economy",
"Luxury Goods",
"Luxury goods",
"M&A",
"Maritime",
"Media",
"Medical Health",
"Medtech",
"Net zero",
"New Energy Vehicle",
"New era",
"Packing",
"Payments",
"Pet Food",
"Population",
"Power",
"Private Equity",
"Private market",
"Productivity",
"Quantum",
"Real estate",
"Retail",
"Retail Digitalization",
"Retailers",
"Risk",
"Small business",
"Smart Home",
"Smart hospital",
"Sporting goods",
"Sustainability",
"Sustainable",
"Tax-free",
"Technology",
"Travel",
"Truck",
"United Kingdom",
"VSOC",
"Wealth management",
"Workplace",
"连锁经营"
]
}
}
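Some topic labels in the profile above differ only in capitalization or a trailing "s" (for example "Luxury Goods" / "Luxury goods" and "Global private market" / "Global private markets"). The following is a minimal sketch for spotting such near-duplicates, assuming the topic list has been loaded from the JSON shown above; the helper name `group_similar_topics` is hypothetical and not part of the project.

```python
from collections import defaultdict


def group_similar_topics(topics):
    """Group topic labels that differ only by case or a single trailing 's'."""
    groups = defaultdict(list)
    for topic in topics:
        if not topic:  # skip the empty label that appears in the profile
            continue
        key = topic.casefold().removesuffix("s")
        groups[key].append(topic)
    # Keep only normalized keys that have more than one spelling.
    return {key: names for key, names in groups.items() if len(names) > 1}


# A small excerpt of the topic list from the profile above.
sample_topics = [
    "",
    "AI",
    "Global private market",
    "Global private markets",
    "Luxury Goods",
    "Luxury goods",
]
duplicates = group_similar_topics(sample_topics)
for key, names in duplicates.items():
    print(key, "->", names)
```

This only catches the simplest variations; labels such as "Decarbonation" and "Decarbonization" would need a fuzzier comparison.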
Installation (Beginner-Friendly)
💡Pro tip: Stuck? Drag this page to an LLM client (like DeepSeek) for step-by-step guidance. Actually, these instructions were written by DeepSeek too...
Prerequisites: Python 3.12+ (download it from the official website and add it to your PATH environment variable)
Install UV:
pip install uv
1. Clone the project (make sure Git and Git LFS are installed first)
git clone https://github.com/v587d/InsightsLibrary.git
cd InsightsLibrary
git lfs pull
2. Create virtual environment
uv venv .venv # Create dedicated virtual environment
# Activate environment
# Windows:
.\.venv\Scripts\activate
# Mac/Linux:
source .venv/bin/activate
3. Install core dependencies
uv pip install . # Note the trailing dot indicating current directory
4. Create environment variables (for future needs)
notepad .env # Windows
# Or
nano .env # Mac/Linux
5. Configure MCP Server
- VSCode.Cline
Note: Replace <Your Project Root Directory!!!> with the actual root directory.
{
"mcpServers": {
"ikb-mcp-server": {
"command": "uv",
"args": [
"--directory",
"<Your Project Root Directory!!!>",
"run",
"ikb_mcp_server.py"
]
}
}
}
- Cherry Studio
  - Command: uv
  - Arguments:
--directory
<Your Project Root Directory!!!>
run
ikb_mcp_server.py
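To avoid typing the project root by hand, the configuration shown above can be generated with a short script. This is a sketch only: the function name `build_mcp_config` is hypothetical, and it assumes the script is run from (or pointed at) the cloned InsightsLibrary directory.

```python
import json
from pathlib import Path


def build_mcp_config(project_root):
    """Return the MCP client config with an absolute project root filled in."""
    root = str(Path(project_root).expanduser().resolve())
    return {
        "mcpServers": {
            "ikb-mcp-server": {
                "command": "uv",
                "args": ["--directory", root, "run", "ikb_mcp_server.py"],
            }
        }
    }


# "." assumes the current working directory is the project root.
config = build_mcp_config(".")
print(json.dumps(config, indent=2))
```

The printed JSON can be pasted into the Cline settings; for Cherry Studio, the entries of `args` go into the Arguments field one per line.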
Adding Private Documents to ikb_mcp_server
- Configure VLM models and parameters in .env:
VLM_API_KEY=<API Key>
VLM_BASE_URL=<Base URL> # https://openrouter.ai/api/v1
VLM_MODEL_NAME=<Model Name> # qwen/qwen2.5-vl-72b-instruct:free
- Upload the PDF document to the library_files folder under the project root directory.
- Manually run main.py.
# Navigate to the project root directory
# Activate the virtual environment
uv run main.py
(InsightsLibrary) PS D:\Projects\mcp\InsightsLibrary> uv run main.py
[INFO] extractor: PDF extraction initialized | Files directory: library_files | Pages directory: library_pages
[INFO] extractor: Starting scan of directory: library_files
[INFO] extractor: Found 69 PDF files
[INFO] extractor: Scan completed | Total files: 69 | Processed: 0 | Failed: 0
[INFO] recognizer: No pages to process.
# Data has been updated to the database
============================================================
Confirm if you need to create text vector embeddings
⚠️ This process may take approximately 20 minutes
============================================================
Create embeddings? (Enter Y or N):
# Y: create text vector embeddings
# N: Skip text vector embeddings and exit program
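Before running main.py, it may help to confirm that the three VLM settings described above are present in .env. The sketch below uses only the standard library and a simplified parser; how the project itself loads .env is not stated here, so treat `parse_env` and `missing_vlm_keys` as hypothetical helpers rather than project code.

```python
REQUIRED_KEYS = ("VLM_API_KEY", "VLM_BASE_URL", "VLM_MODEL_NAME")


def parse_env(text):
    """Parse KEY=VALUE lines, ignoring blank lines and '#' comment lines."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop a trailing ' # comment', as used in the .env example above.
        value = value.split(" #", 1)[0].strip()
        values[key.strip()] = value
    return values


def missing_vlm_keys(text):
    """Return the required VLM keys that are absent or empty."""
    values = parse_env(text)
    return [key for key in REQUIRED_KEYS if not values.get(key)]


# Placeholder content; a real check would read Path(".env").read_text().
example_env = """VLM_API_KEY=sk-placeholder
VLM_BASE_URL=https://openrouter.ai/api/v1 # provider endpoint
"""
print(missing_vlm_keys(example_env))
```

For the placeholder content above, the only key reported as missing is VLM_MODEL_NAME.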
License
This project is licensed under the MIT License. See the LICENSE file for details.
Optimization Updates as of June 17th
- 💡 Optimized models.py: Improved data query efficiency by 1,000%
- 💡 Optimized extractor.py: Slightly enhanced PDF extraction efficiency
- 💡 Optimized recognizer.py: Boosted image comprehension efficiency by 50%
- 💡 Optimized ikb_mcp_server.py:
  - Added pagination functionality
  - Displayed local paths of referenced files
- 💡 Added MIT License (https://github.com/v587d/InsightsLibrary/pull/1#issuecomment-2969226661)
- 📦 Overall compressed project package size reduced by approximately 50%
- 💡 Streamlined private document handling
- 💡Fixed other identified bugs
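The pagination mentioned in the list above can be pictured with a generic sketch like the one below. It is not taken from ikb_mcp_server.py; the function name `paginate`, its parameters, and the returned field names are assumptions chosen for illustration.

```python
import math


def paginate(items, page=1, page_size=10):
    """Return one page of results plus basic paging metadata."""
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must be positive integers")
    total_pages = max(1, math.ceil(len(items) / page_size))
    start = (page - 1) * page_size
    return {
        "page": page,
        "total_pages": total_pages,
        "items": items[start:start + page_size],
    }


# 25 hypothetical search hits, requested 10 at a time.
hits = [f"report-{i}" for i in range(25)]
last_page = paginate(hits, page=3, page_size=10)
print(last_page["total_pages"], last_page["items"])
```

Returning the total page count alongside each page lets an MCP client decide whether to request more results.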
Optimizations as of June 22
- Added embedder.py: Implements text vectorization indexing via the local Qwen3-Embedding-0.6B model, stored in faiss_index.
- Modified main.py: Closed-loop workflow PDFExtractor → IMGRecognizer → Embedder (optional).
- New @mcp.tool(): get_similar_content_by_rag: Finds the most similar document content via vector similarity (RAG).
- All admin-uploaded reports now support online viewing → Removed library_files folder to reduce project size.
- Added 2000+ report pages.
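The idea behind retrieval by vector similarity, as used by get_similar_content_by_rag, can be illustrated with a toy example. The sketch below is an assumption-laden stand-in: it uses hand-written 3-dimensional vectors and a brute-force cosine comparison in pure Python, whereas the project embeds text with Qwen3-Embedding-0.6B and stores the vectors in a FAISS index. The names `cosine_similarity`, `most_similar`, and `toy_index` are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def most_similar(query_vec, indexed, top_k=2):
    """Return the top_k (score, text) pairs, highest similarity first."""
    scored = [(cosine_similarity(query_vec, vec), text) for text, vec in indexed]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]


# Toy "embeddings"; real ones would come from the embedding model.
toy_index = [
    ("hydrogen outlook", [0.9, 0.1, 0.0]),
    ("luxury goods", [0.0, 0.2, 0.9]),
    ("net zero pathways", [0.8, 0.3, 0.1]),
]
results = most_similar([1.0, 0.2, 0.0], toy_index)
for score, text in results:
    print(f"{score:.3f}  {text}")
```

With these toy vectors, the two energy-related entries rank above "luxury goods". A vector index such as FAISS serves the same purpose at scale without comparing the query against every stored vector one by one in Python.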