Insights Knowledge Base
A free, plug-and-play knowledge base with over 10,000 built-in insight reports and support for parsing private documents.
Insights Knowledge Base (IKB) MCP Server
🍭 A free, plug-and-play knowledge base with 10,000+ built-in high-quality insights reports, packaged as an MCP Server with secure local data storage.
⚠️⚠️ All collected reports in this project come from free resources on official research report websites. ⚠️⚠️
Features
- 🍾 Zero configuration required, designed for plug-and-play usage.
- 🚀 Built-in Qwen3-Embedding-0.6B embedding model: related reports can be retrieved through vector search. 📢 Report details can also be searched via keyword retrieval.
- 🍥 Over 100 insights reports from well-known consulting firms such as McKinsey, PwC, and BAIN have been collected, including 6,000+ report pages, covering 70+ topics.
- 💎 Real-time online browsing of full reports in MCP Client.
- 🎉 Ultra-fast response: function calls typically return in under 1 second, and keyword-based queries in under 150 ms.
- 🎨 Paste private local documents into the library_files folder (create it manually if absent; name must match). Configure VLM models/parameters in .env (e.g., VLM_MODEL_NAME=qwen2.5-vl-72b-instruct) for local document extraction, parsing, and recognition.
- 🦉 Permanently free—no wasted effort collecting reports. Share reliable, copyright-compliant resources via issues.
- 🔔 Commit to weekly report updates; bug fixes depend on personal whim (I'm not an engineer 🤭).
Optimizations as of June 30
- Added 2000+ report pages.
Future Directions
- Continuous report updates.
- Prompt engineering optimization.
Newest Files Profile
{
"statistics": {
"total_files": 174,
"total_pages": 9320,
"unique_publishers": 9,
"unique_topics": 93,
"last_updated": "2025-06-30T10:08:35.928329"
},
"details": {
"publishers": [
"",
"Accenture",
"BAIN",
"BCG",
"CBS",
"Deloite",
"McKinsey",
"PWC",
"亿欧"
],
"topics": [
"",
"AI",
"AI Agent",
"Africa",
"Aftermarket",
"Asian American",
"Auto",
"Aviation",
"Beauty",
"Business",
"Chemical industry",
"Chemicals",
"Chinese banking",
"Chinese securities",
"Consumer Goods",
"Decarbonation",
"Decarbonization",
"Digital",
"ESG",
"Economy",
"Economy and Trade",
"Education",
"Electric two wheelers",
"Employment",
"Energy",
"Europe",
"FMCG",
"Fashion",
"Finance",
"Financial Technology",
"Financial service",
"Fintech",
"Food-meatless",
"Gen Z",
"Global banking",
"Global energy",
"Global insurance",
"Global macroeconomic",
"Global materials",
"Global private market",
"Global private markets",
"Global trade",
"Grocery",
"Grocery retail",
"Health",
"Healthcare",
"Human capital",
"Hydrogen",
"Insurance",
"Investing",
"Investment management",
"Labor market",
"Latinos",
"Low-altitude Economy",
"Luxury Goods",
"Luxury goods",
"M&A",
"Maritime",
"Media",
"Medical Health",
"Medtech",
"Net zero",
"New Energy Vehicle",
"New era",
"Packing",
"Payments",
"Pet Food",
"Population",
"Power",
"Private Equity",
"Private market",
"Productivity",
"Quantum",
"Real estate",
"Retail",
"Retail Digitalization",
"Retailers",
"Risk",
"Small business",
"Smart Home",
"Smart hospital",
"Sporting goods",
"Sustainability",
"Sustainable",
"Tax-free",
"Technology",
"Travel",
"Truck",
"United Kingdom",
"VSOC",
"Wealth management",
"Workplace",
"连锁经营"
]
}
}
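A profile like the one above can be summarized with a few lines of Python. This is only a sketch: the JSON is a trimmed copy of the profile, and the `summarize` helper and its output keys are hypothetical names, not part of the project. Note that the publisher list contains an empty string, which appears to be counted in `unique_publishers`.

```python
import json

# Trimmed copy of the profile above; a real run would load the full JSON.
profile_text = """
{
  "statistics": {"total_files": 174, "total_pages": 9320},
  "details": {"publishers": ["", "Accenture", "BAIN", "BCG", "CBS",
                             "Deloite", "McKinsey", "PWC", "亿欧"]}
}
"""


def summarize(profile: dict) -> dict:
    """Hypothetical helper: derive two simple figures from the profile."""
    stats = profile["statistics"]
    publishers = profile["details"]["publishers"]
    return {
        # Average report length in pages, rounded to one decimal place.
        "avg_pages_per_file": round(stats["total_pages"] / stats["total_files"], 1),
        # Count only non-empty publisher names.
        "named_publishers": len([p for p in publishers if p]),
    }


print(summarize(json.loads(profile_text)))
# → {'avg_pages_per_file': 53.6, 'named_publishers': 8}
```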
Installation (Beginner-Friendly)
💡Pro tip: Stuck? Drag this page to an LLM client (like DeepSeek) for step-by-step guidance. Actually, these instructions were written by DeepSeek too...
Prerequisites: Python 3.12+ (download it from the official website and add it to your PATH environment variable)
Install UV:
pip install uv
1. Clone the project (make sure Git and Git LFS are installed first)
git clone https://github.com/v587d/InsightsLibrary.git
cd InsightsLibrary
git lfs pull
2. Create virtual environment
uv venv .venv # Create dedicated virtual environment
# Activate environment
# Windows:
.\.venv\Scripts\activate
# Mac/Linux:
source .venv/bin/activate
3. Install core dependencies
uv pip install . # Note the trailing dot, which refers to the current directory
4. Create environment variables (for future needs)
notepad .env # Windows
# Or
nano .env # Mac/Linux
5. Configure MCP Server
- VSCode.Cline
Note: Replace <Your Project Root Directory!!!> with the actual project root directory.
{
"mcpServers": {
"ikb-mcp-server": {
"command": "uv",
"args": [
"--directory",
"<Your Project Root Directory!!!>",
"run",
"ikb_mcp_server.py"
]
}
}
}
- Cherry Studio
  - Command: uv
  - Arguments:
    --directory
    <Your Project Root Directory!!!>
    run
    ikb_mcp_server.py
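Typing the project root into the Cline JSON by hand is error-prone on Windows, because backslashes in a path must be escaped inside JSON strings. A minimal sketch that prints the same configuration with the path filled in is shown below; `build_mcp_config` is a hypothetical helper name, and the script assumes it is run from the project root.

```python
import json
from pathlib import Path


def build_mcp_config(project_root: str) -> str:
    """Return the mcpServers JSON snippet for the given project root."""
    config = {
        "mcpServers": {
            "ikb-mcp-server": {
                "command": "uv",
                "args": ["--directory", project_root, "run", "ikb_mcp_server.py"],
            }
        }
    }
    # json.dumps escapes backslashes, so Windows paths stay valid JSON.
    return json.dumps(config, indent=2)


if __name__ == "__main__":
    # Assumption: the current working directory is the project root.
    print(build_mcp_config(str(Path.cwd())))
```

The printed JSON can be pasted into the Cline MCP settings; for Cherry Studio, only the resolved path is needed for the Arguments field.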
Adding Private Documents to ikb_mcp_server
- Configure VLM models and parameters in .env:
VLM_API_KEY=<API Key>
VLM_BASE_URL=<Base URL> # https://openrouter.ai/api/v1
VLM_MODEL_NAME=<Model Name> # qwen/qwen2.5-vl-72b-instruct:free
- Upload the PDF document to the library_files folder under the project root directory.
- Manually run main.py.
# Navigate to the project root directory
# Activate the virtual environment
uv run main.py
(InsightsLibrary) PS D:\Projects\mcp\InsightsLibrary> uv run main.py
[INFO] extractor: PDF extraction initialized | Files directory: library_files | Pages directory: library_pages
[INFO] extractor: Starting scan of directory: library_files
[INFO] extractor: Found 69 PDF files
[INFO] extractor: Scan completed | Total files: 69 | Processed: 0 | Failed: 0
[INFO] recognizer: No pages to process.
# Data has been updated to the database
============================================================
Confirm if you need to create text vector embeddings
⚠️ This process may take approximately 20 minutes
============================================================
Create embeddings? (Enter Y or N):
# Y: create text vector embeddings
# N: Skip text vector embeddings and exit program
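Before running main.py, it can help to confirm that the three VLM settings listed above are present in .env. The following is a standalone sketch, not the project's own loader: `parse_env` and `missing_keys` are hypothetical helpers, and the parsing rules (ignore `#` comment lines, strip trailing ` # ...` comments) are assumptions about how the file is written.

```python
from pathlib import Path

# Key names taken from the .env example in this README.
REQUIRED_KEYS = ("VLM_API_KEY", "VLM_BASE_URL", "VLM_MODEL_NAME")


def parse_env(text: str) -> dict[str, str]:
    """Parse KEY=VALUE lines, ignoring blank lines and # comment lines."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop a trailing inline comment such as "value # note".
        value = value.split(" #", 1)[0].strip()
        values[key.strip()] = value
    return values


def missing_keys(text: str) -> list[str]:
    """Return the required keys that are absent or empty."""
    env = parse_env(text)
    return [key for key in REQUIRED_KEYS if not env.get(key)]


if __name__ == "__main__":
    env_file = Path(".env")
    text = env_file.read_text(encoding="utf-8") if env_file.exists() else ""
    print("Missing:", missing_keys(text) or "none")
```

The check only verifies that each key has some value; it does not validate the API key or contact the VLM endpoint.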
License
This project is licensed under the MIT License. See the LICENSE file for details.
Optimization Updates as of June 17th
- 💡 Optimized models.py: Improved data query efficiency by 1,000%
- 💡 Optimized extractor.py: Slightly enhanced PDF extraction efficiency
- 💡 Optimized recognizer.py: Boosted image comprehension efficiency by 50%
- 💡 Optimized ikb_mcp_server.py:
  - Added pagination functionality
  - Displayed local paths of referenced files
- 💡 Added MIT License (https://github.com/v587d/InsightsLibrary/pull/1#issuecomment-2969226661)
- 📦 Overall compressed project package size reduced by approximately 50%
- 💡 Streamlined private document handling
- 💡 Fixed other identified bugs
Optimizations as of June 22
- Added embedder.py: Implements text vectorization indexing via local Qwen3-Embedding-0.6B model, stored in faiss_index.
- Modified main.py: Closed-loop workflow PDFExtractor → IMGRecognizer → Embedder (optional).
- New @mcp.tool(): get_similar_content_by_rag: Finds most similar document content via vector similarity (RAG).
- All admin-uploaded reports now support online viewing → Removed library_files folder to reduce project size.
- Added 2000+ report pages.
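The idea behind vector-similarity retrieval, as used by get_similar_content_by_rag, can be illustrated with a toy example. This sketch is not the project's code: the real server embeds text with Qwen3-Embedding-0.6B and searches a FAISS index, while here the vectors are hand-written 3-dimensional stand-ins, the page IDs are made up, and `cosine` and `top_k` are hypothetical helpers.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query: list[float], pages: dict[str, list[float]], k: int = 2) -> list[str]:
    """Return the IDs of the k pages most similar to the query vector."""
    scored = [(cosine(query, vec), page_id) for page_id, vec in pages.items()]
    scored.sort(reverse=True)  # highest similarity first
    return [page_id for _, page_id in scored[:k]]


# Toy embeddings; real ones would come from the embedding model.
pages = {
    "report_a_p1": [1.0, 0.0, 0.0],
    "report_b_p4": [0.0, 1.0, 0.0],
    "report_c_p2": [0.9, 0.1, 0.0],
}
print(top_k([1.0, 0.0, 0.0], pages))
# → ['report_a_p1', 'report_c_p2']
```

A brute-force scan like this is fine for a handful of vectors; an index such as FAISS exists to keep the same lookup fast over thousands of pages.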