AiCore Project
A unified framework for integrating various language models and embedding providers to generate text completions and embeddings.
✨ AiCore is a comprehensive framework for integrating various language models and embedding providers with a unified interface. It supports both synchronous and asynchronous operations for generating text completions and embeddings, featuring:
🔌 Multi-provider support: OpenAI, Mistral, Groq, Gemini, NVIDIA, and more
🤖 Reasoning augmentation: Enhance traditional LLMs with reasoning capabilities
📊 Observability: Built-in monitoring and analytics
💰 Token tracking: Detailed usage metrics and cost tracking
⚡ Flexible deployment: Chainlit, FastAPI, and standalone script support
🛠️ MCP Integration: Connect to Model Control Protocol servers via tool calling
Quickstart
pip install git+https://github.com/BrunoV21/AiCore
or
pip install git+https://github.com/BrunoV21/AiCore.git#egg=core-for-ai[all]
or
pip install core-for-ai[all]
Make your First Request
Sync
from aicore.llm import Llm
from aicore.llm.config import LlmConfig
import os
llm_config = LlmConfig(
    provider="openai",
    model="gpt-4o",
    api_key="super_secret_openai_key"
)
llm = Llm.from_config(llm_config)
# Generate completion
response = llm.complete("Hello, how are you?")
print(response)
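Hard-coding the API key is for brevity only. Below is a minimal variation of the same example that reads the key from an environment variable instead (OPENAI_API_KEY is an assumed variable name for this sketch, not one mandated by AiCore):

from aicore.llm import Llm
from aicore.llm.config import LlmConfig
import os

# Read the key from the environment instead of hard-coding it
# (OPENAI_API_KEY is an assumed name for this sketch)
llm_config = LlmConfig(
    provider="openai",
    model="gpt-4o",
    api_key=os.getenv("OPENAI_API_KEY")
)

llm = Llm.from_config(llm_config)
print(llm.complete("Hello, how are you?"))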
Async
from aicore.llm import Llm
from aicore.llm.config import LlmConfig
import os
import asyncio

async def main():
    llm_config = LlmConfig(
        provider="openai",
        model="gpt-4o",
        api_key="super_secret_openai_key"
    )

    llm = Llm.from_config(llm_config)

    # Generate completion
    response = await llm.acomplete("Hello, how are you?")
    print(response)

if __name__ == "__main__":
    asyncio.run(main())
More examples are available in examples/ and docs/examples/.
Key Features
Multi-provider Support
LLM Providers:
- Anthropic
- OpenAI
- Mistral
- Groq
- Gemini
- NVIDIA
- OpenRouter
- DeepSeek
Embedding Providers:
- OpenAI
- Mistral
- Groq
- Gemini
- NVIDIA
Observability Tools:
- Operation tracking and metrics collection
- Interactive dashboard for visualization
- Token usage and latency monitoring
- Cost tracking
MCP Integration:
- Connect to multiple MCP servers simultaneously
- Automatic tool discovery and calling
- Support for WebSocket, SSE, and stdio transports
To configure the application for testing, you need to set up a config.yml file with the necessary API keys and model names for each provider you intend to use. The CONFIG_PATH environment variable should point to the location of this file. Here's an example of how to set up the config.yml file:
# config.yml
embeddings:
  provider: "openai" # or "mistral", "groq", "gemini", "nvidia"
  api_key: "your_openai_api_key"
  model: "text-embedding-3-small" # Optional
llm:
  provider: "openai" # or "mistral", "groq", "gemini", "nvidia"
  api_key: "your_openai_api_key"
  model: "gpt-4o" # Optional
  temperature: 0.1
  max_tokens: 1028
  reasoning_effort: "high"
  mcp_config_path: "./mcp_config.json" # Path to MCP configuration
  max_tool_calls_per_response: 3 # Optional limit on tool calls
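With the file above saved and CONFIG_PATH pointing at it, the same configuration can drive a request end to end. A minimal sketch, assuming the file lives at ./config/config.yml (an illustrative path) and using the Config and Llm classes shown later in this README:

from aicore.config import Config
from aicore.llm import Llm
import os

# Point CONFIG_PATH at the file above (the path is an assumption for this sketch)
os.environ["CONFIG_PATH"] = "./config/config.yml"

config = Config.from_yaml()          # loads the file referenced by CONFIG_PATH
llm = Llm.from_config(config.llm)    # builds the configured provider
print(llm.complete("Summarize what AiCore does in one sentence."))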
Config examples for the various providers are included in the config directory.
MCP Integration Example
from aicore.llm import Llm
from aicore.config import Config
import asyncio
async def main():
    # Load configuration with MCP settings
    config = Config.from_yaml("./config/config_example_mcp.yml")

    # Initialize LLM with MCP capabilities
    llm = Llm.from_config(config.llm)

    # Make async request that can use MCP-connected tools
    response = await llm.acomplete(
        "Search for latest news about AI advancements",
        system_prompt="Use available tools to gather information"
    )
    print(response)

asyncio.run(main())
Example MCP configuration (mcp_config.json):
{
  "mcpServers": {
    "search-server": {
      "transport_type": "ws",
      "url": "ws://localhost:8080",
      "description": "WebSocket server for search functionality"
    },
    "data-server": {
      "transport_type": "stdio",
      "command": "python",
      "args": ["data_server.py"],
      "description": "Local data processing server"
    },
    "brave-search": {
      "command": "npx",
      "args": [
        "-y",
        "@modelcontextprotocol/server-brave-search"
      ],
      "env": {
        "BRAVE_API_KEY": "SUPER-SECRET-BRAVE-SEARCH-API-KEY"
      }
    }
  }
}
Usage
Language Models
You can use the language models to generate text completions. Below is an example of how to use the MistralLlm provider:
from aicore.llm.config import LlmConfig
from aicore.llm.providers import MistralLlm
config = LlmConfig(
    api_key="your_api_key",
    model="your_model_name",
    temperature=0.7,
    max_tokens=100
)
mistral_llm = MistralLlm.from_config(config)
response = mistral_llm.complete(prompt="Hello, how are you?")
print(response)
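The provider can also be used asynchronously. A short sketch, assuming MistralLlm exposes an acomplete coroutine mirroring the Llm.acomplete call shown in the quickstart:

import asyncio

from aicore.llm.config import LlmConfig
from aicore.llm.providers import MistralLlm

async def main():
    config = LlmConfig(
        api_key="your_api_key",
        model="your_model_name",
        temperature=0.7,
        max_tokens=100
    )
    mistral_llm = MistralLlm.from_config(config)
    # acomplete is assumed here to mirror the async API used in the quickstart
    response = await mistral_llm.acomplete(prompt="Hello, how are you?")
    print(response)

asyncio.run(main())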
Loading from a Config File
To load configurations from a YAML file, set the CONFIG_PATH environment variable and use the Config class to load the configurations. Here is an example:
from aicore.config import Config
from aicore.llm import Llm
import os
if __name__ == "__main__":
    os.environ["CONFIG_PATH"] = "./config/config.yml"
    config = Config.from_yaml()
    llm = Llm.from_config(config.llm)
    llm.complete("Once upon a time, there was a")
Make sure your config.yml file is properly set up with the necessary configurations.
Observability
AiCore includes a comprehensive observability module that tracks:
- Request/response metadata
- Token usage (prompt, completion, total)
- Latency metrics (response time, time-to-first-token)
- Cost estimates (based on provider pricing)
- Tool call statistics (for MCP integrations)
Dashboard Features
Key metrics tracked:
- Requests per minute
- Average response time
- Token usage trends
- Error rates
- Cost projections
from aicore.observability import ObservabilityDashboard
dashboard = ObservabilityDashboard(storage="observability_data.json")
dashboard.run_server(port=8050)
Advanced Usage
Reasoner Augmented Config
AiCore also includes native support for augmenting traditional LLMs with reasoning capabilities: it feeds them the thinking steps generated by an open-source reasoning-capable model, so they produce their answers in a reasoning-augmented way.
This can be useful in several scenarios, such as:
- ensuring your agentic systems keep working with the prompts you have crafted for your favourite LLMs while augmenting them with reasoning steps
- controlling directly how long you want your reasoner to reason (via the max_tokens parameter) and how creative it can be (reasoning temperature is decoupled from generation temperature) without compromising the generation settings
To leverage reasoning augmentation, just add one of the supported LLM configs to the reasoner field and AiCore handles the rest:
# config.yml
embeddings:
  provider: "openai" # or "mistral", "groq", "gemini", "nvidia"
  api_key: "your_openai_api_key"
  model: "your_openai_embedding_model" # Optional
llm:
  provider: "mistral" # or "openai", "groq", "gemini", "nvidia"
  api_key: "your_mistral_api_key"
  model: "mistral-small-latest" # Optional
  temperature: 0.6
  max_tokens: 2048
  reasoner:
    provider: "groq" # or "openrouter" or "nvidia"
    api_key: "your_groq_api_key"
    model: "deepseek-r1-distill-llama-70b" # or "deepseek/deepseek-r1:free" or "deepseek/deepseek-r1"
    temperature: 0.5
    max_tokens: 1024
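Once the reasoner block is in place, usage is unchanged. A minimal sketch, assuming the file above is saved as ./config/config_reasoner.yml (an illustrative path):

from aicore.config import Config
from aicore.llm import Llm

# config_reasoner.yml is an assumed file name for the example above
config = Config.from_yaml("./config/config_reasoner.yml")
llm = Llm.from_config(config.llm)

# The reasoner's thinking steps are injected automatically before generation
print(llm.complete("Plan a three-step refactor for a legacy module."))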
Built with AiCore
Reasoner4All
A Hugging Face Space showcasing reasoning-augmented models
⏮ GitRecap
Instant summaries of Git activity
🌐 Live App
📦 GitHub Repository
🌀 CodeTide & AgentTide Integration
CodeTide is a fully local, privacy-first tool for parsing and understanding Python codebases using symbolic, structural analysis—no LLMs, no embeddings, just fast and deterministic code intelligence. It enables developers and AI agents to retrieve precise code context, visualize project structure, and generate atomic code changes with confidence.
AgentTide is a next-generation, precision-driven software engineering agent built on top of CodeTide. AgentTide leverages CodeTide’s symbolic code understanding to plan, generate, and apply high-quality code patches—always with full context and requirements fidelity. You can interact with AgentTide via a conversational CLI or a beautiful web UI.
Live Demo: Try AgentTide on Hugging Face Spaces: https://mclovinittt-agenttidedemo.hf.space/
AiCore was used to make LLM calls within AgentTide, enabling seamless integration between local code analysis and advanced language models. This combination empowers AgentTide to deliver context-aware, production-ready code changes—always under your control.
Future Plans
- Extended Provider Support: Additional LLM and embedding providers
- Speech Support: Integrate text-to-speech and speech-to-text objects with usage and observability tracking
Documentation
For complete documentation, including API references, advanced usage examples, and configuration guides, see the project's documentation site.
License
This project is licensed under the Apache 2.0 License.