CC Token Saver
Use a local LLM for smaller or specialized tasks within Claude to save tokens.
cc_token_saver_mcp
Allow Claude code to use local llm for smaller tasks to save token or for specialized task.
Reduce your Claude Code tokens with ‘CC token saver’ MCP server that intelligently delegates simple tasks to your local LLM while keeping Claude Code for complex coordination and architecture decisions.
The MCP server exposes your local LLM as tools that Claude Code can use for:
- Code snippet generation
- Simple refactoring tasks
- Documentation writing
- Code reviews
- Basic Q&A Claude Code automatically tries the local LLM first for simple tasks, only using premium tokens when necessary for complex reasoning and multi-step workflows.
MCP server config
Create a .env file with the LLM config
Example:
# Local LLM Configuration
OPENAI_API_KEY=none
OPENAI_BASE_URL=http://localhost:1234/v1
LOCAL_MODEL_NAME=qwen2.5-7b-instruct
LOCAL_LLM_TEMPERATURE=0.7
LOCAL_LLM_MAX_TOKENS=-1
Claude Code MCP config
edit the ~/.claude.json file
"mcpServers": {
"cc-token-saver": {
"type": "stdio",
"command": "python",
"args": [
"<path>/cc_token_saver_mcp/server.py"
]
}
},
Example usage:
Related Servers
Scout Monitoring MCP
sponsorPut performance and error data directly in the hands of your AI assistant.
Alpha Vantage MCP Server
sponsorAccess financial market data: realtime & historical stock, ETF, options, forex, crypto, commodities, fundamentals, technical indicators, & more
ComfyUI MCP Server
Integrates ComfyUI with MCP, allowing the use of custom workflows. Requires a running ComfyUI server.
MCP Inspector
A developer tool for testing and debugging MCP servers.
Raymon
Stateful HTTP ingest + MCP server + terminal UI for Ray-style logs.
Figma → Vue Design System
A Vue 3 component library with automated design token synchronization from Figma.
flyto-core
Deterministic execution engine for AI agents. 412 MCP tools across 78 categories — browser, file, Docker, data, crypto, scheduling, and more.
ComfyUI
An MCP server for ComfyUI integration.
MCP LSP Go
An MCP server that connects AI assistants to Go's Language Server Protocol (LSP) for advanced code analysis.
S3 Documentation MCP Server
A lightweight Model Context Protocol (MCP) server that brings RAG (Retrieval-Augmented Generation) capabilities to your LLM over Markdown documentation stored on S3.
Semiotic
Data visualization for streaming and static charts, maps and network visualization.
BrowserStack
Bring the full power of BrowserStack’s Test Platform to your AI tools, making testing faster and easier for every developer and tester on your team.