CC Token Saver
Use a local LLM for smaller or specialized tasks within Claude to save tokens.
cc_token_saver_mcp
Allow Claude code to use local llm for smaller tasks to save token or for specialized task.
Reduce your Claude Code tokens with ‘CC token saver’ MCP server that intelligently delegates simple tasks to your local LLM while keeping Claude Code for complex coordination and architecture decisions.
The MCP server exposes your local LLM as tools that Claude Code can use for:
- Code snippet generation
- Simple refactoring tasks
- Documentation writing
- Code reviews
- Basic Q&A Claude Code automatically tries the local LLM first for simple tasks, only using premium tokens when necessary for complex reasoning and multi-step workflows.
MCP server config
Create a .env file with the LLM config
Example:
# Local LLM Configuration
OPENAI_API_KEY=none
OPENAI_BASE_URL=http://localhost:1234/v1
LOCAL_MODEL_NAME=qwen2.5-7b-instruct
LOCAL_LLM_TEMPERATURE=0.7
LOCAL_LLM_MAX_TOKENS=-1
Claude Code MCP config
edit the ~/.claude.json file
"mcpServers": {
"cc-token-saver": {
"type": "stdio",
"command": "python",
"args": [
"<path>/cc_token_saver_mcp/server.py"
]
}
},
Example usage:
Related Servers
Scout Monitoring MCP
sponsorPut performance and error data directly in the hands of your AI assistant.
Alpha Vantage MCP Server
sponsorAccess financial market data: realtime & historical stock, ETF, options, forex, crypto, commodities, fundamentals, technical indicators, & more
Helm Package README MCP Server
Search and retrieve detailed information, including READMEs, for Helm charts on Artifact Hub.
MCP Installer
Set up MCP servers in Claude Desktop
plugged.in MCP Proxy Server
A middleware that aggregates multiple Model Context Protocol (MCP) servers into a single unified interface.
Tinyman MCP
An MCP server for the Tinyman protocol on the Algorand blockchain, offering tools for swaps, liquidity provision, and pool management.
Context Portal MCP (ConPort)
A server for managing structured project context using SQLite, with support for vector embeddings for semantic search and Retrieval Augmented Generation (RAG).
Dify Server
Integrates the Dify AI API to generate Ant Design business component code. Supports text, image inputs, and streaming responses.
Markdown Sidecar MCP Server
Serve and access markdown documentation for locally installed NPM, Go, or PyPi packages.
Dev/Infra
MCP server that gives LLMs full control over local Kubernetes dev environments via k3d, kubectl, Tilt, Helm, and kustomize
Jenkins API MCP Server
A server for managing Jenkins jobs through its REST API, including operations like building, configuration, and information retrieval.
gopls-mcp
The essential MCP server for Go language: Exposing compiler-grade semantics to AI Agents and LLM for deterministic code analysis and minimal token usage.