# Memory Cache Server

A Model Context Protocol (MCP) server that reduces token consumption by efficiently caching data between language model interactions. Works with any MCP client and any language model that uses tokens.
## Installation

1. Clone the repository:

   ```bash
   git clone git@github.com:ibproduct/ib-mcp-cache-server
   cd ib-mcp-cache-server
   ```

2. Install dependencies:

   ```bash
   npm install
   ```

3. Build the project:

   ```bash
   npm run build
   ```

4. Add to your MCP client settings:

   ```json
   {
     "mcpServers": {
       "memory-cache": {
         "command": "node",
         "args": ["/path/to/ib-mcp-cache-server/build/index.js"]
       }
     }
   }
   ```

5. The server will start automatically when you use your MCP client.
## Verifying It Works
When the server is running properly, you'll see:
- A message in the terminal: "Memory Cache MCP server running on stdio"
- Improved performance when accessing the same data multiple times
- No action required from you - the caching happens automatically
You can verify the server is running by:
- Opening your MCP client
- Looking for any error messages in the terminal where you started the server
- Performing operations that would benefit from caching (like reading the same file multiple times)
## Configuration

The server can be configured through `config.json` or environment variables:

```json
{
  "maxEntries": 1000,        // Maximum number of items in cache
  "maxMemory": 104857600,    // Maximum memory usage in bytes (100MB)
  "defaultTTL": 3600,        // Default time-to-live in seconds (1 hour)
  "checkInterval": 60000,    // Cleanup interval in milliseconds (1 minute)
  "statsInterval": 30000     // Stats update interval in milliseconds (30 seconds)
}
```
### Configuration Settings Explained

- **maxEntries** (default: 1000): Maximum number of items that can be stored in the cache
  - Prevents the cache from growing indefinitely
  - When exceeded, the oldest unused items are removed first
- **maxMemory** (default: 100MB): Maximum memory usage in bytes
  - Prevents excessive memory consumption
  - When exceeded, the least recently used items are removed
- **defaultTTL** (default: 1 hour): How long items stay in the cache by default
  - Items are automatically removed after this time
  - Prevents stale data from consuming memory
- **checkInterval** (default: 1 minute): How often the server checks for expired items
  - Lower values keep memory usage more accurate
  - Higher values reduce CPU usage
- **statsInterval** (default: 30 seconds): How often cache statistics are updated
  - Affects the accuracy of hit/miss rates
  - Helps monitor cache effectiveness
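The eviction rules above (an entry cap, TTL expiry, and least-recently-used removal) can be sketched as a small TTL-aware LRU cache. This is an illustrative sketch only, not the server's actual implementation; the `SimpleCache` name and its shape are assumptions for demonstration.

```typescript
interface Entry<V> {
  value: V;
  expiresAt: number; // epoch milliseconds after which the entry is stale
}

// A JavaScript Map preserves insertion order, so re-inserting an entry on
// access keeps the Map ordered from least to most recently used.
class SimpleCache<V> {
  private entries = new Map<string, Entry<V>>();

  constructor(
    private maxEntries = 1000, // cap on item count, like maxEntries above
    private defaultTTL = 3600  // seconds, like defaultTTL above
  ) {}

  set(key: string, value: V, ttl = this.defaultTTL): void {
    // Evict the least recently used entry once maxEntries is reached.
    if (!this.entries.has(key) && this.entries.size >= this.maxEntries) {
      const oldest = this.entries.keys().next().value;
      if (oldest !== undefined) this.entries.delete(oldest);
    }
    this.entries.delete(key); // refresh this key's LRU position
    this.entries.set(key, { value, expiresAt: Date.now() + ttl * 1000 });
  }

  get(key: string): V | undefined {
    const entry = this.entries.get(key);
    if (!entry) return undefined;
    if (Date.now() > entry.expiresAt) {
      // Expired: drop it, as the periodic checkInterval cleanup would.
      this.entries.delete(key);
      return undefined;
    }
    // Re-insert to mark this entry most recently used.
    this.entries.delete(key);
    this.entries.set(key, entry);
    return entry.value;
  }

  get size(): number {
    return this.entries.size;
  }
}
```

With `maxEntries` set to 2, reading `"a"` before inserting a third item means `"b"` is the least recently used entry and is the one evicted.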
## How It Reduces Token Consumption
The memory cache server reduces token consumption by automatically storing data that would otherwise need to be re-sent between you and the language model. You don't need to do anything special - the caching happens automatically when you interact with any language model through your MCP client.
Here are some examples of what gets cached:
1. **File Content Caching**

   When reading a file multiple times:
   - First time: the full file content is read and cached
   - Subsequent times: the content is retrieved from cache instead of re-reading the file
   - Result: fewer tokens used for repeated file operations

2. **Computation Results**

   When performing calculations or analysis:
   - First time: the full computation is performed and the results are cached
   - Subsequent times: the results are retrieved from cache if the input is the same
   - Result: fewer tokens used for repeated computations

3. **Frequently Accessed Data**

   When the same data is needed multiple times:
   - First time: the data is processed and cached
   - Subsequent times: the data is retrieved from cache until its TTL expires
   - Result: fewer tokens used for accessing the same information
## Automatic Cache Management
The server automatically manages the caching process by:
- Storing data when first encountered
- Serving cached data when available
- Removing old/unused data based on settings
- Tracking effectiveness through statistics
## Optimization Tips

1. **Set appropriate TTLs**
   - Shorter for frequently changing data
   - Longer for static content
2. **Adjust memory limits**
   - Higher for more caching (more token savings)
   - Lower if memory usage is a concern
3. **Monitor cache stats**
   - A high hit rate means good token savings
   - A low hit rate means the TTL or limits should be adjusted
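The hit rate mentioned above is simply hits divided by total lookups. The stats shape below is an assumption for illustration; the server's actual statistics fields may differ.

```typescript
// Hypothetical stats shape for illustration.
interface CacheStats {
  hits: number;   // lookups served from cache
  misses: number; // lookups that had to recompute or re-read
}

// Hit rate in [0, 1]; closer to 1.0 means better token savings.
function hitRate(stats: CacheStats): number {
  const total = stats.hits + stats.misses;
  return total === 0 ? 0 : stats.hits / total;
}
```

A rate near 0 suggests entries expire before being reused (raise `defaultTTL`) or are evicted too aggressively (raise `maxEntries`/`maxMemory`).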
## Environment Variable Configuration

You can override `config.json` settings using environment variables in your MCP settings:

```json
{
  "mcpServers": {
    "memory-cache": {
      "command": "node",
      "args": ["/path/to/build/index.js"],
      "env": {
        "MAX_ENTRIES": "5000",
        "MAX_MEMORY": "209715200",   // 200MB
        "DEFAULT_TTL": "7200",       // 2 hours
        "CHECK_INTERVAL": "120000",  // 2 minutes
        "STATS_INTERVAL": "60000"    // 1 minute
      }
    }
  }
}
```
You can also specify a custom config file location:

```json
{
  "env": {
    "CONFIG_PATH": "/path/to/your/config.json"
  }
}
```
The server will:
- Look for config.json in its directory
- Apply any environment variable overrides
- Use default values if neither is specified
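The precedence just described (built-in defaults, overridden by `config.json`, overridden in turn by environment variables) can be sketched as below. The loader function and its exact field handling are illustrative assumptions; only the setting names and defaults come from this README.

```typescript
interface CacheConfig {
  maxEntries: number;
  maxMemory: number;
  defaultTTL: number;
}

// Built-in defaults from the Configuration section above.
const defaults: CacheConfig = {
  maxEntries: 1000,
  maxMemory: 104857600,
  defaultTTL: 3600,
};

// Environment variables arrive as strings; accept only finite numbers.
function fromEnv(env: Record<string, string | undefined>, name: string): number | undefined {
  const raw = env[name];
  if (raw === undefined) return undefined;
  const n = Number(raw);
  return Number.isFinite(n) ? n : undefined;
}

// Precedence: env var > config.json value > default.
function loadConfig(
  fileConfig: Partial<CacheConfig>,
  env: Record<string, string | undefined>
): CacheConfig {
  return {
    maxEntries: fromEnv(env, "MAX_ENTRIES") ?? fileConfig.maxEntries ?? defaults.maxEntries,
    maxMemory: fromEnv(env, "MAX_MEMORY") ?? fileConfig.maxMemory ?? defaults.maxMemory,
    defaultTTL: fromEnv(env, "DEFAULT_TTL") ?? fileConfig.defaultTTL ?? defaults.defaultTTL,
  };
}
```

For example, with `{ "maxEntries": 2000 }` in the file and `DEFAULT_TTL=7200` in the environment, the effective config takes `maxEntries` from the file, `defaultTTL` from the environment, and `maxMemory` from the defaults.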
## Testing the Cache in Practice

To see the cache in action, try these scenarios:

1. **File Reading Test**
   - Read and analyze a large file
   - Ask the same question about the file again
   - The second response should be faster because the file content is cached
2. **Data Analysis Test**
   - Perform analysis on some data
   - Request the same analysis again
   - The second analysis should use cached results
3. **Project Navigation Test**
   - Explore a project's structure
   - Query the same files/directories again
   - Directory listings and file contents will be served from the cache
The cache is working when you notice:
- Faster responses for repeated operations
- Consistent answers about unchanged content
- No need to re-read files that haven't changed