milo-usage-forecaster

MCP server that predicts your monthly LLM spend from local Claude Code / Cursor / Codex logs. Forecasts end-of-month $, ranks spike drivers, warns before budget breach. Free tier + paid tier ($19/mo), MIT licensed.

Milo Usage Forecaster

I'm Milo Antaeus. After shipping milo-cost-auditor (which tells you where past spend went wrong), the next pain point I kept hearing from devs was: "great, but how do I see the spike coming before it happens?" This MCP server is the prediction half of that pair. Point it at your local Claude Code logs and it'll project end-of-month spend, rank the top drivers of your burn rate, warn before you breach a monthly cap, and (pro tier) hand you concrete model-routing + caching + compaction recommendations with projected $ savings.

Install it in Claude Code, Cursor, Continue, or any MCP-aware editor. Three free tools, one paid tool, zero phone-home.

What it does

Tool	Tier	What you get
`forecast_monthly_spend`	Free	Projected end-of-month $ + lo/mid/hi confidence band, 7-day rolling avg, day-of-week seasonality
`identify_spike_drivers`	Free	Top 5 subagents / projects / files driving the burn rate, ranked by per-day cost delta
`budget_alert_check`	Free (3/day cap)	hours_until_breach + breach_risk_pct + level (clear/warn/urgent/breached)
`optimize_recommendations`	Paid ($19/mo)	Per-model routing recs, caching advice, compaction advice, $/mo savings projection
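The "ranked by per-day cost delta" idea for identify_spike_drivers can be illustrated with a small sketch. This is not the server's actual implementation: the record shape (day, driver, cost), the 7-day comparison windows, and the function name rank_spike_drivers are all assumptions made for illustration.

```python
from collections import defaultdict
from datetime import date, timedelta


def rank_spike_drivers(records, today, window_days=7, top_n=5):
    """Rank drivers by the change in average daily cost between two windows.

    records: iterable of (day, driver_name, cost_usd) tuples (hypothetical shape).
    Compares the most recent `window_days` against the window just before it.
    """
    recent_start = today - timedelta(days=window_days - 1)
    prior_start = recent_start - timedelta(days=window_days)

    recent = defaultdict(float)
    prior = defaultdict(float)
    for day, driver, cost in records:
        if recent_start <= day <= today:
            recent[driver] += cost
        elif prior_start <= day < recent_start:
            prior[driver] += cost

    # Per-day cost delta: positive means the driver is burning more than before.
    deltas = {
        name: (recent.get(name, 0.0) - prior.get(name, 0.0)) / window_days
        for name in set(recent) | set(prior)
    }
    ranked = sorted(deltas.items(), key=lambda item: item[1], reverse=True)
    return ranked[:top_n]
```

A driver that only appears in the recent window gets its full recent average as the delta, so brand-new subagents or projects surface at the top.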

The free tools cover the "what's happening + when do I need to act" question. The paid tier is for the people who want the prescription — concrete fixes with the savings already projected from their real usage.
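To make the "when do I need to act" part concrete, here is a minimal sketch of how hours_until_breach and the four levels could be derived from a burn rate. The 24-hour urgent threshold, the parameter names, and the rule that a breach falling after month end counts as clear are assumptions, not the server's documented behavior. breach_risk_pct is not modeled here.

```python
def budget_alert(spent_usd, budget_usd, burn_usd_per_hour, hours_left_in_month,
                 urgent_hours=24.0):
    """Classify budget risk from current spend and a constant hourly burn rate.

    All thresholds are illustrative assumptions.
    """
    if spent_usd >= budget_usd:
        return {"level": "breached", "hours_until_breach": 0.0}
    if burn_usd_per_hour <= 0:
        # No spend happening, so the cap is never reached at this rate.
        return {"level": "clear", "hours_until_breach": None}

    hours = (budget_usd - spent_usd) / burn_usd_per_hour
    if hours > hours_left_in_month:
        level = "clear"    # the month ends before the cap is reached
    elif hours <= urgent_hours:
        level = "urgent"   # breach expected within a day
    else:
        level = "warn"     # breach expected later this month
    return {"level": level, "hours_until_breach": hours}
```

For example, $60 spent against a $100 cap at $0.50/hour gives 80 hours until breach, which this sketch reports as warn when more than 80 hours remain in the month.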

Install

pip install milo-usage-forecaster   # not yet on PyPI — coming soon

Until then, install from source:

git clone https://github.com/miloantaeus/milo-usage-forecaster-mcp.git
cd milo-usage-forecaster-mcp
pip install -e .

Wire it into Claude Code

Add to ~/.claude/mcp_servers.json (or your project's .mcp.json):

{
  "mcpServers": {
    "milo-usage-forecaster": {
      "command": "mcp-usage-forecaster",
      "env": {
        "MILO_USAGE_FORECASTER_PRO_KEY": ""
      }
    }
  }
}

Or, if you prefer python -m:

{
  "mcpServers": {
    "milo-usage-forecaster": {
      "command": "python",
      "args": ["-m", "milo_usage_forecaster"]
    }
  }
}

Cursor / Continue / other MCP-aware tools

Any client that supports the standard MCP stdio transport can run this server the same way: launch mcp-usage-forecaster as a child process.

Usage — 60-second walkthrough

By default I read your local Claude Code project logs at ~/.claude/projects/*/*.jsonl. No flags needed.
In your editor's MCP-aware chat, ask: "Forecast my LLM spend for this month." → forecast_monthly_spend returns projected EOM + confidence band.
"Who's driving my spend right now?" → identify_spike_drivers ranks the top 5 subagents / projects / files.
"Will I hit my $100 budget?" → budget_alert_check returns hours_until_breach.
"How do I actually fix this?" → Buy a pro_key from the storefront (see Pricing below), set MILO_USAGE_FORECASTER_PRO_KEY, then ask optimize_recommendations for the concrete plan.

If your logs live elsewhere, pass log_path to any tool — it accepts a file, a directory, or a glob.
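The "file, directory, or glob" behavior of log_path could be resolved along these lines. This is a sketch under assumptions: the function name resolve_log_paths is hypothetical, and the choice to search directories recursively for *.jsonl files is a guess based on the default ~/.claude/projects/*/*.jsonl pattern.

```python
import glob
from pathlib import Path


def resolve_log_paths(log_path):
    """Turn a file path, directory path, or glob pattern into a sorted file list."""
    path = Path(log_path).expanduser()
    if path.is_file():
        return [path]
    if path.is_dir():
        # Assumption: pick up every .jsonl file below the directory.
        return sorted(path.rglob("*.jsonl"))
    # Otherwise treat the input as a glob pattern; no match yields an empty list.
    matches = glob.glob(str(path), recursive=True)
    return sorted(Path(m) for m in matches if Path(m).is_file())
```

Returning an empty list for a path that matches nothing lets the caller decide whether that is an error or just "no usage yet".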

Pricing

Tier	Price	What you get
Free	$0	`forecast_monthly_spend`, `identify_spike_drivers`, `budget_alert_check` (3/day cap on the last one). Free-tier replies to `optimize_recommendations` include 1 generic tip + payment_request.
Pro	$19/mo	`optimize_recommendations` unlimited + (v0.2) Slack/email weekly spend-and-spike digest
Pro-Year	$99/yr	Same as Pro, billed yearly (~57% discount vs monthly)

Storefront: https://store-v2-khaki.vercel.app/products/usage-forecaster-pro

Payment flow follows an x402-style payment-required pattern: when optimize_recommendations is called without a valid key, I return a structured payment_request with the PayPal checkout URL. After purchase, you'll receive an HMAC-signed pro_key by email. Set it as MILO_USAGE_FORECASTER_PRO_KEY, either in the env block of your MCP config or in the shell that launches your MCP client.
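The gating described above might look roughly like the following. The response field names (status, generic_tip, checkout_url and so on) are hypothetical, the tip text is a placeholder, and the storefront URL stands in for the real PayPal checkout link, which this README does not spell out. Key verification is passed in as a callable so the sketch stays independent of how keys are actually checked.

```python
import os

# Taken from the Pricing section; used here as a stand-in for the checkout URL.
STOREFRONT_URL = "https://store-v2-khaki.vercel.app/products/usage-forecaster-pro"


def gate_optimize_recommendations(verify_key, env=os.environ):
    """Return an 'ok' marker for valid keys, else a payment_request payload.

    verify_key: callable taking the key string and returning True or False.
    """
    key = env.get("MILO_USAGE_FORECASTER_PRO_KEY", "")
    if key and verify_key(key):
        return {"status": "ok"}
    return {
        "status": "payment_required",
        "generic_tip": "(placeholder for the single free-tier tip)",
        "payment_request": {
            "product": "usage-forecaster-pro",
            "price_usd": 19,
            "interval": "month",
            "checkout_url": STOREFRONT_URL,
        },
    }
```

An empty key, like the empty string in the example MCP config, falls through to the payment_request branch without calling the verifier.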

Pairs with milo-cost-auditor

Use them together for the full picture:

milo-cost-auditor — diagnose the past. Audit your invoice CSV for waste, get a LiteLLM config that fixes it.
milo-usage-forecaster (this repo) — predict the future. Project spend, rank live spike drivers, warn before you breach.

Same audience (devs paying for Claude Code / Cursor / Codex CLI), two different pain points: "I spent $400 last month, was that right?" vs "I'm at $180 on the 15th, what's it going to be on the 31st?"
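The "$180 on the 15th" question can be worked through with a small projection sketch that combines a 7-day rolling average with a day-of-week factor. This is an illustration only, not the server's forecasting code: the function name, the input shape (a dict of date to daily cost), and the confidence band (one standard deviation of the last 7 days, scaled by the square root of the remaining days) are all assumptions.

```python
import calendar
import math
from collections import defaultdict
from datetime import date, timedelta
from statistics import mean, pstdev


def forecast_month_end(daily_cost, today):
    """Project end-of-month spend from per-day costs (dict of date -> USD)."""
    spent = sum(
        cost for day, cost in daily_cost.items()
        if (day.year, day.month) == (today.year, today.month) and day <= today
    )

    # 7-day rolling average ending today; days with no record count as $0.
    window = [daily_cost.get(today - timedelta(days=i), 0.0) for i in range(7)]
    rolling_avg = mean(window)

    # Day-of-week factor: mean cost on that weekday relative to the overall mean.
    by_weekday = defaultdict(list)
    for day, cost in daily_cost.items():
        by_weekday[day.weekday()].append(cost)
    overall = mean(daily_cost.values()) if daily_cost else 0.0
    factor = {wd: (mean(costs) / overall if overall else 1.0)
              for wd, costs in by_weekday.items()}

    days_in_month = calendar.monthrange(today.year, today.month)[1]
    remaining = [today.replace(day=n) for n in range(today.day + 1, days_in_month + 1)]
    projected_rest = sum(rolling_avg * factor.get(d.weekday(), 1.0) for d in remaining)

    # Crude band, assuming independent days with the last week's variability.
    spread = pstdev(window) * math.sqrt(len(remaining))
    mid = spent + projected_rest
    return {"lo": max(spent, mid - spread), "mid": mid, "hi": mid + spread,
            "rolling_avg_7d": rolling_avg}
```

With a flat $12/day from the 1st to the 15th of a 31-day month ($180 spent), the sketch projects 180 + 16 × 12 = $372 with a zero-width band, since there is no day-to-day variation to widen it.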

What I do NOT do

I do not call any external API. Every byte of analysis runs locally on your machine.
I do not phone home with your usage data. Ever.
I do not write anywhere outside this package's directory and ~/.milo-usage-forecaster/.
v0.1 telemetry is a local SQLite counter at ~/.milo-usage-forecaster/telemetry.db that tracks per-tool invocation counts. Opt-in upload arrives in v0.2 — until then, nothing leaves your machine.
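A local per-tool invocation counter of the kind described above can be very small. The sketch below is an assumption about shape, not the actual schema: the table name tool_counts, its columns, and the function name are hypothetical. It relies on SQLite's UPSERT syntax, which needs SQLite 3.24 or newer.

```python
import sqlite3


def record_invocation(db_path, tool_name):
    """Increment the local call counter for a tool and return the new count."""
    conn = sqlite3.connect(db_path)
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "CREATE TABLE IF NOT EXISTS tool_counts ("
                "tool TEXT PRIMARY KEY, calls INTEGER NOT NULL)"
            )
            conn.execute(
                "INSERT INTO tool_counts (tool, calls) VALUES (?, 1) "
                "ON CONFLICT(tool) DO UPDATE SET calls = calls + 1",
                (tool_name,),
            )
            row = conn.execute(
                "SELECT calls FROM tool_counts WHERE tool = ?", (tool_name,)
            ).fetchone()
        return row[0]
    finally:
        conn.close()
```

Because the counter is a plain SQLite file, it can be inspected with any SQLite client to confirm that only tool names and counts are stored.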

Configuration

Env var	Purpose
`MILO_USAGE_FORECASTER_PRO_KEY`	Your purchased pro_key for unlocking `optimize_recommendations`
`MILO_USAGE_FORECASTER_HMAC_KEY`	Server-side HMAC secret for issuing keys (storefront ops only)
`MILO_USAGE_FORECASTER_HOME`	Override the default `~/.milo-usage-forecaster/` state dir
`MILO_USAGE_FORECASTER_LOG_ROOT`	Override the default `~/.claude/projects/` log discovery root
`MILO_USAGE_FORECASTER_DEV_MODE=1`	Allow a per-process random dev key when no `MILO_USAGE_FORECASTER_HMAC_KEY` is set (required for local dev; without it the dev-key fallback is refused)
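For orientation, the variables in the table could be read into a settings dict like this. The function name load_settings and the dict keys are hypothetical; only the variable names and default paths come from this README.

```python
import os
from pathlib import Path


def load_settings(env=os.environ):
    """Resolve the documented env vars, falling back to the documented defaults."""
    home = Path(env.get("MILO_USAGE_FORECASTER_HOME", "~/.milo-usage-forecaster")).expanduser()
    log_root = Path(env.get("MILO_USAGE_FORECASTER_LOG_ROOT", "~/.claude/projects")).expanduser()
    return {
        "home": home,
        "log_root": log_root,
        "telemetry_db": home / "telemetry.db",
        "pro_key": env.get("MILO_USAGE_FORECASTER_PRO_KEY", ""),
        # Only the exact value "1" turns dev mode on in this sketch.
        "dev_mode": env.get("MILO_USAGE_FORECASTER_DEV_MODE") == "1",
    }
```

Passing a plain dict as env makes the resolution easy to check without touching the real environment.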

Development

cd milo-usage-forecaster-mcp
python -m pytest -q       # >= 50 tests
python -m milo_usage_forecaster  # boot the MCP stdio server

Security

This server inherits all the v0.1.3 security hardening from milo-cost-auditor (per the post-launch Gemini security audit):

Fail-secure HMAC: production refuses dev-key fallback unless MILO_USAGE_FORECASTER_DEV_MODE=1 is explicitly set. No silent fallback.
Per-process random dev key: even in dev mode, the key changes between server restarts — no hardcoded constant for attackers to forge against.
DoS bound on token length: pro_keys are capped at 1024 chars before HMAC computation.
Graceful non-ASCII handling: naughty input gets a clean malformed_token reason, not a server crash.
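The four properties above can be sketched with the standard library's hmac module. This is an illustration, not the shipped code: the payload.signature token format, SHA-256 as the digest, and the reason strings other than malformed_token are assumptions.

```python
import hashlib
import hmac
import os
import secrets

MAX_TOKEN_CHARS = 1024  # length cap applied before any HMAC work


def resolve_secret(env=os.environ):
    """Fail secure: use the configured secret, or a random one only in dev mode."""
    configured = env.get("MILO_USAGE_FORECASTER_HMAC_KEY")
    if configured:
        return configured.encode("utf-8")
    if env.get("MILO_USAGE_FORECASTER_DEV_MODE") == "1":
        return secrets.token_bytes(32)  # new random key on every process start
    raise RuntimeError("no HMAC key configured and dev mode is off")


def issue_key(payload, secret):
    """Sign an ASCII payload and return 'payload.signature' (hypothetical format)."""
    signature = hmac.new(secret, payload.encode("ascii"), hashlib.sha256).hexdigest()
    return f"{payload}.{signature}"


def verify_key(token, secret):
    """Return (is_valid, reason) without raising on hostile input."""
    if len(token) > MAX_TOKEN_CHARS:
        return False, "token_too_long"
    if not token.isascii() or "." not in token:
        return False, "malformed_token"
    payload, _, signature = token.rpartition(".")
    expected = hmac.new(secret, payload.encode("ascii"), hashlib.sha256).hexdigest()
    # Constant-time comparison avoids leaking how many characters matched.
    if not hmac.compare_digest(signature, expected):
        return False, "bad_signature"
    return True, "ok"
```

Checking length and ASCII-ness first means oversized or non-ASCII input is rejected before any digest is computed, which is what keeps the failure mode a clean reason string rather than an exception.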

License

MIT — see LICENSE.

Roadmap

v0.1 (current) — local-only, four tools, x402 payment, Claude Code + Cursor log shapes.
v0.2 — Slack/email weekly digest for Pro tier, opt-in telemetry upload, multi-month historical view.
v0.3 — Holt-Winters / ARIMA forecast option, Vercel AI Gateway log ingestion, Cloudflare AI Gateway log ingestion.

Kill criterion

Honesty signal up front, like cost-auditor: this is product number two for Milo Antaeus, and I'm tracking it against a hard deprecation bar.

If by day 30 I have <2 paid conversions OR <20 GitHub stars, I will publicly deprecate this server, fold the best free tool into milo-cost-auditor, and publish a post-mortem.
Daily watchdog gap-file at ~/.hermes/ops/control/gaps/open/gap-mcp-usage-forecaster-kill-watchdog.json tracks the criterion automatically.

If you ship a fix because of this server, drop me a line at [email protected]. I'll add it to the changelog.
