PromptThin

The invisible savings layer for AI Agents. Save 70% on tokens with zero code changes

PromptThin

Reduce LLM API costs through caching, compression, and smart routing. Zero code changes.

PromptThin is a transparent proxy that sits between your AI agents and LLM providers. Two environment variables and you're done — every API call gets four compounding savings routes applied automatically.

Your app ──→ PromptThin ──→ OpenAI / Anthropic / Gemini / Groq

Website Free trial

Four savings routes

Route	What it does	Saving
Semantic Cache	Returns cached answers for similar questions — even if worded differently	Up to 100% on repeated queries
Prompt Compression	Compresses verbose prompts with LLMLingua 2 before sending	Up to 50% on input tokens
Model Router	Automatically routes simple tasks to cheaper models in <1ms	Up to 90% per request
Context Pruning	Summarises long conversation history when it exceeds 8K tokens	Up to 60% on long threads

All four routes run on every request. You control which to skip per-request via headers.

Get started in 2 minutes

Step 1 — Create an account

Or via API:

curl -X POST https://promptthin.tech/auth/register
-H "Content-Type: application/json"
-d '{"email": "[email protected]", "password": "yourpassword"}'

Password requirements: 8+ characters, uppercase, lowercase, number, special character. Check your inbox for a verification email before making API calls.

Step 2 — Register your LLM provider key

OpenAI

curl -X POST https://promptthin.tech/keys/openai
-H "X-API-Key: ts_your_key"
-H "Content-Type: application/json"
-d '{"key": "sk-your-openai-key"}'

Anthropic

curl -X POST https://promptthin.tech/keys/anthropic
-H "X-API-Key: ts_your_key"
-H "Content-Type: application/json"
-d '{"key": "sk-ant-your-anthropic-key"}'

Gemini

curl -X POST https://promptthin.tech/keys/gemini
-H "X-API-Key: ts_your_key"
-H "Content-Type: application/json"
-d '{"key": "AIza-your-gemini-key"}'

Groq

curl -X POST https://promptthin.tech/keys/groq
-H "X-API-Key: ts_your_key"
-H "Content-Type: application/json"
-d '{"key": "gsk_your-groq-key"}'

Your provider keys are encrypted with AES-256 and never appear in logs or responses.

Step 3 — Point your app at PromptThin

.env — two lines, no other changes needed

OPENAI_BASE_URL=https://promptthin.tech/v1 OPENAI_API_KEY=ts_your_key

Done. Every LLM call now routes through PromptThin and savings start immediately.

Integration examples

OpenAI SDK — Python

from openai import OpenAI

client = OpenAI( base_url="https://promptthin.tech/v1", api_key="ts_your_key", )

response = client.chat.completions.create( model="gpt-4o", messages=[{"role": "user", "content": "Hello!"}] )

OpenAI SDK — JavaScript / TypeScript

import OpenAI from "openai";

const client = new OpenAI({ baseURL: "https://promptthin.tech/v1", apiKey: "ts_your_key", });

Anthropic SDK — Python

import anthropic

client = anthropic.Anthropic( base_url="https://promptthin.tech", api_key="ts_your_key", )

Anthropic SDK — JavaScript

import Anthropic from "@anthropic-ai/sdk";

const client = new Anthropic({ baseURL: "https://promptthin.tech", apiKey: "ts_your_key", });

LangChain

from langchain_openai import ChatOpenAI

llm = ChatOpenAI( base_url="https://promptthin.tech/v1", api_key="ts_your_key", model="gpt-4o", )

AutoGen

config_list = [{ "model": "gpt-4o", "base_url": "https://promptthin.tech/v1", "api_key": "ts_your_key", }]

CrewAI / any OpenAI-compatible framework

OPENAI_BASE_URL=https://promptthin.tech/v1 OPENAI_API_KEY=ts_your_key

Vercel AI SDK

import { createOpenAI } from "@ai-sdk/openai";

const openai = createOpenAI({ baseURL: "https://promptthin.tech/v1", apiKey: "ts_your_key", });

LiteLLM

import litellm

litellm.api_base = "https://promptthin.tech/v1" litellm.api_key = "ts_your_key"

Cursor / Continue.dev / Open WebUI

In settings, set:

OpenAI API Base URL: https://promptthin.tech/v1
API Key: ts_your_key

Supported models

PromptThin infers the provider from the model name automatically:

Model prefix	Routes to
gpt-, o1-, o3-*	OpenAI
claude-*	Anthropic
gemini-*	Google Gemini
llama-, mixtral-, gemma-*	Groq

Preview savings before committing

Use the POST /predict-savings endpoint to get a cost estimate before making a real LLM call — no tokens billed, no LLM call made:

curl -X POST https://promptthin.tech/predict-savings
-H "X-API-Key: ts_your_key"
-H "Content-Type: application/json"
-d '{ "model": "gpt-4o", "provider": "openai", "messages": [ {"role": "user", "content": "your long prompt here..."} ] }'

Response:

{ "original_tokens": 4200, "estimated_tokens_after_savings": 2100, "estimated_cost_original": 0.0105, "estimated_cost_after_savings": 0.0013, "estimated_saving": 0.0092, "saving_percent": 87.5, "recommendation": "proceed" }

MCP server

Add to claude_desktop_config.json:

{ "mcpServers": { "promptthin": { "url": "https://promptthin.tech/mcp", "headers": { "X-API-Key": "ts_your_key" } } } }

Available tools: get_usage_summary, get_billing_status, flush_cache, get_recent_requests.

Per-request controls

Header	Value	Effect
X-Cache-Control	no-cache	Skip cache lookup and storage
X-Prune-Control	no-prune	Skip context pruning
X-Compress-Control	no-compress	Skip prompt compression
X-Router-Control	no-route	Skip model routing

Pricing

Plan	Price	Requests
No card	Free	20 requests to explore
Pro	7-day free trial · then $4.99 first month · then $11.99/mo	Unlimited
Enterprise	Custom	Unlimited + SLA + dedicated support

Start free trial →

Security

Provider keys encrypted with AES-256 — never in logs or responses
Email verification required before making API calls
Strong passwords enforced (8+ chars, upper, lower, number, special character)
All traffic HTTPS only
Keys stored in GCP Secret Manager

FAQ

**Do I need to change my code?**No. Set two environment variables.

**Does PromptThin slow down my requests?**Cache hits completely skip the LLM call — dramatically lower latency. Cache misses add <2ms overhead.

What if I want to pass my provider key directly?

client = OpenAI( base_url="https://promptthin.tech/v1", api_key="ts_your_key", default_headers={"Authorization": "Bearer sk-your-openai-key"}, )

PromptThin detects the key prefix and uses it directly.

**Can I use multiple providers?**Yes. Register keys for each provider. PromptThin routes to the right one based on the model name.

**What happens after the 7-day trial?**Your card is charged $4.99 for the first month, then $11.99/month. Cancel anytime from the dashboard — no charge if cancelled within 7 days.

Contact

Website: promptthin.tech
Enterprise: [email protected]
Issues: Open an issue on this repository

İlgili Sunucular

Word MCP Server

Create and edit Microsoft Word (.docx) documents via an API.

VAP media MCP

: MCP server for AI media generation (imagesflux, videosveo3.1, music suno v5, with deterministic cost control using reserve-burn-refund billing

MCP Google Workspace

A comprehensive MCP server for managing Google Workspace services like Calendar, Contacts, and Gmail using OAuth2 authentication.

MCP Conductor

An advanced MCP server for intelligent conversation context management and session continuity, requiring the Claude Desktop application and a Node.js environment.

Rememberizer

Access personal and team knowledge from documents and Slack discussions.

SuperLocalMemory V2

Universal, local-first persistent memory for AI assistants. SQLite-based knowledge graph with zero cloud dependencies. Works with 17+ tools (Claude, Cursor, Windsurf, VS Code, etc.). 100% free forever.

Rezdy Agent

Search marketplace products, manage bookings, and handle customer relationships using the Rezdy Agent API.

Spire.XLS MCP Server

Create, read, edit, and convert Excel files without requiring Microsoft Office.

Gemini Data Analysis & Research

Leverages Google's Gemini AI for data analysis, research paper generation, and automated email delivery.

JIRA

Integrate Atlassian JIRA into any MCP-compatible application to manage issues and projects.

PromptThin

PromptThin

Four savings routes

Get started in 2 minutes

Step 1 — Create an account

Step 2 — Register your LLM provider key

OpenAI

Anthropic

Gemini

Groq

Step 3 — Point your app at PromptThin

.env — two lines, no other changes needed

Integration examples

OpenAI SDK — Python

OpenAI SDK — JavaScript / TypeScript

Anthropic SDK — Python

Anthropic SDK — JavaScript

LangChain

AutoGen

CrewAI / any OpenAI-compatible framework

Vercel AI SDK

LiteLLM

Cursor / Continue.dev / Open WebUI

Supported models

Preview savings before committing

MCP server

Per-request controls

Pricing

Security

FAQ

Contact

İlgili Sunucular

Word MCP Server

VAP media MCP

MCP Google Workspace

MCP Conductor

Rememberizer

SuperLocalMemory V2

Rezdy Agent

Spire.XLS MCP Server

Gemini Data Analysis & Research

JIRA

NotebookLM Web Importer