vox-pop
Public opinion for LLMs — HackerNews, Reddit, 4chan, Stack Exchange, Telegram. Zero API keys.
Your LLM knows what textbooks say.
This tells it what people actually think.
9 platforms • Semantic routing • LLM intelligence layer • Works without API keys
Install • Quick Start • Platforms • How Routing Works • MCP Server • Claude Code Plugin • Roadmap
Why?
| Without vox-pop | With vox-pop |
|---|---|
Install
pip install vox-pop
That's it. All 9 platforms work with zero API keys. Optional LLM key unlocks smarter routing (see How Routing Works).
Quick Start
CLI — search all 9 platforms in one command:
vox-pop search "should I learn Rust or Go"
Perspective mode — see how opinions evolved over time:
vox-pop search "rust vs go" --perspective --platforms hackernews,reddit
## hackernews — Then vs Now
Historical (1+ year ago):
> "Rust vs. Go"
— hackernews | +481 points | 580 replies | 2017-01-18
Recent (last 6 months):
> "Rust vs. Go: Memory Management"
— hackernews | +2 points | 2025-11-15
## reddit — Then vs Now
Historical:
> "Experienced developer but total beginner in Rust..."
— reddit | +124 points | 34 replies | 2025-03-14
Recent:
> "I rebuilt the same API in Java, Go, Kotlin, and Rust — here are the numbers"
— reddit | +174 points | 59 replies | 2026-03-19
The shift tells a story: 2017 was a flame war. 2026 is domain-specific pragmatism.
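The Then vs Now split can be pictured as date bucketing. The sketch below is an illustration only, not vox-pop's actual code: `split_then_vs_now` and its day thresholds are hypothetical, chosen to approximate the "1+ year ago" and "last 6 months" labels in the output above.

```python
from datetime import date, timedelta


def split_then_vs_now(posts, today, recent_days=183, historical_days=365):
    """Bucket (title, posted_on) pairs into 'historical' and 'recent'.

    Hypothetical helper: the thresholds are assumptions, not vox-pop settings.
    """
    recent_cutoff = today - timedelta(days=recent_days)
    historical_cutoff = today - timedelta(days=historical_days)
    buckets = {"historical": [], "recent": []}
    for title, posted_on in posts:
        if posted_on <= historical_cutoff:
            buckets["historical"].append(title)
        elif posted_on >= recent_cutoff:
            buckets["recent"].append(title)
        # Posts between six months and one year old land in neither bucket.
    return buckets


posts = [
    ("Rust vs. Go", date(2017, 1, 18)),
    ("Rust vs. Go: Memory Management", date(2025, 11, 15)),
]
buckets = split_then_vs_now(posts, today=date(2026, 3, 20))
print(buckets)
```

Passing `today` explicitly keeps the function deterministic, which makes it easy to test against fixed dates.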
Standard search — flat results from all platforms:
vox-pop search "should I learn Rust or Go" --limit 3
### hackernews (45 found)
> "I am a full stack TypeScript dev looking to broaden my skill set..."
— hackernews | +78 points | 42 replies | by throwaway_dev
Source: https://news.ycombinator.com/item?id=41907717
### 4chan /g/ (12 found)
> "Rust is a mass psychosis. Go is boring but you'll actually ship..."
— 4chan /g/ | 129 replies | by Anonymous
### reddit (8 found)
> "After 2 years with both: Rust for systems, Go for services..."
— reddit | +234 points | 87 replies | by senior_dev_42
Python — embed in your own tools:
import asyncio

from vox_pop.core import search_multiple, format_context, get_default_providers


async def main():
    # Search the default providers for the query.
    results = await search_multiple(
        "best laptop for programming",
        providers=get_default_providers(),
    )
    # Format the results as text and print them.
    print(format_context(results))


asyncio.run(main())
Platforms
Every platform works out of the box. No tokens, no OAuth, no rate limit headaches.
| Platform | Source | Time Filter | Threads |
|---|---|---|---|
| HackerNews | Algolia Search API | Yes | Yes |
| Reddit | Pullpush + Arctic Shift + Redlib fallback | Yes | No |
| 4chan | Official JSON API (since 2012) | No | Yes |
| Stack Exchange | Official API, 180+ communities | Yes | Yes |
| Telegram | Public channel web preview (t.me/s/) | No | No |
| Lobsters | lobste.rs JSON API + search scraping | Yes | No |
| Lemmy | Public REST API, federated instances | Yes | Yes |
| LessWrong | GraphQL API | Yes | Yes |
| XenForo Forums | HTML scraping (Head-Fi, AnandTech, etc.) | No | No |
How Routing Works
Queries can be anything: a single word, a paragraph, an essay-length D&D rules question. vox-pop routes each one through a four-tier system:
User query: "i was looking into a solid laptop for linux
something from hp, what would a savvy person pick"
│
┌───────────────────────────────▼──────────────────────────────┐
│ Tier 1: MCP Hints │
│ Calling LLM provides routing_hints directly │
│ (skips all other tiers) │
├─────────────────────────────────────────────────────────────┤
│ Tier 2: LLM Query Rewrite ← like Perplexity │
│ Cheap LLM call rewrites query to search-optimized form │
│ "hp laptop linux compatibility" + routes to communities │
│ Supports: Anthropic, OpenAI, Ollama (local/free) │
├─────────────────────────────────────────────────────────────┤
│ Tier 3: Semantic Embeddings ← free, no API key │
│ FastEmbed (33MB model) understands meaning, not keywords │
│ Dynamic catalog: 77 4chan boards + 180 SE sites + static │
│ "contradictory spell behaviour" → SE:rpg, r/DnD, /tg/ │
├─────────────────────────────────────────────────────────────┤
│ Tier 4: Broad Defaults │
│ Search popular destinations everywhere │
└─────────────────────────────────────────────────────────────┘
│
▼
Routes to: r/buildapc, r/linux, r/hardware │ /g/
SE:hardwarerecs, SE:askubuntu │ lemmy:[email protected]
Tier 2 works like Perplexity/ChatGPT Search — the LLM rewrites your conversational query into a clean search string and picks the right communities. Set any of these env vars to enable:
ANTHROPIC_API_KEY=... # Uses Claude Haiku (~$0.0003/query)
OPENAI_API_KEY=... # Uses GPT-4o Mini
OLLAMA_HOST=... # Uses local Ollama (free)
Tier 3 runs entirely locally with zero API keys. A 33MB embedding model understands that "contradictory spell behaviour on a creature" means tabletop RPG rules — zero shared keywords needed. On first run, it fetches all 4chan boards and Stack Exchange sites dynamically, embeds everything, and caches to disk.
| | Cold start | Warm start | Singleton |
|---|---|---|---|
| Tier 3 timing | ~7s | ~1.3s | instant |
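To make the Tier 3 idea concrete, here is a minimal sketch of routing by cosine similarity. It is not vox-pop's implementation: the 3-dimensional vectors are hand-written stand-ins for real FastEmbed embeddings, and `cosine` and `route` are hypothetical helper names.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def route(query_vec, catalog, top_k=2):
    """Return the top_k destination names most similar to the query vector."""
    ranked = sorted(
        catalog,
        key=lambda name: cosine(query_vec, catalog[name]),
        reverse=True,
    )
    return ranked[:top_k]


# Toy vectors (assumption): axes are (tabletop games, hardware, cooking).
# A real embedding model would produce these from destination descriptions.
catalog = {
    "SE:rpg": [0.9, 0.1, 0.0],
    "/tg/": [0.8, 0.0, 0.1],
    "SE:hardwarerecs": [0.1, 0.9, 0.0],
    "/ck/": [0.0, 0.1, 0.9],
}

# Stand-in for the embedding of "contradictory spell behaviour on a creature".
query_vec = [0.85, 0.05, 0.05]
destinations = route(query_vec, catalog)
print(destinations)
```

Because the match happens in vector space, the query and the destination description do not need to share any words; they only need nearby embeddings.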
No configuration needed. If an LLM key is set, Tier 2 is used. Otherwise Tier 3 handles it. If fastembed isn't installed, Tier 4 (broad search) still works.
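The fallback order described above can be sketched as a small decision function. This is an illustration under stated assumptions, not the library's code: `select_tier` is a hypothetical name, and the only inputs taken from this README are the three environment variables and the optional fastembed dependency.

```python
import importlib.util
import os

# Environment variables that enable Tier 2, as listed in this README.
LLM_ENV_VARS = ("ANTHROPIC_API_KEY", "OPENAI_API_KEY", "OLLAMA_HOST")


def select_tier(routing_hints, env, fastembed_available):
    """Pick the routing tier (1-4) using the documented fallback order."""
    if routing_hints:
        return 1  # Hints from the calling LLM skip every other tier.
    if any(env.get(name) for name in LLM_ENV_VARS):
        return 2  # An LLM backend is configured: rewrite the query.
    if fastembed_available:
        return 3  # Local semantic embeddings, no API key needed.
    return 4  # Broad defaults always work.


# Example: inspect the current process environment.
has_fastembed = importlib.util.find_spec("fastembed") is not None
print(select_tier(None, os.environ, has_fastembed))
```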
Routing examples
| Query | Routes to |
|---|---|
| "best hp laptop for linux" | r/buildapc, r/linux, r/hardware, /g/, SE:hardwarerecs, SE:askubuntu |
| "contradictory spell effects on a creature" | r/dndnext, r/DnD, /tg/, SE:rpg |
| "best mechanical keyboard for programming" | r/MechanicalKeyboards, /g/, SE:hardwarerecs |
| "what are the risks of yield farming" | r/CryptoCurrency, SE:tezos, telegram:ethereum |
| "how to make authentic kimchi jjigae" | r/Cooking, /ck/ |
MCP Server
Works with Claude Code, Cursor, Windsurf, and any MCP-compatible client.
{
"mcpServers": {
"vox-pop": {
"command": "python",
"args": ["-m", "vox_pop.server"]
}
}
}
Your LLM gets four tools:
| Tool | What it does |
|---|---|
| search_opinions | Search all platforms for opinions on a topic |
| search_opinions_perspective | Then vs Now: historical + recent opinions side by side |
| get_thread_opinions | Dive into a specific thread's comments |
| list_available_platforms | Check what's available and healthy |
The routing_hints parameter lets the calling LLM specify exactly where to search:
routing_hints: "reddit:MechanicalKeyboards,4chan:g,stackexchange:hardwarerecs"
When no hints are provided, the routing system handles it automatically.
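A hints string in the format shown above is a comma-separated list of `platform:destination` pairs. The parser below is a hypothetical sketch of how a client might validate one before sending it; `parse_routing_hints` is not part of vox-pop, and the strict error handling is an assumption.

```python
def parse_routing_hints(hints):
    """Split 'platform:destination,...' into (platform, destination) pairs."""
    pairs = []
    for item in hints.split(","):
        item = item.strip()
        if not item:
            continue  # Tolerate trailing commas and stray whitespace.
        # Split on the first colon only, so destinations may contain ':' or '@'.
        platform, sep, destination = item.partition(":")
        if not sep or not platform or not destination:
            raise ValueError(f"expected 'platform:destination', got {item!r}")
        pairs.append((platform, destination))
    return pairs


hints = "reddit:MechanicalKeyboards,4chan:g,stackexchange:hardwarerecs"
parsed = parse_routing_hints(hints)
print(parsed)
```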
Claude Code Plugin
claude plugin add /path/to/vox-pop
The skill auto-triggers when your question would benefit from real opinions. Just ask naturally:
> "What do people think about living in Berlin?" → activates
> "Should I use Next.js or Remix?" → activates
> "Best gym routine for beginners?" → activates
> "What's the capital of France?" → does not activate
Manual search: /vox-search "your query"
Architecture
┌──────────────────────────────────────────────────────────┐
│ Layer 3: Claude Code / MCP Client │
│ Auto-triggering skill + /vox-search │
├──────────────────────────────────────────────────────────┤
│ Layer 2: MCP Server │
│ search_opinions · perspectives · threads · list │
├──────────────────────────────────────────────────────────┤
│ Layer 1: Python Library │
│ ┌────────────────────────────────────────────────────┐ │
│ │ Smart Router (4-tier) │ │
│ │ MCP hints → LLM rewrite → FastEmbed → broad │ │
│ └────────────────────────────────────────────────────┘ │
│ ┌────────────────────────────────────────────────────┐ │
│ │ 9 Providers with fallback chains │ │
│ │ HN · Reddit · 4chan · SE · Telegram │ │
│ │ Lobsters · Lemmy · LessWrong · XenForo Forums │ │
│ └────────────────────────────────────────────────────┘ │
│ ┌────────────────────────────────────────────────────┐ │
│ │ Dynamic Catalog │ │
│ │ 77 4chan boards + 180 SE sites fetched from APIs │ │
│ │ + 120 static destinations · cached to disk │ │
│ └────────────────────────────────────────────────────┘ │
└──────────────────────────────────────────────────────────┘
Each provider implements automatic fallback — if one source is down, the next is tried. Reddit alone has three fallback sources (Pullpush → Arctic Shift → Redlib).
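A fallback chain of this kind can be sketched in a few lines of asyncio. The code is an illustration, not the provider base class: `search_with_fallback` and the two stub sources are hypothetical, and the outage is simulated.

```python
import asyncio


async def search_with_fallback(query, sources):
    """Try each (name, fetch) source in order; return the first success."""
    errors = []
    for name, fetch in sources:
        try:
            return name, await fetch(query)
        except Exception as exc:  # Any failure moves on to the next source.
            errors.append(f"{name}: {exc}")
    raise RuntimeError("all sources failed: " + "; ".join(errors))


# Stub sources standing in for real HTTP clients (assumption).
async def pullpush(query):
    raise ConnectionError("simulated outage")


async def arctic_shift(query):
    return [f"result for {query!r}"]


used, results = asyncio.run(
    search_with_fallback(
        "rust vs go",
        [("pullpush", pullpush), ("arctic_shift", arctic_shift)],
    )
)
print(used, results)
```

Collecting the per-source errors means that a total failure reports why every source was skipped instead of only the last exception.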
Roadmap
| Version | Status | What |
|---|---|---|
| v0.1 | Shipped | 5 providers (HN, Reddit, 4chan, SE, Telegram), MCP server, Claude Code plugin |
| v0.2 | Current | 9 providers, 4-tier smart routing, LLM query rewriting, FastEmbed semantic routing, dynamic catalog |
| v0.3 | Next | Regional — DC Inside (Korea), Naver, 5ch (Japan) |
| v0.4 | Planned | Niche — TikTok, Discord, YouTube comments, Looksmax |
| v1.0 | Planned | Synthesis — built-in consensus/controversy detection, confidence scores, trend tracking |
Contributing
New provider? → Subclass Provider in src/vox_pop/providers/base.py
New routing destination → Add to DESTINATIONS in router.py (one line)
Dynamic catalog source → Add a _fetch_*_destinations() function in router.py
Better LLM prompt? → Improve _LLM_SYSTEM in router.py
Multilingual support? → Swap FastEmbed model to bge-m3 in SemanticRouter
Dead instance? → Open an issue with the instance URL
Regional platform? → DC Inside, Naver, 5ch, VK, Bilibili — all welcome
Security
| Area | Details |
|---|---|
| Data access | Public data only: official APIs and public web endpoints. No login-wall scraping. |
| Credentials | Zero stored. Optional LLM keys passed via env vars at runtime, never written to disk. |
| LLM routing | When ANTHROPIC_API_KEY or OPENAI_API_KEY is set, your query text (up to 4000 chars) is sent to the respective LLM API for routing only. Without an explicit API key, no query is sent to an external LLM API and routing runs entirely locally via FastEmbed. |
| Rate limits | Respected per-platform. Built-in concurrency guards. |
| User-Agent | Transparent: vox-pop/0.2 in all requests. |
| Caching | API responses (7 days) and embeddings cached locally at ~/.cache/vox-pop/. No data sent to third parties. Embeddings stored as JSON, no serialization dependencies. |
| PII | Author names from public posts included for attribution only. Never stored beyond the response. |
vox populi, vox dei
the voice of the people is the voice of god
MIT License