# vox-pop

Public opinion for LLMs — HackerNews, Reddit, 4chan, Stack Exchange, Telegram. Zero API keys.

Your LLM knows what textbooks say. This tells it what people actually think.

9 platforms • Semantic routing • LLM intelligence layer • Works without API keys

Install • Quick Start • Platforms • How Routing Works • MCP Server • Claude Code Plugin • Roadmap
## Why?

*(Side-by-side comparison: without vox-pop vs. with vox-pop.)*
## Install

```shell
pip install vox-pop
```

That's it. All 9 platforms work with zero API keys. An optional LLM key unlocks smarter routing (see How Routing Works).
## Quick Start

CLI — search all 9 platforms in one command:

```shell
vox-pop search "should I learn Rust or Go"
```

Perspective mode — see how opinions evolved over time:

```shell
vox-pop search "rust vs go" --perspective --platforms hackernews,reddit
```
```text
## hackernews — Then vs Now

Historical (1+ year ago):
> "Rust vs. Go"
— hackernews | +481 points | 580 replies | 2017-01-18

Recent (last 6 months):
> "Rust vs. Go: Memory Management"
— hackernews | +2 points | 2025-11-15

## reddit — Then vs Now

Historical:
> "Experienced developer but total beginner in Rust..."
— reddit | +124 points | 34 replies | 2025-03-14

Recent:
> "I rebuilt the same API in Java, Go, Kotlin, and Rust — here are the numbers"
— reddit | +174 points | 59 replies | 2026-03-19
```
The shift tells a story: 2017 was a flame war. 2026 is domain-specific pragmatism.
Standard search — flat results from all platforms:

```shell
vox-pop search "should I learn Rust or Go" --limit 3
```

```text
### hackernews (45 found)
> "I am a full stack TypeScript dev looking to broaden my skill set..."
— hackernews | +78 points | 42 replies | by throwaway_dev
Source: https://news.ycombinator.com/item?id=41907717

### 4chan /g/ (12 found)
> "Rust is a mass psychosis. Go is boring but you'll actually ship..."
— 4chan /g/ | 129 replies | by Anonymous

### reddit (8 found)
> "After 2 years with both: Rust for systems, Go for services..."
— reddit | +234 points | 87 replies | by senior_dev_42
```
Python — embed in your own tools:

```python
import asyncio

from vox_pop.core import search_multiple, format_context, get_default_providers

async def main():
    results = await search_multiple(
        "best laptop for programming",
        providers=get_default_providers(),
    )
    print(format_context(results))

asyncio.run(main())
```
## Platforms

Every platform works out of the box. No tokens, no OAuth, no rate-limit headaches.

| Platform | Source | Time Filter | Threads |
|---|---|---|---|
| HackerNews | Algolia Search API | Yes | Yes |
| Reddit | Pullpush + Arctic Shift + Redlib fallback | Yes | — |
| 4chan | Official JSON API (since 2012) | — | Yes |
| Stack Exchange | Official API — 180+ communities | Yes | Yes |
| Telegram | Public channel web preview (t.me/s/) | — | — |
| Lobsters | lobste.rs JSON API + search scraping | Yes | — |
| Lemmy | Public REST API — federated instances | Yes | Yes |
| LessWrong | GraphQL API | Yes | Yes |
| XenForo Forums | HTML scraping (Head-Fi, AnandTech, etc.) | — | — |
## How Routing Works

Queries can be anything — a single word, a paragraph, an essay-length D&D rules question. vox-pop understands them all through a four-tier routing system:

```text
User query: "i was looking into a solid laptop for linux
             something from hp, what would a savvy person pick"
                                │
┌───────────────────────────────▼──────────────────────────────┐
│  Tier 1: MCP Hints                                           │
│  Calling LLM provides routing_hints directly                 │
│  (skips all other tiers)                                     │
├──────────────────────────────────────────────────────────────┤
│  Tier 2: LLM Query Rewrite            ← like Perplexity      │
│  Cheap LLM call rewrites query to search-optimized form      │
│  "hp laptop linux compatibility" + routes to communities     │
│  Supports: Anthropic, OpenAI, Ollama (local/free)            │
├──────────────────────────────────────────────────────────────┤
│  Tier 3: Semantic Embeddings          ← free, no API key     │
│  FastEmbed (33MB model) understands meaning, not keywords    │
│  Dynamic catalog: 77 4chan boards + 180 SE sites + static    │
│  "contradictory spell behaviour" → SE:rpg, r/DnD, /tg/       │
├──────────────────────────────────────────────────────────────┤
│  Tier 4: Broad Defaults                                      │
│  Search popular destinations everywhere                      │
└──────────────────────────────────────────────────────────────┘
                                │
                                ▼
Routes to: r/buildapc, r/linux, r/hardware │ /g/
           SE:hardwarerecs, SE:askubuntu   │ lemmy:[email protected]
```
Tier 2 works like Perplexity/ChatGPT Search — the LLM rewrites your conversational query into a clean search string and picks the right communities. Set any of these env vars to enable it:

```shell
ANTHROPIC_API_KEY=...   # Uses Claude Haiku (~$0.0003/query)
OPENAI_API_KEY=...      # Uses GPT-4o Mini
OLLAMA_HOST=...         # Uses local Ollama (free)
```
Tier 3 runs entirely locally with zero API keys. A 33MB embedding model understands that "contradictory spell behaviour on a creature" means tabletop RPG rules — zero shared keywords needed. On first run, it fetches all 4chan boards and Stack Exchange sites dynamically, embeds everything, and caches to disk.
| | Cold start | Warm start | Singleton |
|---|---|---|---|
| Tier 3 timing | ~7s | ~1.3s | instant |
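The mechanics of Tier 3 routing are plain cosine similarity: the query's embedding is compared against a pre-embedded catalog of destinations, and the closest communities win. A minimal sketch of that ranking step — using tiny hand-set vectors standing in for real 384-dimensional FastEmbed embeddings, and a toy `CATALOG` that is not vox-pop's actual data structure:

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Toy 3-dimensional "embeddings" — the real catalog holds FastEmbed
# vectors for 4chan boards, Stack Exchange sites, and static destinations.
CATALOG = {
    "SE:rpg": [0.9, 0.1, 0.0],
    "r/buildapc": [0.0, 0.8, 0.4],
    "/ck/": [0.1, 0.0, 0.9],
}

def route(query_vec, top_k=2):
    """Rank destinations by similarity to the query embedding."""
    ranked = sorted(CATALOG, key=lambda d: cosine(query_vec, CATALOG[d]), reverse=True)
    return ranked[:top_k]

print(route([0.85, 0.2, 0.05]))  # → ['SE:rpg', 'r/buildapc']
```

Because similarity is computed in embedding space rather than over tokens, a query and its best destination need not share a single keyword — which is exactly the "contradictory spell behaviour" → SE:rpg case above.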
No configuration needed. If an LLM key is set, Tier 2 is used. Otherwise Tier 3 handles it. If fastembed isn't installed, Tier 4 (broad search) still works.
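That precedence can be summarized in a few lines. The sketch below mirrors the documented tier order; the function name and return strings are illustrative, not vox-pop's internal API:

```python
import os

def choose_tier(routing_hints=None, fastembed_available=True):
    """Pick a routing tier: hints → LLM rewrite → embeddings → broad."""
    if routing_hints:
        return "tier1-mcp-hints"          # calling LLM decided for us
    if any(os.environ.get(k) for k in ("ANTHROPIC_API_KEY",
                                       "OPENAI_API_KEY", "OLLAMA_HOST")):
        return "tier2-llm-rewrite"        # an LLM key/endpoint is configured
    if fastembed_available:
        return "tier3-semantic"           # local embeddings, no key needed
    return "tier4-broad"                  # last resort: search everywhere
```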
### Routing examples
| Query | Routes to |
|---|---|
| "best hp laptop for linux" | r/buildapc, r/linux, r/hardware, /g/, SE:hardwarerecs, SE:askubuntu |
| "contradictory spell effects on a creature" | r/dndnext, r/DnD, /tg/, SE:rpg |
| "best mechanical keyboard for programming" | r/MechanicalKeyboards, /g/, SE:hardwarerecs |
| "what are the risks of yield farming" | r/CryptoCurrency, SE:tezos, telegram:ethereum |
| "how to make authentic kimchi jjigae" | r/Cooking, /ck/ |
## MCP Server

Works with Claude Code, Cursor, Windsurf, and any MCP-compatible client.

```json
{
  "mcpServers": {
    "vox-pop": {
      "command": "python",
      "args": ["-m", "vox_pop.server"]
    }
  }
}
```
Your LLM gets four tools:

| Tool | What it does |
|---|---|
| `search_opinions` | Search all platforms for opinions on a topic |
| `search_opinions_perspective` | Then vs Now — historical + recent opinions side by side |
| `get_thread_opinions` | Dive into a specific thread's comments |
| `list_available_platforms` | Check what's available and healthy |
The `routing_hints` parameter lets the calling LLM specify exactly where to search:

```text
routing_hints: "reddit:MechanicalKeyboards,4chan:g,stackexchange:hardwarerecs"
```
When no hints are provided, the routing system handles it automatically.
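The hint string is a flat comma-separated list of `platform:community` pairs. A hypothetical parser (not vox-pop's actual code) shows how such a string maps onto per-platform search targets:

```python
from collections import defaultdict

def parse_routing_hints(hints: str) -> dict:
    """Split "platform:community,platform:community" into a routing map.

    Illustrative only — the real server may parse hints differently.
    """
    routes = defaultdict(list)
    for entry in hints.split(","):
        platform, _, community = entry.strip().partition(":")
        routes[platform].append(community)
    return dict(routes)

print(parse_routing_hints(
    "reddit:MechanicalKeyboards,4chan:g,stackexchange:hardwarerecs"
))
# → {'reddit': ['MechanicalKeyboards'], '4chan': ['g'], 'stackexchange': ['hardwarerecs']}
```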
## Claude Code Plugin

```shell
claude plugin add /path/to/vox-pop
```

The skill auto-triggers when your question would benefit from real opinions. Just ask naturally:

> "What do people think about living in Berlin?" → activates

> "Should I use Next.js or Remix?" → activates

> "Best gym routine for beginners?" → activates

> "What's the capital of France?" → does not activate

Manual search: `/vox-search "your query"`
## Architecture

```text
┌──────────────────────────────────────────────────────────┐
│  Layer 3: Claude Code / MCP Client                       │
│  Auto-triggering skill + /vox-search                     │
├──────────────────────────────────────────────────────────┤
│  Layer 2: MCP Server                                     │
│  search_opinions · perspectives · threads · list         │
├──────────────────────────────────────────────────────────┤
│  Layer 1: Python Library                                 │
│  ┌────────────────────────────────────────────────────┐  │
│  │  Smart Router (4-tier)                             │  │
│  │  MCP hints → LLM rewrite → FastEmbed → broad       │  │
│  └────────────────────────────────────────────────────┘  │
│  ┌────────────────────────────────────────────────────┐  │
│  │  9 Providers with fallback chains                  │  │
│  │  HN · Reddit · 4chan · SE · Telegram               │  │
│  │  Lobsters · Lemmy · LessWrong · XenForo Forums     │  │
│  └────────────────────────────────────────────────────┘  │
│  ┌────────────────────────────────────────────────────┐  │
│  │  Dynamic Catalog                                   │  │
│  │  77 4chan boards + 180 SE sites fetched from APIs  │  │
│  │  + 120 static destinations · cached to disk        │  │
│  └────────────────────────────────────────────────────┘  │
└──────────────────────────────────────────────────────────┘
```
Each provider implements automatic fallback — if one source is down, the next is tried. Reddit alone has three fallback sources (Pullpush → Arctic Shift → Redlib).
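The pattern behind those chains is "try each source in order, return the first success." A minimal async sketch — the function name, source callables, and error handling here are illustrative assumptions, not vox-pop's actual provider API:

```python
import asyncio

async def search_with_fallback(query, sources):
    """Try (name, fetch) pairs in order; return the first that succeeds."""
    errors = []
    for name, fetch in sources:
        try:
            return name, await fetch(query)
        except Exception as exc:  # a real chain would catch narrower errors
            errors.append((name, exc))
    raise RuntimeError(f"all sources failed: {errors}")

async def main():
    # Simulate Pullpush being down so the chain falls through to Arctic Shift.
    async def pullpush(q):
        raise ConnectionError("pullpush is down")

    async def arctic_shift(q):
        return [f"result for {q!r}"]

    name, results = await search_with_fallback(
        "rust vs go",
        [("pullpush", pullpush), ("arctic_shift", arctic_shift)],
    )
    print(name, results)

asyncio.run(main())
```

Because failures are collected rather than raised immediately, the caller only sees an error when every source in the chain is down.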
## Roadmap
| Version | Status | What |
|---|---|---|
| v0.1 | Shipped | 5 providers (HN, Reddit, 4chan, SE, Telegram), MCP server, Claude Code plugin |
| v0.2 | Current | 9 providers, 4-tier smart routing, LLM query rewriting, FastEmbed semantic routing, dynamic catalog |
| v0.3 | Next | Regional — DC Inside (Korea), Naver, 5ch (Japan) |
| v0.4 | Planned | Niche — TikTok, Discord, YouTube comments, Looksmax |
| v1.0 | Planned | Synthesis — built-in consensus/controversy detection, confidence scores, trend tracking |
## Contributing

- New provider? → Subclass `Provider` in `src/vox_pop/providers/base.py`
- New routing destination? → Add to `DESTINATIONS` in `router.py` (one line)
- Dynamic catalog source? → Add a `_fetch_*_destinations()` function in `router.py`
- Better LLM prompt? → Improve `_LLM_SYSTEM` in `router.py`
- Multilingual support? → Swap the FastEmbed model to `bge-m3` in `SemanticRouter`
- Dead instance? → Open an issue with the instance URL
- Regional platform? → DC Inside, Naver, 5ch, VK, Bilibili — all welcome
## Security

| Area | Policy |
|---|---|
| Data access | Public data only — official APIs and public web endpoints. No login-wall scraping. |
| Credentials | Zero stored. Optional LLM keys are passed via env vars at runtime, never written to disk. |
| LLM routing | When `ANTHROPIC_API_KEY` or `OPENAI_API_KEY` is set, your query text (up to 4000 chars) is sent to the respective LLM API for routing only. No queries are sent externally without an explicit API key. Without keys, routing runs entirely locally via FastEmbed. |
| Rate limits | Respected per platform, with built-in concurrency guards. |
| User-Agent | Transparent: `vox-pop/0.2` in all requests. |
| Caching | API responses (7 days) and embeddings are cached locally at `~/.cache/vox-pop/`. No data is sent to third parties. Embeddings are stored as JSON, with no serialization dependencies. |
| PII | Author names from public posts are included for attribution only, never stored beyond the response. |
vox populi, vox dei
the voice of the people is the voice of god
MIT License