firecrawl-research-index

Find the papers that answer a research query with Firecrawl Research, using semantic search, semantic and structural expansion, and in-body verification. Always use this skill for any literature-finding / paper-retrieval task — single-paper lookups or full multi-paper sets.

npx skills add https://github.com/firecrawl/skills --skill firecrawl-research-index

Firecrawl Research Index

Find the research papers that answer a research query. Some questions have a single answer; many have several — and when in doubt, lean toward returning the fuller relevant set (most relevant first) rather than narrowing to one. A reader is better served seeing the neighboring methods and papers than having them silently dropped.

There is no fixed recipe. Read the query, decide what kind it is, and choose the approach below. Some queries need a single search; others need heavy sturctural/semantic expansion. Don't run machinery a query doesn't call for.

The tools, and what each is uniquely good at

  • MCP: firecrawl_research_search_papers(query, k?) CLI: firecrawl research search-papers <query> [--k <number>] Semantic (HyDE) search over abstracts. The natural first move for almost any query. If results look thin or all-alike, re-run with a different framing (sibling domain, rival method, dataset/benchmark name) rather than giving up.

  • MCP: firecrawl_research_related_papers(seed_ids, intent, mode?, k?) CLI: firecrawl research related-papers <seedIds...> --intent <intent> [--mode <similar|citers|references>] [--k <number>] Semantic and structural expansion, ranked to your intent. This reaches papers semantic search cannot, and it's how you turn one good hit into the rest of a set. mode=similar → niche siblings; citers → who uses/builds on the seeds; references → what they build on / compare against.

  • MCP: firecrawl_research_inspect_paper(id) CLI: firecrawl research inspect-paper <id> Canonical metadata for one paper: title, abstract, authors, categories, source ids, and dates. Use it after search_papers or related_papers when you need the complete citation/metadata for a candidate, or when you have an id from elsewhere and need to confirm what paper it resolves to. This does not read the paper body; use read_paper for specific full-text questions.

  • MCP: firecrawl_research_read_paper(id, question) CLI: firecrawl research read-paper <id> --question <question> In-body passages of one paper, to verify a load-bearing constraint (a method actually used, a score actually reported, an affiliation, what a paper compares to). Use it to settle a specific doubt, not on everything.

  • MCP: firecrawl_search(query) / firecrawl_scrape(url) CLI: firecrawl search <query> / firecrawl scrape <url> General web search and page fetch, for facts that don't live in paper abstracts: benchmark leaderboards, rankings, "who scores best / is largest / is most used." Find the ranking on the web, then map the top entries back to papers with search_papers. Reach for these only when the corpus can't answer the question on its own.

Match the approach to the query

  • Single named paper ("the Qwen3 report") → one search_papers, done. This is the only case that truly wants exactly one paper.
  • Paper by description / by method or technique ("the paper that introduced X", "training-free N-gram detection of AI text") → find the best match, then assume there's a family: expand with related_papers and include the closely-related methods/papers too. Even when one paper is the exact literal match, surface and keep its neighbors — don't narrow to the single best hit and reason the rest out. Only treat it as one-answer if the query names a specific paper.
  • Enumeration / method-family ("papers that do X", "alternatives to Adam", "benchmarks for Y") → the answer is a set, and this is where related_papers earns its keep: expand several strong anchors with mode=similar, re-seed from new strong hits. One search is never enough here.
  • Exhibiting ("papers that use / exhibit property P") → the relevant papers apply P but their abstracts may not describe it. Go from P's defining paper outward via citers/references, and use read_paper to confirm a candidate actually uses P.
  • Superlative / leaderboard ("best on benchmark X", "largest", "most popular") → the ranking lives on leaderboards / the web, not in any single abstract. Use firecrawl_search / firecrawl_scrape to find the benchmark's leaderboard or rankings, read off the top models/papers, then search_papers each to get its paper. As a fallback, search the benchmark and read_paper candidates for reported numbers. The hardest kind — cast wide.
  • Org / author filtered ("from <org>", "by <author>") → topical match isn't enough; verify the affiliation/authorship (metadata or read_paper) before keeping a paper.
  • Compare-against ("what does paper X benchmark against / build on") → the answer is inside paper X: read_paper(X, ...) or related_papers([X], ..., mode="references").

Principles

  • When in doubt, include. For any topic / method / comparison question, return the relevant family, not just the single best match — err toward keeping a plausibly-relevant paper rather than dropping it. The neighboring methods are part of a good answer; don't reason close work out just because one paper is the most exact match.
  • Follow the literature, and keep what you find. The seminal source, the competing methods, the close neighbors are usually a hop away — use related_papers, and include them, not just the first hit. Stopping at one good result is the most common way to leave the reader with half an answer.
  • Verify to exclude, not to gatekeep. Use read_paper to rule a paper out when a hard constraint clearly fails (wrong org/author, doesn't actually report the score). When a paper is plausibly relevant, lean toward keeping it rather than demanding proof.
  • Only drop the clearly off-topic. Don't pad with papers you're confident are unrelated — but that's a high bar; most plausibly-relevant work should make the cut.

More skills from firecrawl

oracle
firecrawl
Best practices for using the oracle CLI (prompt + file bundling, engines, sessions, and file attachment patterns).
official
firecrawl-monitor
firecrawl
Detect when content on a website changes and get notified by webhook or email — no cron jobs, scrapers, or diff scripts required. Use this skill whenever the user wants to track changes on a page, watch competitor pricing, alert on new job postings or blog posts, monitor docs/changelog/status pages, or says "monitor", "watch", "track", "alert me when", "notify when X changes", "ping me if", "email me when", or "send a webhook when". A built-in AI judge filters out formatting, timestamp, and...
officialweb-scrapingresearch
firecrawl-deep-research
firecrawl
Run multi-source deep research with Firecrawl. Use when the user asks to research a topic, compare perspectives, produce a sourced briefing, investigate a technical or market question, or synthesize web evidence across many sources.
officialresearchweb-scraping
firecrawl-research-papers
firecrawl
Find and synthesize research papers, whitepapers, PDFs, technical reports, and academic sources with Firecrawl. Use when the user wants a literature review, paper summary, research landscape, or sourced synthesis from PDFs and scholarly/industry publications.
officialresearchweb-scraping
firecrawl-market-research
firecrawl
Extract market, financial, earnings, industry, and company metrics with Firecrawl. Use when the user asks for market research, industry trends, public company data, financial comparisons, earnings research, or structured market reports.
officialresearchweb-scraping
firecrawl-website-design-clone
firecrawl
Extract any website's design system into an agent-ready DESIGN.md using Firecrawl scrape evidence. Use when the user wants colors, fonts, spacing, components, layout patterns, or brand/UI guidance from a website so AI agents can create new websites, clone a look, or build pages inspired by that design.
officialdesignweb-scraping
firecrawl-knowledge-base
firecrawl
Build a knowledge base from web content with Firecrawl. Use for local reference docs, RAG-ready chunks, fine-tuning datasets, documentation mirrors, topic corpora, or LLM-ready markdown organized from web sources.
officialweb-scrapingresearch
firecrawl-lead-research
firecrawl
Produce pre-meeting lead intelligence briefs with Firecrawl. Use when the user needs company research, person research, recent news, talking points, pain points, or outreach preparation before a sales call, partnership meeting, investor conversation, or customer interview.
officialresearchweb-scraping