firecrawl-parsebởi firecrawl
Turn a local document into clean markdown on disk. Supports PDF, DOCX, DOC, ODT, RTF, XLSX, XLS, HTML/HTM/XHTML .
npx skills add https://github.com/firecrawl/cli --skill firecrawl-parsefirecrawl parse
Turn a local document into clean markdown on disk. Supports PDF, DOCX, DOC, ODT, RTF, XLSX, XLS, HTML/HTM/XHTML.
When to use
- You have a file on disk (not a URL) and want its text as markdown
- User drops a PDF/DOCX and asks what it says, or to summarize it
- Use
scrapeinstead when the source is a URL
Quick start
Always save to .firecrawl/ with -o — parsed docs can be hundreds of KB and blow up context if streamed to stdout. Add .firecrawl/ to .gitignore.
mkdir -p .firecrawl
# File → markdown
firecrawl parse ./paper.pdf -o .firecrawl/paper.md
# AI summary
firecrawl parse ./paper.pdf -S -o .firecrawl/paper-summary.md
# Ask a question about the doc
firecrawl parse ./paper.pdf -Q "What are the main conclusions?" \
-o .firecrawl/paper-qa.md
Then head, grep, rg etc., or incrementally read the file - don't load the whole thing at once.
Options
| Option | Description |
|---|---|
-S, --summary | AI-generated summary |
-Q, --query <prompt> | Ask a question about the parsed content |
-o, --output <path> | Output file path — always use this |
-f, --format <fmt> | markdown (default), html, summary |
--timeout <ms> | Timeout for the parse job |
--timing | Show request duration |
Tips
- Quote paths with spaces:
firecrawl parse "./My Doc.pdf" -o .firecrawl/mydoc.md. - Max upload size: 50 MB per file.
- Credits: ~1 per PDF page; HTML is 1 flat.
- Check
.firecrawl/before re-parsing the same file. - To check your credit balance (recommended for batch processing and similar workflows), use the
firecrawl credit-usagecommand.
See also
- firecrawl-scrape — same idea for URLs
Thêm skills từ firecrawl
oracle
by firecrawl
Best practices for using the oracle CLI (prompt + file bundling, engines, sessions, and file attachment patterns).
ordercli
by firecrawl
Foodora-only CLI for checking past orders and active order status (Deliveroo WIP).
peekaboo
by firecrawl
Capture and automate macOS UI with the Peekaboo CLI.
sag
by firecrawl
ElevenLabs text-to-speech with mac-style say UX.
session-logs
by firecrawl
Search and analyze your own session logs (older/parent conversations) using jq.
sherpa-onnx-tts
by firecrawl
Local text-to-speech via sherpa-onnx (offline, no cloud)
sentry
by firecrawl
sentry — an installable skill for AI agents, published by firecrawl/skills.
wordpress-router
by firecrawl
Use when the user asks about WordPress codebases (plugins, themes, block themes, Gutenberg blocks, WP core checkouts) and you need to quickly classify the repo…