firecrawl-scrape

作者： firecrawl

從任何URL提取乾淨的Markdown，包括JavaScript渲染的單頁應用程式。支援靜態頁面和JS渲染的SPA，並可設定渲染等待時間。支援多個並行URL抓取，輸出格式選項包括Markdown、HTML、連結和螢幕截圖。包含內容過濾選項，如僅主內容模式以去除導航和頁尾，以及標籤包含/排除功能。可選的內聯問答功能，透過--query標誌進行目標性...

npx skills add https://github.com/firecrawl/cli --skill firecrawl-scrape

下載 ZIP GitHub

489

firecrawl scrape

Scrape one or more URLs. Returns clean, LLM-optimized markdown. Multiple URLs are scraped concurrently.

When to use

You have a specific URL and want its content
The page is static or JS-rendered (SPA)
Step 2 in the workflow escalation pattern: search → scrape → map → crawl → interact

Quick start

# Basic markdown extraction
firecrawl scrape "<url>" -o .firecrawl/page.md

# Main content only, no nav/footer
firecrawl scrape "<url>" --only-main-content -o .firecrawl/page.md

# Wait for JS to render, then scrape
firecrawl scrape "<url>" --wait-for 3000 -o .firecrawl/page.md

# Multiple URLs (each saved to .firecrawl/)
firecrawl scrape https://example.com https://example.com/blog https://example.com/docs

# Get markdown and links together
firecrawl scrape "<url>" --format markdown,links -o .firecrawl/page.json

# Ask a question about the page
firecrawl scrape "https://example.com/pricing" --query "What is the enterprise plan price?"

Options

Option	Description
`-f, --format <formats>`	Output formats: markdown, html, rawHtml, links, screenshot, json
`-Q, --query <prompt>`	Ask a question about the page content (5 credits)
`-H`	Include HTTP headers in output
`--only-main-content`	Strip nav, footer, sidebar — main content only
`--wait-for <ms>`	Wait for JS rendering before scraping
`--include-tags <tags>`	Only include these HTML tags
`--exclude-tags <tags>`	Exclude these HTML tags
`--redact-pii`	Redact personally identifiable information from output
`-o, --output <path>`	Output file path

Tips

Prefer plain scrape over --query. Scrape to a file, then use grep, head, or read the markdown directly — you can search and reason over the full content yourself. Use --query only when you want a single targeted answer without saving the page (costs 5 extra credits).
Try scrape before interact. Scrape handles static pages and JS-rendered SPAs. Only escalate to interact when you need interaction (clicks, form fills, pagination).
Multiple URLs are scraped concurrently — check firecrawl --status for your concurrency limit.
Single format outputs raw content. Multiple formats (e.g., --format markdown,links) output JSON.
Always quote URLs — shell interprets ? and & as special characters.
Naming convention: .firecrawl/{site}-{path}.md