firecrawl-monitor

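Author: firecrawl

Detect changes to website content and get notified by webhook or email, with no cron jobs, scrapers, or diff scripts required. Use this skill when the user wants to track page changes, monitor competitor prices, get alerts on new job postings or blog posts, monitor docs/changelogs/status pages, or says "monitor", "watch", "track", "alert on change", "notify me when X changes", "let me know when it changes", "email me when it changes", or "send a webhook". The built-in AI judge filters out formatting, timestamps, and similar noise...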

npx skills add https://github.com/firecrawl/cli --skill firecrawl-monitor

firecrawl monitor

Detect when content on a website changes and get notified by webhook or email. Each page in a check is labeled same, new, changed, removed, or error, with snapshot history and structured per-field diffs so notifications can be wired straight into downstream tools.

When to use

  • The user wants to know when something changes — and be notified about it — not just read what the page says right now
  • Ongoing change detection on any URL: pricing, docs, changelogs, blogs, job boards, status pages, competitor sites, regulatory pages, product availability, hiring pages, top-N rankings (HN, leaderboards, etc.)
  • "Alert me when...", "notify me when...", "email me if...", "send a webhook when...", "ping me if X changes", "track this page"
  • Anywhere the user would otherwise wire up cron + a scraper + a diff library + SMTP themselves
  • Step 5 in the workflow escalation pattern: search → scrape → map → crawl → monitor → interact

Bias toward monitor whenever the request implies notifications or recurrence. A single page read once is a scrape. A single page where the user wants to be told when it changes is a monitor: firecrawl monitor create --page <url> --goal "..." --email|--webhook-url ...

Why use a monitor

  • Change-detection-as-a-service. Firecrawl handles fetching, diffing, judging, and notifying — all server-side. No cron, no diff library, no SMTP setup, no snapshot DB to manage.
  • Notifications first. Webhooks (monitor.page as each page finishes, monitor.check.completed after the check is reconciled) and email summaries that only fire when something actually changed or errored. External recipients confirm via per-recipient opt-in.
  • AI noise filter via --goal. Set a plain-language goal and the change judge ignores formatting, whitespace, casing, punctuation, encoding, request/session IDs, cache busters, tracking params, generic metadata, and unrelated page chrome — so notifications are about content the user actually cares about, not page churn.
  • Structured per-field diffs. JSON-mode change tracking returns keyed diffs like plans[0].price: "$19/mo" → "$24/mo" instead of a wall of unified diff. Drops straight into a Slack message, CI step, or internal tool.
  • Simple page-status model. Each page in a check returns same, new, changed, removed, or error. Easy to filter, easy to act on.
  • Snapshot history without infra. Point-in-time snapshots are kept for diffing via --retention-days; no storage to provision.
  • Watch many things at once. One monitor can watch many pages or diff every page discovered by a recurring site crawl.
  • No scheduling glue. Cron normalization and nextRunAt are computed for you, with natural-language schedules supported ("every 30 minutes", "hourly", "daily at 9:00").
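
The page-status model above lends itself to simple filtering on the consumer side. The following is a minimal sketch, assuming each page result is a dict with `url` and `status` keys (as in the check response shown later in this document); the `sample_pages` list and both function names are hypothetical and exist only for illustration.

```python
from collections import defaultdict

# The five page statuses named in this document.
PAGE_STATUSES = ("same", "new", "changed", "removed", "error")


def group_by_status(pages):
    """Group page URLs by their status, rejecting unknown statuses."""
    groups = defaultdict(list)
    for page in pages:
        status = page.get("status")
        if status not in PAGE_STATUSES:
            raise ValueError(f"unexpected page status: {status!r}")
        groups[status].append(page["url"])
    return dict(groups)


def needs_attention(pages):
    """Return URLs whose status is anything other than 'same'."""
    return [page["url"] for page in pages if page["status"] != "same"]


# Hypothetical sample data, not real Firecrawl output.
sample_pages = [
    {"url": "https://example.com/pricing", "status": "changed"},
    {"url": "https://example.com/docs", "status": "same"},
    {"url": "https://example.com/changelog", "status": "new"},
]

grouped = group_by_status(sample_pages)
attention = needs_attention(sample_pages)
print(grouped)
print(attention)
```

Rejecting unknown statuses up front keeps a downstream tool from silently dropping a page if the response shape differs from what is assumed here.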

Quick start

# Single page, natural-language schedule, email alert
firecrawl monitor create --name "Blog" --schedule "every 30 minutes" \
  --goal "Alert when a new blog post is published." \
  --page https://example.com/blog \
  --email <recipient-email>

# Multiple pages, one monitor
firecrawl monitor create --name "Product pages" --schedule "every 30 minutes" \
  --goal "Alert when pricing, docs, or changelog content changes." \
  --scrape-urls https://example.com/pricing,https://example.com/docs,https://example.com/changelog

# Whole-site crawl per check (every discovered page is diffed)
firecrawl monitor create --name "Docs site" --schedule "hourly" \
  --goal "Alert when any docs page is added, removed, or substantively changed." \
  --crawl-url https://docs.example.com

# Webhook notifications
firecrawl monitor create --name "Docs webhook" --schedule "every 30 minutes" \
  --goal "Alert when docs content changes." \
  --page https://example.com/docs \
  --webhook-url https://example.com/hook \
  --webhook-events monitor.page,monitor.check.completed

# Manage and inspect
firecrawl monitor list --limit 20
firecrawl monitor get <monitorId>
firecrawl monitor run <monitorId>             # trigger a check now
firecrawl monitor checks <monitorId>          # list all checks
firecrawl monitor check <monitorId> <checkId> --page-status changed
firecrawl monitor update <monitorId> --state paused
firecrawl monitor delete <monitorId>

Subcommands: create | list | get | update | delete | run | checks | check.
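
On the receiving end of `--webhook-url`, a handler will usually want to route the two event names differently. The sketch below shows one way to do that in Python. This document does not specify the webhook payload shape, so the `type` field carrying the event name and the `url` field on page events are assumptions, and the handler functions are hypothetical.

```python
import json


def handle_page(payload):
    # Runs for each page as it finishes (event: monitor.page).
    return f"page finished: {payload.get('url', '<unknown>')}"


def handle_check_completed(payload):
    # Runs once after the whole check is reconciled.
    return "check completed"


# Only the two event names documented above are routed.
HANDLERS = {
    "monitor.page": handle_page,
    "monitor.check.completed": handle_check_completed,
}


def dispatch(raw_body):
    """Parse a webhook body and route it by event name."""
    payload = json.loads(raw_body)
    # ASSUMPTION: the event name is carried in a top-level "type" field.
    event = payload.get("type")
    handler = HANDLERS.get(event)
    if handler is None:
        return f"ignored event: {event!r}"
    return handler(payload)


page_result = dispatch('{"type": "monitor.page", "url": "https://example.com/docs"}')
done_result = dispatch('{"type": "monitor.check.completed"}')
other_result = dispatch('{"type": "something.else"}')
print(page_result, done_result, other_result, sep="\n")
```

Unknown events are ignored rather than raising, so the endpoint keeps working if additional event types are delivered.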

Options

| Option | Description |
| --- | --- |
| --name <name> | Monitor name (required on create) |
| --goal <text> | Plain-language change goal (auto-enables the AI change judge) |
| --schedule <text> | Natural-language schedule (every 30 minutes, hourly, daily) |
| --cron <expression> | Cron schedule (e.g. */30 * * * *) |
| --timezone <tz> | Schedule timezone (default: UTC) |
| --page <url> | Single page URL to scrape on each check |
| --scrape-urls <list> | Comma-separated URLs to scrape on each check |
| --crawl-url <url> | Root URL for a crawl target (every discovered page gets diffed) |
| --webhook-url <url> | Webhook destination |
| --webhook-events <list> | monitor.page, monitor.check.completed (comma-separated) |
| --email <list> | Comma-separated email recipients |
| --retention-days <n> | Snapshot retention window |
| --state <state> | active or paused (update only; use --state, not --status) |
| --page-status <state> | Filter check results: same, new, changed, removed, error |
| -o, --output <path> | Output file path |
| --pretty | Pretty-print JSON output |

Minimum schedule interval is 15 minutes. Monitoring is not available for zero-data-retention teams.
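
When monitors are created from a script, the 15-minute minimum can be checked before the CLI is ever called. The following is a rough sketch that assembles a `firecrawl monitor create` argument list using only flags from the options above. The function name and its parameters are hypothetical, and it assumes the natural-language schedule accepts the form "every N minutes" for any N of 15 or more (only "every 30 minutes" is shown in this document).

```python
import shlex

# Minimum schedule interval stated in this document.
MIN_INTERVAL_MINUTES = 15


def build_create_command(name, goal, every_minutes, page,
                         emails=(), webhook_url=None):
    """Return an argv list for `firecrawl monitor create` (not executed)."""
    if every_minutes < MIN_INTERVAL_MINUTES:
        raise ValueError(
            f"interval {every_minutes} min is below the "
            f"{MIN_INTERVAL_MINUTES}-minute minimum"
        )
    args = [
        "firecrawl", "monitor", "create",
        "--name", name,
        # ASSUMPTION: "every N minutes" is accepted for arbitrary N >= 15.
        "--schedule", f"every {every_minutes} minutes",
        "--goal", goal,
        "--page", page,
    ]
    if emails:
        args += ["--email", ",".join(emails)]  # comma-separated recipients
    if webhook_url:
        args += ["--webhook-url", webhook_url]
    return args


cmd = build_create_command(
    "Blog",
    "Alert when a new blog post is published.",
    30,
    "https://example.com/blog",
    webhook_url="https://example.com/hook",
)
# Print a shell-quoted form for review; pass `cmd` to subprocess.run to execute.
print(shlex.join(cmd))
```

Building an argument list instead of a single string avoids quoting problems when the goal contains spaces or punctuation.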

Writing a good --goal

The goal is what the AI change judge uses to decide whether a page is changed vs same. Convert the user's intent into a concise 2-3 sentence goal:

  • Start with Alert when ... and state the trigger using the user's wording.
  • Restate any scope they mentioned: top N, price, role type, region, company, topic, status, or a specific entity.
  • Add an Ignore ... sentence only for intent-specific exclusions (e.g. points/comments for rankings, marketing copy for pricing, general company-page updates for job listings).
  • Do not repeat generic noise exclusions — the judge already handles whitespace, casing, punctuation, encoding, formatting-only changes, request/session IDs, cache busters, tracking params, generic metadata noise, and unrelated page chrome.
  • Don't invent page-specific sections, entities, thresholds, exclusions, or business rules unless the user mentioned them.
  • If the user is vague or asks for "any change", keep the goal broad and don't add exclusions.

| User says | Good goal |
| --- | --- |
| top 10 hackernews stories | Alert when stories enter, leave, or change rank within the Hacker News top 10. Ignore points, comments, and timestamps. Do not alert on changes outside the top 10. |
| pricing changes | Alert when pricing information changes, including prices, plan names, billing periods, tiers, limits, or included features. Ignore unrelated marketing copy. |
| new engineering roles | Alert when a new engineering role is posted. Ignore general company-page updates unless they add, remove, or change an engineering role. |
| track this page | Alert when substantive visible content on this page changes. |
| any change | Alert when any visible page content changes, including copy, numbers, timestamps, counters, links, and layout text. |
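
The goal-writing guidelines above can be approximated with a small helper that assembles the "Alert when ..." sentence, an optional scope, and an optional "Ignore ..." sentence. This is only an illustrative sketch: `compose_goal` and `join_items` are hypothetical names, and the helper does nothing beyond string assembly, so the wording of the trigger and exclusions still has to come from the user's request.

```python
def join_items(items):
    """Join items as 'a', 'a and b', or 'a, b, and c'."""
    items = list(items)
    if len(items) <= 2:
        return " and ".join(items)
    return ", ".join(items[:-1]) + ", and " + items[-1]


def compose_goal(trigger, scope=None, ignore=()):
    """Build a concise goal: trigger, optional scope, optional exclusions."""
    sentence = "Alert when " + trigger.strip().rstrip(".")
    if scope:
        sentence += " " + scope.strip().rstrip(".")
    parts = [sentence + "."]
    # Only intent-specific exclusions belong here; generic noise such as
    # whitespace or tracking params is already handled by the judge.
    if ignore:
        parts.append("Ignore " + join_items(ignore) + ".")
    return " ".join(parts)


ranking_goal = compose_goal(
    "stories enter, leave, or change rank",
    scope="within the Hacker News top 10",
    ignore=["points", "comments", "timestamps"],
)
broad_goal = compose_goal("any visible page content changes")
print(ranking_goal)
print(broad_goal)
```

Leaving `ignore` empty for vague requests keeps the goal broad, in line with the last guideline.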

JSON-mode change tracking (structured per-field diffs)

By default monitors diff each page's markdown and return a unified text diff. When the user cares about specific structured fields (price, headline, in-stock flag, items in a list), use JSON-mode change tracking. The CLI flags don't cover this — pass a JSON body via positional file or piped stdin:

cat > pricing-monitor.json <<'EOF'
{
  "name": "Pricing watch",
  "goal": "Alert when plan prices or headline features change.",
  "schedule": { "text": "hourly", "timezone": "UTC" },
  "targets": [{
    "type": "scrape",
    "urls": ["https://example.com/pricing"],
    "scrapeOptions": {
      "formats": [{
        "type": "changeTracking",
        "modes": ["json"],
        "prompt": "Extract pricing tiers and headline features for each plan.",
        "schema": {
          "type": "object",
          "properties": {
            "plans": {
              "type": "array",
              "items": {
                "type": "object",
                "properties": {
                  "name":     { "type": "string" },
                  "price":    { "type": "string" },
                  "features": { "type": "array", "items": { "type": "string" } }
                }
              }
            }
          }
        }
      }]
    }
  }]
}
EOF
firecrawl monitor create pricing-monitor.json
# or: cat pricing-monitor.json | firecrawl monitor create
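
If the JSON body is generated rather than hand-written, the same structure can be built in Python and serialized. The sketch below mirrors the payload above field for field; the function name `build_json_mode_monitor` and its parameters are hypothetical, and no field beyond those shown in the example is introduced.

```python
import json


def build_json_mode_monitor(name, goal, urls, prompt, schema,
                            schedule_text="hourly", timezone="UTC"):
    """Return a monitor payload using JSON-mode change tracking."""
    return {
        "name": name,
        "goal": goal,
        "schedule": {"text": schedule_text, "timezone": timezone},
        "targets": [{
            "type": "scrape",
            "urls": list(urls),
            "scrapeOptions": {
                "formats": [{
                    "type": "changeTracking",
                    "modes": ["json"],
                    "prompt": prompt,
                    "schema": schema,
                }]
            },
        }],
    }


# Same schema as the pricing example above.
plan_schema = {
    "type": "object",
    "properties": {
        "plans": {
            "type": "array",
            "items": {
                "type": "object",
                "properties": {
                    "name": {"type": "string"},
                    "price": {"type": "string"},
                    "features": {"type": "array", "items": {"type": "string"}},
                },
            },
        }
    },
}

payload = build_json_mode_monitor(
    "Pricing watch",
    "Alert when plan prices or headline features change.",
    ["https://example.com/pricing"],
    "Extract pricing tiers and headline features for each plan.",
    plan_schema,
)
# The serialized text can be saved to a file or piped to the CLI on stdin.
body = json.dumps(payload, indent=2)
print(body)
```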

Each changed page in the check response then carries a per-field diff plus a snapshot of the current full extraction:

{
  "url": "https://example.com/pricing",
  "status": "changed",
  "diff": {
    "json": {
      "plans[0].price": { "previous": "$19/mo", "current": "$24/mo" },
      "plans[1].features[2]": {
        "previous": "10 GB storage",
        "current": "25 GB storage"
      }
    }
  },
  "snapshot": {
    "json": {
      "plans": [
        /* current full extraction */
      ]
    }
  }
}

Use modes: ["json", "git-diff"] for mixed mode — you get both diff.json (per-field) and diff.text (markdown sidecar), and the page is marked changed whenever either surface changed.
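
To turn a changed page into a short message for Slack or a CI log, the keyed diff can be flattened into one line per field. The following is a minimal sketch, assuming the page object has the shape shown in the response example above (`diff.json` entries with `previous` and `current`, and an optional `diff.text` in mixed mode); the function name and the sample `page` are hypothetical.

```python
def format_field_diffs(page):
    """Return one 'path: "old" → "new"' line per changed field."""
    diff = page.get("diff", {})
    lines = []
    for path, change in diff.get("json", {}).items():
        lines.append(f'{path}: "{change["previous"]}" → "{change["current"]}"')
    # In mixed mode a markdown diff may also be present under diff.text.
    if diff.get("text"):
        lines.append("(markdown diff also available in diff.text)")
    return lines


# Sample page copied from the response example above (snapshot omitted).
page = {
    "url": "https://example.com/pricing",
    "status": "changed",
    "diff": {
        "json": {
            "plans[0].price": {"previous": "$19/mo", "current": "$24/mo"},
            "plans[1].features[2]": {
                "previous": "10 GB storage",
                "current": "25 GB storage",
            },
        }
    },
}

summary = format_field_diffs(page)
print(page["url"])
for line in summary:
    print("  " + line)
```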

Tips

  • Prefer one monitor over repeated one-off scrapes whenever the user wants the same URL checked more than once.
  • Use --state paused (via update), not delete, when temporarily silencing a monitor.
  • --retention-days controls how long snapshots are kept for diffing. Lower it for high-frequency monitors to save storage.
  • External email recipients must opt in. First time they're added, Firecrawl sends a confirmation email and they only receive alerts after they confirm. Team-owned email addresses are auto-confirmed. Once a recipient unsubscribes, they must be re-added by the owner to get a fresh confirmation email.
  • firecrawl monitor run <id> triggers a check immediately — useful for smoke-testing a monitor right after creating it without waiting for the next scheduled run.
  • Filter check pages with --page-status changed (or new, removed, error) to skip the noise from same pages.
  • Use --page-status (not --status) when filtering check pages — --status is reserved for the global CLI status flag.
  • Monitor-triggered scrapes default maxAge to 0 — every check performs a fresh scrape unless scrapeOptions.maxAge is set explicitly in a JSON payload.
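
For the last tip, setting `scrapeOptions.maxAge` in a JSON payload could look like the sketch below. The helper name `with_max_age` is hypothetical, the units of `maxAge` are not stated in this document, and applying the setting to every target in the payload is an assumption made for illustration.

```python
import copy


def with_max_age(payload, max_age):
    """Return a copy of a monitor payload with scrapeOptions.maxAge set."""
    updated = copy.deepcopy(payload)  # leave the caller's payload untouched
    for target in updated.get("targets", []):
        # ASSUMPTION: maxAge is set per target under scrapeOptions.
        target.setdefault("scrapeOptions", {})["maxAge"] = max_age
    return updated


# Hypothetical minimal payload for illustration.
base = {
    "name": "Docs",
    "targets": [{"type": "scrape", "urls": ["https://example.com/docs"]}],
}
# 3600 is an arbitrary example value; check the Firecrawl docs for units.
cached = with_max_age(base, 3600)
print(cached["targets"][0]["scrapeOptions"])
```

Without this override the monitor keeps its default of a fresh scrape on every check, which is usually what change detection wants.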
