apify-ultimate-scraper

by apify

Automated web scraper that selects the optimal Actors for 55+ platforms, including Instagram, TikTok, YouTube, Facebook, Google Maps, and others. Covers 55+ preconfigured Actors across 8 major platforms, with selection guidance by use case (lead generation, influencer discovery, brand monitoring, competitor analysis, trend research). Supports three output formats: quick chat display, CSV export, or JSON export, with configurable result limits. Includes multi-Actor workflow templates for complex...

npx skills add https://github.com/apify/agent-skills --skill apify-ultimate-scraper

Universal web scraper

AI-driven data extraction from ~100 Actors across 15+ platforms via the Apify CLI.

Rules for every apify command:

Pass --json for machine-readable output (stable across CLI versions).
Pass --user-agent apify-agent-skills/apify-ultimate-scraper for telemetry attribution.
Redirect stderr with 2>/dev/null (stderr contains progress messages that break JSON parsers).
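
The three rules above can be bundled into one small wrapper so that no call forgets a flag. This is a minimal sketch: `apify_json` and the `APIFY_BIN` override are hypothetical names introduced here, not part of the Apify CLI, and the wrapper only suits commands whose output you intend to parse as JSON.

```shell
# Hypothetical helper, not part of the Apify CLI.
# APIFY_BIN lets you point the wrapper at another binary (handy for dry runs).
APIFY_BIN="${APIFY_BIN:-apify}"

apify_json() {
  # "$@" is the subcommand plus its own arguments,
  # for example: actors search "KEYWORDS" --limit 10
  "$APIFY_BIN" "$@" \
    --user-agent apify-agent-skills/apify-ultimate-scraper \
    --json \
    2>/dev/null
}
```

With this in place, `apify_json actors search "KEYWORDS" --limit 10` expands to the same search command used in Step 1. Setting `APIFY_BIN=echo` prints the final argument list instead of running it.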

Prerequisites

Apify CLI v1.5.0+ (npm install -g apify-cli)
Authenticated session (see below)
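
A version gate for the v1.5.0+ requirement might look like the sketch below. `version_at_least` is a hypothetical helper that compares two dotted version strings; how you obtain the installed version string is left open, because the exact output format of the CLI's version command is not stated in this document.

```shell
# Hypothetical helper: succeeds when ACTUAL >= REQUIRED.
# Compares the major, minor, and patch fields numerically.
version_at_least() {
  awk -v actual="$1" -v required="$2" 'BEGIN {
    split(actual, a, ".")
    split(required, r, ".")
    for (i = 1; i <= 3; i++) {
      if (a[i] + 0 > r[i] + 0) exit 0   # newer in this field: good enough
      if (a[i] + 0 < r[i] + 0) exit 1   # older in this field: too old
    }
    exit 0                              # all three fields are equal
  }'
}
```

For example, `version_at_least 1.10.2 1.5.0` succeeds, while `version_at_least 1.4.9 1.5.0` fails. A plain string comparison would get the first case wrong, which is why the fields are compared as numbers.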

Authentication

If a CLI command fails with an auth error, authenticate using one of these methods:

OAuth (interactive): apify login (opens browser)
Environment variable: export APIFY_TOKEN=your_token_here
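From .env file: set -a; source .env; set +a (if the file contains APIFY_TOKEN=...; a plain source .env sets the variable in the current shell but does not export it to the apify process unless the line starts with export)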

Generate token: https://console.apify.com/settings/integrations
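
The environment-variable and .env methods can be checked up front with a small guard. This is a sketch under assumptions: `ensure_apify_token` is a hypothetical name, it expects a .env file made of plain `KEY=value` lines, and it knows nothing about a session created with `apify login`, so a failure here does not prove that the CLI is unauthenticated.

```shell
# Hypothetical helper: succeeds only if APIFY_TOKEN ends up set and exported.
# Pass a path that contains a slash (e.g. ./.env) so the file is not
# looked up on PATH by the "." builtin.
ensure_apify_token() {
  env_file="${1:-./.env}"
  if [ -z "${APIFY_TOKEN:-}" ] && [ -f "$env_file" ]; then
    set -a            # export every variable assigned while sourcing
    . "$env_file"
    set +a
  fi
  if [ -z "${APIFY_TOKEN:-}" ]; then
    echo "APIFY_TOKEN is not set (an 'apify login' session may still exist)" >&2
    return 1
  fi
  export APIFY_TOKEN
}
```

The `set -a` / `set +a` pair matters because a sourced `APIFY_TOKEN=...` line otherwise stays a shell-local variable that child processes cannot see.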

Workflow

Step 1: Understand goal and select Actor

Identify the target platform and use case. Read references/actor-index.md to find the right Actor.

If the task involves a multi-step pipeline, also read the matching workflow guide:

Task involves...	Read
leads, contacts, emails, B2B	`references/workflows/lead-generation.md`
competitor, ads, pricing	`references/workflows/competitive-intel.md`
influencer, creator	`references/workflows/influencer-vetting.md`
brand, mentions, sentiment	`references/workflows/brand-monitoring.md`
reviews, ratings, reputation	`references/workflows/review-analysis.md`
SEO, SERP, crawl, content, RAG	`references/workflows/content-and-seo.md`
analytics, engagement, performance	`references/workflows/social-media-analytics.md`
trends, keywords, hashtags	`references/workflows/trend-research.md`
jobs, recruiting, candidates	`references/workflows/job-market-and-recruitment.md`
real estate, listings, hotels	`references/workflows/real-estate-and-hospitality.md`
price monitoring, e-commerce, products	`references/workflows/ecommerce-price-monitoring.md`
contact enrichment, email extraction	`references/workflows/contact-enrichment.md`
knowledge base, RAG, LLM data feed	`references/workflows/knowledge-base-and-rag.md`
company research, due diligence	`references/workflows/company-research.md`

If no Actor matches in the index, search dynamically:

apify actors search "KEYWORDS" --user-agent apify-agent-skills/apify-ultimate-scraper --json --limit 10 2>/dev/null

From results: items[].username/items[].name (Actor ID), items[].title, items[].stats.totalUsers30Days, items[].currentPricingInfo.pricingModel.

Step 2: Fetch Actor schema and check gotchas

Fetch the input schema dynamically:

apify actors info "ACTOR_ID" --user-agent apify-agent-skills/apify-ultimate-scraper --input --json 2>/dev/null

Also read references/gotchas.md to check for common pitfalls for the selected Actor.

For Actor documentation: apify actors info "ACTOR_ID" --user-agent apify-agent-skills/apify-ultimate-scraper --readme

Step 3: Configure and run

Skip user preferences for simple lookups (e.g., "Nike's follower count"). Go straight to running with quick answer mode.

For larger tasks, confirm output format (quick answer / CSV / JSON) and result count.

Standard run (blocking):

apify actors call "ACTOR_ID" --input-file input.json --user-agent apify-agent-skills/apify-ultimate-scraper --json 2>/dev/null

Prefer --input-file input.json for large or complex inputs. For tiny inputs, inline JSON is acceptable with shell quoting: --input '{"maxItems":10}'.
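
Writing the input file from a script avoids shell-quoting mistakes in inline JSON. The sketch below is illustrative only: `write_input_file` is a hypothetical helper, and `maxItems` is simply the key from the inline example above. Real keys come from the Actor's input schema fetched in Step 2.

```shell
# Hypothetical helper: write a one-key Actor input file.
# "maxItems" is an assumption borrowed from the inline example; check the
# Actor's input schema for the keys it actually accepts.
write_input_file() {
  path="$1"
  max_items="$2"
  case "$max_items" in
    ''|*[!0-9]*)
      echo "max_items must be a non-negative integer" >&2
      return 1
      ;;
  esac
  printf '{"maxItems": %s}\n' "$max_items" > "$path"
}
```

After `write_input_file input.json 10`, the standard run command with `--input-file input.json` can be used unchanged.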

From output: .id (run ID), .status, .defaultDatasetId, .stats.durationMillis

Fetch results:

apify datasets get-items DATASET_ID --user-agent apify-agent-skills/apify-ultimate-scraper --format json

For CSV: apify datasets get-items DATASET_ID --user-agent apify-agent-skills/apify-ultimate-scraper --format csv

Quick answer mode: Fetch results as JSON, pick top 5, present formatted in chat.

Save to file: Fetch results, use Write tool to save as YYYY-MM-DD_descriptive-name.csv or .json.
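
The YYYY-MM-DD_descriptive-name pattern can be generated rather than typed. A minimal sketch, assuming a lowercase, hyphen-separated name is wanted; `output_filename` is a hypothetical helper, not something the skill ships.

```shell
# Hypothetical helper: build "YYYY-MM-DD_descriptive-name.ext".
# $1 = free-text description, $2 = extension (csv by default).
output_filename() {
  name=$(printf '%s' "$1" \
    | tr '[:upper:]' '[:lower:]' \
    | tr -cs 'a-z0-9' '-' \
    | sed 's/^-*//; s/-*$//')      # trim leading and trailing hyphens
  printf '%s_%s.%s\n' "$(date +%Y-%m-%d)" "$name" "${2:-csv}"
}
```

One way to use it with the CSV command above: `apify datasets get-items DATASET_ID --user-agent apify-agent-skills/apify-ultimate-scraper --format csv > "$(output_filename "Nike followers")"`.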

Large/long-running scrapes:

apify actors start "ACTOR_ID" --input-file input.json --user-agent apify-agent-skills/apify-ultimate-scraper --json 2>/dev/null

Poll: apify runs info RUN_ID --user-agent apify-agent-skills/apify-ultimate-scraper --json 2>/dev/null (check .status for SUCCEEDED).
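
A polling loop for the command above might look like this sketch. Several things here are assumptions: `wait_for_run`, `run_status`, `APIFY_BIN`, and `POLL_INTERVAL` are hypothetical names; the status is pulled out with grep and sed rather than a real JSON parser; and FAILED, ABORTED, and TIMED-OUT are treated as terminal failures based on the Apify platform's run statuses, which this document does not list.

```shell
# Hypothetical helpers, not part of the Apify CLI.
APIFY_BIN="${APIFY_BIN:-apify}"
POLL_INTERVAL="${POLL_INTERVAL:-10}"   # seconds between polls

# Print the first "status" value found in the JSON on stdin.
# A crude text match; a JSON parser would be more robust.
run_status() {
  grep -o '"status"[[:space:]]*:[[:space:]]*"[A-Z-]*"' \
    | head -n 1 \
    | sed 's/.*"\([A-Z-]*\)"$/\1/'
}

# Returns 0 on SUCCEEDED, 1 on a terminal failure, 2 if polls run out.
wait_for_run() {
  run_id="$1"
  max_polls="${2:-60}"
  polls=0
  while [ "$polls" -lt "$max_polls" ]; do
    status=$("$APIFY_BIN" runs info "$run_id" \
      --user-agent apify-agent-skills/apify-ultimate-scraper \
      --json 2>/dev/null | run_status) || status=""
    case "$status" in
      SUCCEEDED) return 0 ;;
      FAILED|ABORTED|TIMED-OUT)
        echo "run $run_id ended with status $status" >&2
        return 1
        ;;
    esac
    polls=$((polls + 1))
    sleep "$POLL_INTERVAL"
  done
  echo "run $run_id still not finished after $max_polls polls" >&2
  return 2
}
```

Capping the number of polls keeps an agent from waiting forever on a stuck run; the distinct return codes let the caller tell a failed run from one that is merely slow.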

Step 4: Deliver results

Report: result count, file location (if saved), key data fields, and links:

Dataset: https://console.apify.com/storage/datasets/DATASET_ID
Run: https://console.apify.com/actors/runs/RUN_ID
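
The two links can be filled in from the IDs returned by the run. A small sketch; `result_links` is a hypothetical helper, and the URL patterns are the ones listed above.

```shell
# Hypothetical helper: print the console links for a finished run.
# $1 = dataset ID (.defaultDatasetId), $2 = run ID (.id)
result_links() {
  printf 'Dataset: https://console.apify.com/storage/datasets/%s\n' "$1"
  printf 'Run: https://console.apify.com/actors/runs/%s\n' "$2"
}
```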

For multi-step workflows: suggest the next pipeline step from the workflow guide.

Troubleshooting

Common errors and pitfalls are documented in references/gotchas.md. Read it before running PPE (pay-per-event) Actors.

More skills from apify

bug-triage

apify

Triage open bug reports in apify/apify-mcp-server. Analyze, draft responses, get approval, publish.

official

dig

apify

A flexible skill for exploring, planning, and specifying work on the Apify MCP server. Do NOT edit source files; this skill is for understanding and planning only.

official

apify-actor-development

apify

Build, debug, and deploy serverless cloud programs for web scraping, automation, and data processing. Supports JavaScript, TypeScript, and Python templates with integrated Crawlee, Playwright, and Cheerio libraries for HTTP and browser crawling. Includes local testing via apify run with isolated storage, input/output schema validation, and deployment to the Apify platform via apify push. Requires Apify CLI authentication and mandatory generatedBy metadata in .actor/actor.json for AI...

official

apify-actorization

apify

Convert existing projects into serverless Apify Actors with language-specific SDK integration. Supports JavaScript/TypeScript (with Actor.init() / Actor.exit()), Python (async context manager), and any language via a CLI wrapper. Provides a structured workflow: apify init to scaffold the project, applying the SDK wrapper, configuring input/output schemas, local testing with apify run, then deployment with apify push. Includes input and output schema validation, Docker containerization, and optional pay-per-event...

official

apify-audience-analysis

apify

Extract audience demographics, engagement patterns, and behavior data from Facebook, Instagram, YouTube, and TikTok. Supports 18+ specialized Actors covering follower demographics, engagement metrics, comments, and profile analysis across all four platforms. Offers three output formats: quick chat display, CSV export, or JSON export for downstream analysis. Requires Apify token and mcpc CLI tool; uses dynamic schema fetching to adapt inputs to each Actor's requirements. Includes structured...

official

apify-brand-reputation-monitoring

apify

Monitor brand reputation across Google Maps, Booking.com, TripAdvisor, Facebook, Instagram, YouTube, and TikTok. Supports 16+ specialized Apify Actors for collecting reviews, ratings, comments, and mentions across all major platforms. Flexible output formats: display results in chat, export to CSV, or save as JSON for later analysis. Requires an Apify token and Node.js 20.6+; uses the mcpc CLI to dynamically fetch Actor schemas and input parameters. The workflow guides the user through platform selection,...

official

apify-competitor-intelligence

apify

Multi-platform competitive analysis via Apify Actors for Google Maps, Booking.com, Facebook, Instagram, YouTube, and TikTok. Covers 25+ specialized Actors across seven platforms, each optimized for specific analysis types: business data extraction, review comparison, ad strategy monitoring, and content and audience analysis. Requires an Apify token, Node.js 20.6+, and the mcpc CLI tool to fetch Actor schemas and run analyses dynamically. Supports three output formats: quick chat display,...

official

apify-content-analytics

apify

Multi-platform content analysis via Apify Actors for Instagram, Facebook, YouTube, and TikTok. Supports 17+ specialized Actors covering posts, reels, stories, comments, hashtags, followers, and ads across all four platforms. Dynamically fetches Actor schemas with the mcpc CLI to determine required inputs and available output fields. Outputs results in three formats: quick chat display, CSV export, or JSON export with a configurable result count. Requires an Apify token in a .env file and Node.js 20.6+...

official