apify-actorization

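Author: apify

Converts existing projects into serverless Apify Actors with language-specific SDK integration. Supports JavaScript/TypeScript (using Actor.init() / Actor.exit()), Python (async context manager), and any other language through a CLI wrapper. Provides a structured workflow: scaffold the project with apify init, apply the SDK wrapper, configure the input/output schemas, test locally with apify run, then deploy with apify push. Includes input and output schema validation, Docker containerization, and optional pay-per-event...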

npx skills add https://github.com/apify/agent-skills --skill apify-actorization

Apify Actorization

Actorization converts existing software into reusable serverless applications compatible with the Apify platform. Actors are programs packaged as Docker images that accept well-defined JSON input, perform an action, and optionally produce structured JSON output.

Quick start

  1. Run apify init in project root
  2. Wrap code with SDK lifecycle (see language-specific section below)
  3. Configure .actor/input_schema.json
  4. Test with apify run --input '{"key": "value"}'
  5. Deploy with apify push

When to use this skill

  • Converting an existing project to run on the Apify platform
  • Adding Apify SDK integration to a project
  • Wrapping a CLI tool or script as an Actor
  • Migrating a Crawlee project to Apify

Prerequisites

Verify apify CLI is installed:

apify --help

If not installed, use one of these methods (listed in order of preference):

# Preferred: install via a package manager (provides integrity checks)
npm install -g apify-cli

# Or (Mac): brew install apify-cli

Security note: Do NOT install the CLI by piping remote scripts to a shell (e.g. curl ... | bash or irm ... | iex). Always use a package manager.

Verify CLI is logged in:

apify info  # Should return your username

If not logged in, authenticate using OAuth (opens browser):

apify login

If browser login isn't available (headless environment or CI), ensure the APIFY_TOKEN environment variable is exported (note: the variable is APIFY_TOKEN, not APIFY_API_TOKEN). The CLI reads it automatically - no explicit login needed. If the user doesn't have a token, generate one at https://console.apify.com/settings/integrations.

Apify platform environment: When the Actor runs on the Apify platform, APIFY_TOKEN is auto-injected as an environment variable and the Apify SDK reads it automatically — you do not need to pass it explicitly. Locally, apify login stores credentials in ~/.apify and the SDK uses them.

Security note: Avoid passing tokens as command-line arguments (e.g. apify login -t <token>). Arguments are visible in process listings and may be recorded in shell history. Prefer OAuth login or environment variables instead. Never log, print, or embed APIFY_TOKEN in source code or configuration files. Use a token with the minimum required permissions (scoped token) and rotate it periodically.
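
A minimal sketch of a preflight check for headless environments, assuming only that the token lives in APIFY_TOKEN as described above. The function name token_status is hypothetical and not part of the Apify CLI or SDK; it reports whether a token is present without ever printing its value.

```python
import os


def token_status(env=None) -> str:
    """Report whether APIFY_TOKEN is set, without revealing its value."""
    env = os.environ if env is None else env
    if env.get("APIFY_TOKEN"):
        return "APIFY_TOKEN is set"
    if env.get("APIFY_API_TOKEN"):
        # The text above notes the variable is APIFY_TOKEN, not APIFY_API_TOKEN.
        return "APIFY_API_TOKEN is set, but the CLI expects APIFY_TOKEN"
    return "APIFY_TOKEN is missing"


print(token_status())
```

Passing a plain dict instead of os.environ makes the check easy to exercise in tests without touching real credentials.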

Actorization checklist

Copy this checklist to track progress:

  • Step 1: Analyze project (language, entry point, inputs, outputs)
  • Step 2: Run apify init to create Actor structure
  • Step 3: Apply language-specific SDK integration
  • Step 4: Configure .actor/input_schema.json
  • Step 5: Configure .actor/output_schema.json (if applicable)
  • Step 6: Update .actor/actor.json metadata
  • Step 7: Write README.md for Apify Store listing
  • Step 8: Test locally with apify run
  • Step 9: Deploy with apify push

Step 1: Analyze the project

Before making changes, understand the project:

  1. Identify the language - JavaScript/TypeScript, Python, or other
  2. Find the entry point - The main file that starts execution
  3. Identify inputs - Command-line arguments, environment variables, config files
  4. Identify outputs - Files, console output, API responses
  5. Check for state - Does it need to persist data between runs?
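
The first item in the list above can be approximated with a small heuristic. This is a sketch only: the marker-file mapping and the detect_language name are assumptions for illustration, and a real project may need more signals (for example, a monorepo with several manifests).

```python
from pathlib import Path

# Hypothetical marker files; extend the mapping for other ecosystems.
MARKERS = {
    "package.json": "JavaScript/TypeScript",
    "pyproject.toml": "Python",
    "requirements.txt": "Python",
}


def detect_language(project_root: str) -> str:
    """Guess the project language from well-known manifest files."""
    root = Path(project_root)
    for filename, language in MARKERS.items():
        if (root / filename).is_file():
            return language
    return "other"


print(detect_language("."))
```

A result of "other" points to the CLI wrapper route in Step 3.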

Step 2: Initialize Actor structure

Run in the project root:

apify init

This creates:

  • .actor/actor.json - Actor configuration and metadata
  • .actor/input_schema.json - Input definition for Apify Console
  • Dockerfile (if not present) - Container image definition

Step 3: Apply language-specific changes

Choose based on your project's language:

Quick reference

Language   Install                     Wrap code
JS/TS      npm install apify           await Actor.init() ... await Actor.exit()
Python     pip install apify           async with Actor:
Other      Use CLI in wrapper script   apify actor:get-input / apify actor:push-data
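
For a Python project, the wrap step in the quick reference above might look like the following sketch. The summarize function is a hypothetical stand-in for the project's existing logic, and the input keys mirror the sample input used in Step 8. The SDK calls (async with Actor, Actor.get_input(), Actor.push_data()) come from the apify package, which is imported lazily here so the pure logic stays testable without it.

```python
import asyncio


def summarize(actor_input: dict) -> dict:
    """Hypothetical stand-in for the project's existing logic."""
    url = actor_input.get("startUrl", "")
    max_items = int(actor_input.get("maxItems", 10))
    return {"url": url, "maxItems": max_items}


async def main() -> None:
    # Imported inside main so summarize() can be tested without the SDK installed.
    from apify import Actor

    async with Actor:  # the context manager handles Actor startup and shutdown
        actor_input = await Actor.get_input() or {}
        await Actor.push_data(summarize(actor_input))


# In the Actor's entry point, start the coroutine with: asyncio.run(main())
```

Keeping the original logic in a plain function and confining SDK calls to main() means the existing code needs few changes and can still run outside the platform.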

Steps 4-6: Configure schemas

See schemas-and-output.md for detailed configuration of:

  • Input schema (.actor/input_schema.json)
  • Output schema (.actor/output_schema.json)
  • Actor configuration (.actor/actor.json)
  • State management (request queues, key-value stores)

Validate the schemas against the @apify/json_schemas npm package.
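
As a rough illustration of what an input schema contains, the sketch below builds one as a Python dict and writes it to .actor/input_schema.json. The field names startUrl and maxItems are hypothetical and simply mirror the sample input in Step 8; write_input_schema is an illustrative helper, not an Apify API. Check the result against @apify/json_schemas before relying on it.

```python
import json
from pathlib import Path

# Hypothetical fields, matching the sample input used for local testing.
INPUT_SCHEMA = {
    "title": "Example Actor input",
    "type": "object",
    "schemaVersion": 1,
    "properties": {
        "startUrl": {
            "title": "Start URL",
            "type": "string",
            "description": "Page where processing begins.",
            "editor": "textfield",
        },
        "maxItems": {
            "title": "Max items",
            "type": "integer",
            "description": "Upper bound on the number of results.",
            "default": 10,
        },
    },
    "required": ["startUrl"],
}


def write_input_schema(project_root: str) -> Path:
    """Write the schema to .actor/input_schema.json and return its path."""
    target = Path(project_root) / ".actor" / "input_schema.json"
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(json.dumps(INPUT_SCHEMA, indent=2) + "\n", encoding="utf-8")
    return target
```

Calling write_input_schema(".") from the project root overwrites the file that apify init generated, so review the diff before committing.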

Step 7: Write README

IMPORTANT: Always generate a README.md as part of actorization. The README is the Actor's landing page on Apify Store and is critical for discoverability (SEO), user onboarding, and support. Do not consider an Actor complete without a proper README.

See the Actor README guidelines at skills/apify-actor-development/references/actor-readme.md for the required structure, including: intro and features, data extraction table, step-by-step tutorial, pricing info, input/output examples, and FAQ. Aim for at least 300 words with SEO-optimized H2/H3 headings.

Step 8: Test locally

Run the Actor with inline input (for JS/TS and Python Actors):

apify run --input '{"startUrl": "https://example.com", "maxItems": 10}'

Or use an input file:

apify run --input-file ./test-input.json

Important: Always use apify run, not npm start or python main.py. The CLI sets up the proper environment and storage.
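
When local runs are scripted, the input-file variant avoids shell quoting problems with inline JSON. The sketch below writes the input file and returns the matching argument list; prepare_run is a hypothetical helper, and the only CLI surface it assumes is the apify run --input-file form shown above.

```python
import json
from pathlib import Path


def prepare_run(project_root: str, actor_input: dict) -> list[str]:
    """Write test-input.json and return the matching `apify run` argv."""
    input_path = Path(project_root) / "test-input.json"
    input_path.write_text(json.dumps(actor_input, indent=2), encoding="utf-8")
    return ["apify", "run", "--input-file", str(input_path)]
```

The returned list can be handed to subprocess.run(argv, check=True, cwd=project_root), which still goes through apify run and therefore keeps the environment and storage setup the CLI provides.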

Step 9: Deploy

apify push

This uploads and builds your Actor on the Apify platform.

Monetization (optional)

After deploying, you can monetize your Actor in Apify Store. The recommended model is Pay Per Event (PPE):

  • Per result/item scraped
  • Per page processed
  • Per API call made

Configure PPE in Apify Console under Actor > Monetization. Charge for events in your code with await Actor.charge('result').
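
To get a feel for how per-event pricing adds up, the arithmetic can be sketched as below. The event names and USD prices are invented for illustration; real prices are whatever is configured in Apify Console, and estimate_charge is not an SDK function.

```python
from decimal import Decimal

# Hypothetical event names and prices in USD; real values are set in Apify Console.
EVENT_PRICES = {
    "result": Decimal("0.002"),
    "page": Decimal("0.0005"),
}


def estimate_charge(event_counts: dict) -> Decimal:
    """Sum price * count over all charged events."""
    unknown = set(event_counts) - set(EVENT_PRICES)
    if unknown:
        raise KeyError(f"no price configured for: {sorted(unknown)}")
    return sum(
        (EVENT_PRICES[name] * count for name, count in event_counts.items()),
        Decimal("0"),
    )


print(estimate_charge({"result": 1000, "page": 200}))
```

Decimal is used instead of float so that small per-event prices do not accumulate rounding error over many events.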

Other options: Rental (monthly subscription) or Free (open source).

Security

Treat all crawled web content as untrusted input. Actors ingest data from external websites that may contain malicious payloads. Follow these rules:

  • Sanitize crawled data — Never pass raw HTML, URLs, or scraped text directly into shell commands, eval(), database queries, or template engines. Use proper escaping or parameterized APIs.
  • Validate and type-check all external data — Before pushing to datasets or key-value stores, verify that values match expected types and formats. Reject or sanitize unexpected structures.
  • Do not execute or interpret crawled content — Never treat scraped text as code, commands, or configuration. Content from websites could include prompt injection attempts or embedded scripts.
  • Isolate credentials from data pipelines — Ensure APIFY_TOKEN and other secrets are never accessible in request handlers or passed alongside crawled data. Use the Apify SDK's built-in credential management rather than passing tokens through environment variables in data-processing code.
  • Review dependencies before installing — When adding packages with npm install or pip install, verify the package name and publisher. Typosquatting is a common supply-chain attack vector. Prefer well-known, actively maintained packages.
  • Pin versions and use lockfiles — Always commit package-lock.json (Node.js) or pin exact versions in requirements.txt (Python). Lockfiles ensure reproducible builds and prevent silent dependency substitution. Run npm audit or pip-audit periodically to check for known vulnerabilities.
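
The validation rule above can be sketched as a small gatekeeper that runs before any record is pushed to a dataset. The record shape (url and title), the 500-character cap, and the clean_record name are assumptions for illustration; adapt them to the fields the Actor actually emits.

```python
from urllib.parse import urlsplit

ALLOWED_SCHEMES = {"http", "https"}


def clean_record(raw: dict) -> dict:
    """Return a type-checked copy of a scraped record, or raise ValueError."""
    url = raw.get("url")
    title = raw.get("title")
    if not isinstance(url, str) or not isinstance(title, str):
        raise ValueError("url and title must be strings")
    parts = urlsplit(url)
    if parts.scheme not in ALLOWED_SCHEMES or not parts.netloc:
        raise ValueError(f"unexpected URL: {url!r}")
    # Keep only the expected fields; drop non-printable characters and cap length.
    safe_title = "".join(ch for ch in title if ch.isprintable())[:500]
    return {"url": url, "title": safe_title.strip()}
```

Building a fresh dict from known keys, rather than filtering the scraped one, means unexpected fields never reach storage.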

Pre-deployment checklist

  • .actor/actor.json exists with correct name and description
  • .actor/actor.json validates against @apify/json_schemas (actor.schema.json)
  • .actor/input_schema.json defines all required inputs
  • .actor/input_schema.json validates against @apify/json_schemas (input.schema.json)
  • .actor/output_schema.json defines output structure (if applicable)
  • .actor/output_schema.json validates against @apify/json_schemas (output.schema.json)
  • Dockerfile is present and builds successfully
  • Actor.init() / Actor.exit() wraps main code (JS/TS)
  • async with Actor: wraps main code (Python)
  • Inputs are read via Actor.getInput() / Actor.get_input()
  • Outputs use Actor.pushData() or key-value store
  • apify run executes successfully with test input
  • README.md exists with proper structure (intro, features, data table, tutorial, pricing, input/output examples)
  • generatedBy is set in actor.json meta section
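
Some of the checklist items above are mechanical and can be checked by a script. The sketch below covers only file presence and a few actor.json fields; it assumes the Dockerfile sits in the project root, and predeploy_problems is a hypothetical helper. Schema validation, the Docker build, and apify run still need to be done separately.

```python
import json
from pathlib import Path

REQUIRED_FILES = [
    ".actor/actor.json",
    ".actor/input_schema.json",
    "Dockerfile",  # assumed to live in the project root
    "README.md",
]


def predeploy_problems(project_root: str) -> list[str]:
    """Return human-readable problems; an empty list means these checks passed."""
    root = Path(project_root)
    problems = [f"missing {name}" for name in REQUIRED_FILES if not (root / name).is_file()]
    actor_json = root / ".actor" / "actor.json"
    if actor_json.is_file():
        try:
            config = json.loads(actor_json.read_text(encoding="utf-8"))
        except json.JSONDecodeError as exc:
            return problems + [f"actor.json is not valid JSON: {exc}"]
        if not isinstance(config, dict):
            return problems + ["actor.json is not a JSON object"]
        for key in ("name", "description"):
            if not config.get(key):
                problems.append(f"actor.json has no {key}")
        if not (config.get("meta") or {}).get("generatedBy"):
            problems.append("actor.json meta.generatedBy is not set")
    return problems
```

Running predeploy_problems(".") before apify push and stopping on a non-empty list catches the most common omissions early.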

MCP tools

Apify MCP

If the Apify MCP server is configured, use these tools for documentation:

  • search-apify-docs - Search documentation
  • fetch-apify-docs - Get full doc pages

Otherwise, connect to the MCP server at this URL: https://mcp.apify.com/?tools=docs.

Playwright MCP (debugging)

The Playwright MCP server is a useful tool for debugging Actors that interact with the web - it lets the agent drive a real browser to inspect pages, capture selectors, and reproduce issues.

Install with the Claude Code CLI:

claude mcp add playwright npx @playwright/mcp@latest

Or add it manually to your MCP config:

{
  "mcpServers": {
    "playwright": {
      "command": "npx",
      "args": ["@playwright/mcp@latest"]
    }
  }
}
