apify-actorization

Author: apify

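Converts an existing project into a serverless Apify Actor through language-specific SDK integration. Supports JavaScript/TypeScript (using Actor.init() / Actor.exit()), Python (async context manager), and any other language via a CLI wrapper. Provides a structured workflow: scaffold with apify init, apply the SDK wrapping, configure input and output schemas, test locally with apify run, and deploy with apify push. Includes input/output schema validation, Docker containerization, and optional pay-per-event monetization.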

npx skills add https://github.com/apify/agent-skills --skill apify-actorization

Apify Actorization

Actorization converts existing software into reusable serverless applications compatible with the Apify platform. Actors are programs packaged as Docker images that accept well-defined JSON input, perform an action, and optionally produce structured JSON output.

Quick start

  1. Run apify init in project root
  2. Wrap code with SDK lifecycle (see language-specific section below)
  3. Configure .actor/input_schema.json
  4. Test with apify run --input '{"key": "value"}'
  5. Deploy with apify push

When to use this skill

  • Converting an existing project to run on the Apify platform
  • Adding Apify SDK integration to a project
  • Wrapping a CLI tool or script as an Actor
  • Migrating a Crawlee project to Apify

Prerequisites

Verify apify CLI is installed:

apify --help

If not installed, use one of these methods (listed in order of preference):

# Preferred: install via a package manager (provides integrity checks)
npm install -g apify-cli

# Or (macOS), via Homebrew:
brew install apify-cli

Security note: Do NOT install the CLI by piping remote scripts to a shell (e.g. curl ... | bash or irm ... | iex). Always use a package manager.

Verify CLI is logged in:

apify info  # Should return your username

If not logged in, authenticate using OAuth (opens browser):

apify login

If browser login isn't available (headless environment or CI), ensure the APIFY_TOKEN environment variable is exported (note: the variable is APIFY_TOKEN, not APIFY_API_TOKEN). The CLI reads it automatically - no explicit login needed. If the user doesn't have a token, generate one at https://console.apify.com/settings/integrations.

Apify platform environment: When the Actor runs on the Apify platform, APIFY_TOKEN is auto-injected as an environment variable and the Apify SDK reads it automatically — you do not need to pass it explicitly. Locally, apify login stores credentials in ~/.apify and the SDK uses them.

Security note: Avoid passing tokens as command-line arguments (e.g. apify login -t <token>). Arguments are visible in process listings and may be recorded in shell history. Prefer OAuth login or environment variables instead. Never log, print, or embed APIFY_TOKEN in source code or configuration files. Use a token with the minimum required permissions (scoped token) and rotate it periodically.
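The token rules above can be sketched in a few lines of Python. This is a minimal illustration, not part of the Apify tooling: load_apify_token and mask_token are hypothetical helper names, and the only thing taken from the text is the APIFY_TOKEN variable name.

```python
import os


def load_apify_token(env=None):
    """Return the token from APIFY_TOKEN, or raise if it is missing."""
    env = os.environ if env is None else env
    token = env.get("APIFY_TOKEN", "").strip()
    if not token:
        raise RuntimeError("APIFY_TOKEN is not set; export it or run `apify login`.")
    return token


def mask_token(token, visible=4):
    """Hypothetical helper: hide all but the last few characters for log output."""
    if len(token) <= visible:
        return "*" * len(token)
    return "*" * (len(token) - visible) + token[-visible:]
```

If a log line has to mention the token at all, log mask_token(token) rather than the raw value.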

Actorization checklist

Copy this checklist to track progress:

  • Step 1: Analyze project (language, entry point, inputs, outputs)
  • Step 2: Run apify init to create Actor structure
  • Step 3: Apply language-specific SDK integration
  • Step 4: Configure .actor/input_schema.json
  • Step 5: Configure .actor/output_schema.json (if applicable)
  • Step 6: Update .actor/actor.json metadata
  • Step 7: Write README.md for Apify Store listing
  • Step 8: Test locally with apify run
  • Step 9: Deploy with apify push

Step 1: Analyze the project

Before making changes, understand the project:

  1. Identify the language - JavaScript/TypeScript, Python, or other
  2. Find the entry point - The main file that starts execution
  3. Identify inputs - Command-line arguments, environment variables, config files
  4. Identify outputs - Files, console output, API responses
  5. Check for state - Does it need to persist data between runs?
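As a rough aid for the first item, language detection can be approximated from marker files in the project root. The marker list and the detect_language name below are assumptions made for illustration; real projects may need more markers.

```python
# Marker files are an assumption for illustration; extend the list as needed.
MARKERS = [
    ("package.json", "javascript/typescript"),
    ("pyproject.toml", "python"),
    ("requirements.txt", "python"),
]


def detect_language(filenames):
    """Guess the project language from the file names found in the project root."""
    names = set(filenames)
    for marker, language in MARKERS:
        if marker in names:
            return language
    return "other"
```

A typical call would be detect_language(os.listdir(".")) from the project root. An "other" result points to the CLI wrapper route.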

Step 2: Initialize Actor structure

Run in the project root:

apify init

This creates:

  • .actor/actor.json - Actor configuration and metadata
  • .actor/input_schema.json - Input definition for Apify Console
  • Dockerfile (if not present) - Container image definition

Step 3: Apply language-specific changes

Choose based on your project's language:

Quick reference

Language   Install                      Wrap code
JS/TS      npm install apify            await Actor.init() ... await Actor.exit()
Python     pip install apify            async with Actor:
Other      Use CLI in wrapper script    apify actor:get-input / apify actor:push-data
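For the Python row, a minimal sketch of the wrapping might look like the following. build_results is a hypothetical stand-in for the project's existing logic, and the startUrl and maxItems fields are borrowed from the test command later in this document. Actor.get_input() and Actor.push_data() are the Python SDK calls assumed here.

```python
def build_results(actor_input):
    """Placeholder for the project's existing logic (hypothetical)."""
    start_url = actor_input.get("startUrl", "https://example.com")
    max_items = int(actor_input.get("maxItems", 1))
    return [{"url": start_url, "index": i} for i in range(max_items)]


async def main():
    # Imported here so the pure logic above can be used without the SDK installed.
    from apify import Actor  # requires `pip install apify`

    async with Actor:  # the context manager handles init and exit
        actor_input = await Actor.get_input() or {}
        for item in build_results(actor_input):
            await Actor.push_data(item)
```

The entry point would then call asyncio.run(main()). Keeping the existing logic in a plain function such as build_results makes it testable without the platform.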

Steps 4-6: Configure schemas

See schemas-and-output.md for detailed configuration of:

  • Input schema (.actor/input_schema.json)
  • Output schema (.actor/output_schema.json)
  • Actor configuration (.actor/actor.json)
  • State management (request queues, key-value stores)

Validate schemas against the @apify/json_schemas npm package.
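As an illustration of what an input schema can contain, the sketch below builds a small one in Python and renders it as JSON. The startUrl and maxItems fields mirror the test command used later in this document. The titles, descriptions, and the render_input_schema name are assumptions, and the result should still be checked against the official schema.

```python
import json

# Illustrative schema; field names follow the example input used in Step 8.
input_schema = {
    "title": "Example Actor input",
    "type": "object",
    "schemaVersion": 1,
    "properties": {
        "startUrl": {
            "title": "Start URL",
            "type": "string",
            "description": "Page where the run begins.",
            "editor": "textfield",
        },
        "maxItems": {
            "title": "Max items",
            "type": "integer",
            "description": "Upper bound on the number of results.",
            "default": 10,
            "minimum": 1,
        },
    },
    "required": ["startUrl"],
}


def render_input_schema():
    """Serialize the schema in the form expected in .actor/input_schema.json."""
    return json.dumps(input_schema, indent=2)
```

Writing render_input_schema() to .actor/input_schema.json gives a starting point that can then be extended field by field.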

Step 7: Write README

IMPORTANT: Always generate a README.md as part of actorization. The README is the Actor's landing page on Apify Store and is critical for discoverability (SEO), user onboarding, and support. Do not consider an Actor complete without a proper README.

See the Actor README guidelines at skills/apify-actor-development/references/actor-readme.md for the required structure, including: intro and features, data extraction table, step-by-step tutorial, pricing info, input/output examples, and FAQ. Aim for at least 300 words with SEO-optimized H2/H3 headings.

Step 8: Test locally

Run the Actor with inline input (for JS/TS and Python Actors):

apify run --input '{"startUrl": "https://example.com", "maxItems": 10}'

Or use an input file:

apify run --input-file ./test-input.json

Important: Always use apify run, not npm start or python main.py. The CLI sets up the proper environment and storage.
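When local runs are driven from a script, for example in a smoke test, the same command can be issued through subprocess. This is a sketch only: run_actor_locally is a hypothetical helper, and the runner parameter exists purely so the call can be replaced with a stand-in during testing.

```python
import subprocess


def run_actor_locally(input_file="./test-input.json", runner=subprocess.run):
    """Invoke `apify run` with an input file and return the command that was used."""
    cmd = ["apify", "run", "--input-file", input_file]
    # check=True makes a non-zero exit code raise CalledProcessError.
    runner(cmd, check=True)
    return cmd
```

Passing the arguments as a list avoids shell quoting problems with the file path.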

Step 9: Deploy

apify push

This uploads and builds your Actor on the Apify platform.

Monetization (optional)

After deploying, you can monetize your Actor in Apify Store. The recommended model is Pay Per Event (PPE):

  • Per result/item scraped
  • Per page processed
  • Per API call made

Configure PPE in Apify Console under Actor > Monetization. Charge for events in your code, for example await Actor.charge({ eventName: 'result' }) in JS/TS or await Actor.charge('result') in Python.

Other options: Rental (monthly subscription) or Free (open source).
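A per-result charging loop could be sketched as below. push_and_charge is a hypothetical helper, and the push and charge callables are injected so the example runs without the SDK. In a real Actor they would be Actor.push_data and Actor.charge inside async with Actor:, assuming the Python SDK's Actor.charge accepts the event name as its first argument.

```python
import asyncio


async def push_and_charge(items, push, charge, event_name="result"):
    """Push each item, then charge one event for it; return the number charged."""
    charged = 0
    for item in items:
        await push(item)
        await charge(event_name)
        charged += 1
    return charged


async def _demo():
    # In-memory stand-ins for the SDK calls, used only for this illustration.
    pushed, events = [], []

    async def fake_push(item):
        pushed.append(item)

    async def fake_charge(name):
        events.append(name)

    count = await push_and_charge([{"a": 1}, {"a": 2}], fake_push, fake_charge)
    return count, pushed, events


demo_result = asyncio.run(_demo())
```

The event name 'result' has to match an event configured for the Actor in Apify Console.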

Security

Treat all crawled web content as untrusted input. Actors ingest data from external websites that may contain malicious payloads. Follow these rules:

  • Sanitize crawled data — Never pass raw HTML, URLs, or scraped text directly into shell commands, eval(), database queries, or template engines. Use proper escaping or parameterized APIs.
  • Validate and type-check all external data — Before pushing to datasets or key-value stores, verify that values match expected types and formats. Reject or sanitize unexpected structures.
  • Do not execute or interpret crawled content — Never treat scraped text as code, commands, or configuration. Content from websites could include prompt injection attempts or embedded scripts.
  • Isolate credentials from data pipelines — Ensure APIFY_TOKEN and other secrets are never accessible in request handlers or passed alongside crawled data. Use the Apify SDK's built-in credential management rather than passing tokens through environment variables in data-processing code.
  • Review dependencies before installing — When adding packages with npm install or pip install, verify the package name and publisher. Typosquatting is a common supply-chain attack vector. Prefer well-known, actively maintained packages.
  • Pin versions and use lockfiles — Always commit package-lock.json (Node.js) or pin exact versions in requirements.txt (Python). Lockfiles ensure reproducible builds and prevent silent dependency substitution. Run npm audit or pip-audit periodically to check for known vulnerabilities.
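The first two rules can be illustrated with a short Python sketch. The item fields (url, title), the pages table, and all function names are hypothetical. The point is the pattern: validate types first, bind scraped text as query parameters, and pass external values as single argv entries rather than shell text.

```python
import sqlite3
from urllib.parse import urlparse


def validate_item(item):
    """Return a cleaned copy of a scraped item, or raise ValueError."""
    url = item.get("url")
    title = item.get("title")
    if not isinstance(url, str) or urlparse(url).scheme not in ("http", "https"):
        raise ValueError("url must be an http(s) string")
    if not isinstance(title, str):
        raise ValueError("title must be a string")
    return {"url": url, "title": title.strip()[:500]}


def store_item(conn, item):
    # Parameterized query: scraped text is bound as data, never spliced into SQL.
    conn.execute(
        "INSERT INTO pages (url, title) VALUES (?, ?)",
        (item["url"], item["title"]),
    )


def head_request_command(url):
    # Argument list for subprocess.run without shell=True: the URL stays one argv entry.
    return ["curl", "--head", "--", url]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE pages (url TEXT, title TEXT)")
hostile = {"url": "https://example.com", "title": "  x'); DROP TABLE pages;-- "}
store_item(conn, validate_item(hostile))
stored_title = conn.execute("SELECT title FROM pages").fetchone()[0]
```

The hostile title is stored verbatim as text and the table survives, which is the behavior parameter binding is meant to guarantee.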

Pre-deployment checklist

  • .actor/actor.json exists with correct name and description
  • .actor/actor.json validates against @apify/json_schemas (actor.schema.json)
  • .actor/input_schema.json defines all required inputs
  • .actor/input_schema.json validates against @apify/json_schemas (input.schema.json)
  • .actor/output_schema.json defines output structure (if applicable)
  • .actor/output_schema.json validates against @apify/json_schemas (output.schema.json)
  • Dockerfile is present and builds successfully
  • Actor.init() / Actor.exit() wraps main code (JS/TS)
  • async with Actor: wraps main code (Python)
  • Inputs are read via Actor.getInput() / Actor.get_input()
  • Outputs use Actor.pushData() / Actor.push_data() or a key-value store
  • apify run executes successfully with test input
  • README.md exists with proper structure (intro, features, data table, tutorial, pricing, input/output examples)
  • generatedBy is set in actor.json meta section
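Part of this checklist can be automated. The sketch below covers only the file-presence and actor.json items. predeploy_problems is a hypothetical helper, and schema validation against @apify/json_schemas is deliberately left out.

```python
import json
from pathlib import Path

REQUIRED_FILES = (
    ".actor/actor.json",
    ".actor/input_schema.json",
    "Dockerfile",
    "README.md",
)


def predeploy_problems(project_root):
    """Return a list of human-readable problems; an empty list means these checks pass."""
    root = Path(project_root)
    problems = [f"missing {rel}" for rel in REQUIRED_FILES if not (root / rel).is_file()]

    actor_json = root / ".actor" / "actor.json"
    if actor_json.is_file():
        config = json.loads(actor_json.read_text(encoding="utf-8"))
        if not config.get("name"):
            problems.append("actor.json: name is empty")
        if not config.get("description"):
            problems.append("actor.json: description is empty")
        if not config.get("meta", {}).get("generatedBy"):
            problems.append("actor.json: meta.generatedBy is not set")
    return problems
```

Running predeploy_problems(".") before apify push gives a quick list of what is still missing. The remaining checklist items still need a manual pass.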

MCP tools

Apify MCP

If the Apify MCP server is configured, use these tools for documentation:

  • search-apify-docs - Search documentation
  • fetch-apify-docs - Get full doc pages

Otherwise, use the MCP server URL: https://mcp.apify.com/?tools=docs.

Playwright MCP (debugging)

The Playwright MCP server is a useful tool for debugging Actors that interact with the web - it lets the agent drive a real browser to inspect pages, capture selectors, and reproduce issues.

Install with the Claude Code CLI:

claude mcp add playwright npx @playwright/mcp@latest

Or add it manually to your MCP config:

{
  "mcpServers": {
    "playwright": {
      "command": "npx",
      "args": ["@playwright/mcp@latest"]
    }
  }
}

Resources
