creating-experiments

by posthog

Guides agents through the 3-step experiment creation flow: defining the hypothesis, configuring rollout, and setting up analytics. Delegates rollout decisions…

npx skills add https://github.com/posthog/ai-plugin --skill creating-experiments

Creating experiments

This skill walks through the 3-step flow for creating a new A/B test experiment.

Core principle: draft first, iterate on details

Create the experiment as a draft quickly, then iterate on metrics and configuration. The user gets a tangible draft immediately and can refine it.

The 3-step creation flow

Step 1: What are we testing?

Gather these before calling experiment-create:

  • Experiment name — descriptive, inferred from context when possible
  • Hypothesis — what you expect to happen (goes in description)
  • Feature flag key — kebab-case. Ask if they want a new flag or to reuse an existing one. The flag is auto-created — do NOT create one separately.
  • Type — leave empty; it defaults internally to "product". The "web" value is reserved for no-code experiments configured visually with the PostHog toolbar in a browser and cannot be meaningfully driven via MCP. If a user asks for a no-code/toolbar experiment, point them to the PostHog UI instead of creating one here.

If the user gives enough context to infer these, don't ask — just proceed.
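When the user supplies only an experiment name, a flag key can usually be inferred from it. The helper below is a minimal sketch of that inference; the function name to_flag_key is hypothetical, and the normalization rule (lowercase ASCII letters and digits joined by single hyphens) is an assumption about what counts as kebab-case here, not a documented PostHog constraint.

```python
import re


def to_flag_key(experiment_name: str) -> str:
    """Derive a kebab-case feature flag key from a free-form experiment name."""
    # Lowercase, then collapse every run of non-alphanumeric characters into one hyphen.
    key = re.sub(r"[^a-z0-9]+", "-", experiment_name.lower())
    # Trim hyphens left over from leading or trailing punctuation.
    return key.strip("-")


print(to_flag_key("Checkout Button Color Test"))  # checkout-button-color-test
```

If the derived key might collide with an existing flag, it is still worth confirming with the user whether they want to reuse that flag or pick a new key.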

Step 2: Who sees what variant?

This is about rollout configuration.

Before asking any rollout question, load configuring-experiment-rollout. The disambiguation wording, recommendations, and post-answer branches live there — do not formulate rollout questions yourself, and do not assume an example you remember covers the user's path.

Key decision points (covered in detail by configuring-experiment-rollout):

  • Variant split (how many variants, what percentage each)
  • Overall rollout percentage (what % of all users enter the experiment)
  • Whether to persist the flag across authentication steps

If the user doesn't mention rollout specifics, use defaults: 50/50 control/test, 100% rollout.
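For experiments with more than two variants, an even split can be computed rather than typed by hand. This is a sketch under assumptions: the helper even_split is hypothetical, it assumes split_percent values are whole numbers (the source only shows integers), and it derives each display name from the key purely for illustration.

```python
def even_split(variant_keys: list[str]) -> list[dict]:
    """Build feature_flag_variants entries with an even integer split."""
    if not 2 <= len(variant_keys) <= 20:
        raise ValueError("an experiment needs between 2 and 20 variants")
    if variant_keys[0] != "control":
        raise ValueError('the first variant key must be "control"')
    base, remainder = divmod(100, len(variant_keys))
    variants = []
    for index, key in enumerate(variant_keys):
        # Hand the leftover points to the earliest variants so the total is exactly 100.
        percent = base + (1 if index < remainder else 0)
        # Assumption: the display name is just the capitalized key.
        variants.append({"key": key, "name": key.capitalize(), "split_percent": percent})
    return variants


# The defaults described above: 50/50 control/test at 100% rollout.
parameters = {
    "feature_flag_variants": even_split(["control", "test"]),
    "rollout_percentage": 100,
}
print(parameters)
```

With three keys the sketch yields 34/33/33, which keeps the total at 100 while staying as close to even as integers allow.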

Step 3: How to measure impact?

This is about analytics and metrics. Load the configuring-experiment-analytics skill for guidance.

Do NOT configure metrics on creation. Metrics are not passed to experiment-create — they are added afterwards via experiment-update. This keeps the creation call lightweight.

When the user specifies metrics upfront, acknowledge them and add them immediately after creation. When they don't, create the draft and then guide them through metric setup as a follow-up.
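The ordering described above (create the draft first, attach metrics second) can be sketched as follows. Everything about the transport is an assumption: call_tool stands in for whatever mechanism invokes MCP tools, and the experiment-update argument shape ("id" and "metrics") and the "id" field in the response are hypothetical placeholders, since the real schema is defined by the tool and by configuring-experiment-analytics.

```python
from typing import Any, Callable

# Hypothetical signature: (tool_name, arguments) -> response dict.
ToolCaller = Callable[[str, dict], dict]


def create_then_add_metrics(
    call_tool: ToolCaller, create_args: dict, metrics: list[dict[str, Any]]
) -> dict:
    """Create the draft without metrics, then attach metrics in a second call."""
    # Step 1: the creation payload carries no metrics at all.
    created = call_tool("experiment-create", create_args)
    # Step 2: only call experiment-update when the user actually supplied metrics.
    if metrics:
        # Hypothetical argument shape; check the real experiment-update schema.
        call_tool("experiment-update", {"id": created["id"], "metrics": metrics})
    return created


# Usage with a recording stub instead of a real MCP connection.
calls = []


def fake_call_tool(name: str, args: dict) -> dict:
    calls.append(name)
    return {"id": 1, "_posthogUrl": "https://example.invalid/experiments/1"}


create_then_add_metrics(
    fake_call_tool,
    {"name": "Demo", "feature_flag_key": "demo-flag"},
    [{"placeholder": "metric definition goes here"}],
)
print(calls)  # ['experiment-create', 'experiment-update']
```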

How to create

Call experiment-create with:

{
  "name": "Descriptive experiment name",
  "feature_flag_key": "kebab-case-key",
  "description": "Hypothesis: [what you expect to happen]",
  "parameters": {
    "feature_flag_variants": [
      { "key": "control", "name": "Control", "split_percent": 50 },
      { "key": "test", "name": "Test", "split_percent": 50 }
    ],
    "rollout_percentage": 100
  }
}

Two different percentages — do NOT mix them up:

  • feature_flag_variants[].split_percent — how users inside the experiment are split across variants (must sum to 100, recommended to have an even split).
  • parameters.rollout_percentage — what fraction of all users enter the experiment at all (0-100, defaults to 100).
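A quick worked example shows how the two percentages combine. Assuming the definitions above, a variant's share of all users is the rollout percentage multiplied by that variant's split; the helper name share_of_all_users is made up for this sketch.

```python
def share_of_all_users(rollout_percentage: float, split_percent: float) -> float:
    """Percentage of ALL users who end up in one variant."""
    # rollout_percentage gates entry into the experiment;
    # split_percent divides only the users who got in.
    return rollout_percentage * split_percent / 100


# 20% rollout with a 50/50 split: each variant sees 10% of all users,
# and the remaining 80% never enter the experiment.
for key, split in [("control", 50), ("test", 50)]:
    print(key, share_of_all_users(20, split))
```

Lowering rollout_percentage therefore shrinks the experiment's audience without changing the balance between variants, while changing split_percent rebalances variants without changing how many users enter.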

Key details:

  • First variant must have key "control". Minimum 2, maximum 20 variants.
  • rollout_percentage defaults to 100 if omitted.
  • Stats default to Bayesian. Only set stats_config if the user requests Frequentist.
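The constraints listed above can be checked locally before calling experiment-create. The validator below is a sketch, not part of any PostHog SDK: validate_create_payload is a hypothetical helper, and it assumes the payload spells out feature_flag_variants explicitly, as in the example call.

```python
def validate_create_payload(payload: dict) -> list[str]:
    """Return a list of problems; an empty list means the checks passed."""
    problems = []
    params = payload.get("parameters", {})
    variants = params.get("feature_flag_variants", [])

    if not 2 <= len(variants) <= 20:
        problems.append("variant count must be between 2 and 20")
    if variants and variants[0].get("key") != "control":
        problems.append('first variant must have key "control"')
    if sum(v.get("split_percent", 0) for v in variants) != 100:
        problems.append("split_percent values must sum to 100")

    # rollout_percentage may be omitted, in which case it defaults to 100.
    rollout = params.get("rollout_percentage", 100)
    if not 0 <= rollout <= 100:
        problems.append("rollout_percentage must be between 0 and 100")
    return problems


payload = {
    "name": "Descriptive experiment name",
    "feature_flag_key": "kebab-case-key",
    "parameters": {
        "feature_flag_variants": [
            {"key": "control", "name": "Control", "split_percent": 50},
            {"key": "test", "name": "Test", "split_percent": 50},
        ],
    },
}
print(validate_create_payload(payload))  # []
```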

After creation

  1. Always show the experiment URL. The experiment-create response includes _posthogUrl — always display this link so the user can view and configure the experiment in the UI.

  2. Remind the user to implement the feature flag in code. Link to the experiment page and say "implement the flag as shown here" — the experiment detail page shows implementation snippets for the user's SDK.

  3. Guide through metrics if not yet configured — load the configuring-experiment-analytics skill.

  4. Launch when ready — use the experiment-launch tool.
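The follow-up steps above can be folded into a single message to the user. This is an illustrative sketch: post_creation_message is a hypothetical helper, the wording of each line is only a suggestion, and the URL in the usage example is a placeholder rather than a real PostHog link. The only field taken from the source is _posthogUrl.

```python
def post_creation_message(response: dict, metrics_configured: bool) -> str:
    """Compose the follow-up shown to the user after experiment-create returns."""
    url = response["_posthogUrl"]
    lines = [
        f"Experiment draft created: {url}",
        f"Implement the feature flag in your code as shown here: {url}",
    ]
    if not metrics_configured:
        # Metric setup is handled via configuring-experiment-analytics.
        lines.append("Next step: configure metrics for this experiment.")
    lines.append("When everything is ready, launch with experiment-launch.")
    return "\n".join(lines)


# Placeholder response; a real one comes back from experiment-create.
print(post_creation_message({"_posthogUrl": "https://example.invalid/experiments/1"}, False))
```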
