Autonomous Optimization
Continuous improvement loop that detects metric changes, generates hypotheses, runs experiments, and auto-implements winners
npx gtm-skills add drill/autonomous-optimization
What this drill teaches
This is the drill that makes the Durable level fundamentally different from Scalable. Instead of running a play and measuring results, this drill creates an always-on agent loop that:
- Monitors — detects when metrics plateau, drop, or spike
- Diagnoses — generates hypotheses for what to change
- Experiments — designs and runs A/B tests on the top hypothesis
- Decides — evaluates results and auto-implements winners
- Reports — generates weekly executive summaries of what changed and why
The goal is to find the local maximum of each play — the best possible performance given the current market, audience, and competitive landscape — and maintain it as conditions change.
Input
- A play that has been running at Scalable level for at least 4 weeks (baseline data required)
- PostHog tracking configured with the play's core events
- n8n instance for scheduling the optimization loop
- Anthropic API key for Claude (hypothesis generation + evaluation)
The Optimization Loop
Phase 1: Monitor (runs daily via n8n cron)
Build an n8n workflow triggered by a daily cron schedule:
- Use `posthog-anomaly-detection` to check the play's primary KPIs
- Compare last 2 weeks against 4-week rolling average
- Classify: normal (within ±10%), plateau (±2% for 3+ weeks), drop (>20% decline), spike (>50% increase)
- If normal → log to Attio, no action needed
- If anomaly detected → trigger Phase 2
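The classification rule above can be sketched as a small function. This is a minimal illustration of the stated thresholds, not the `posthog-anomaly-detection` skill itself; changes between ±10% and the drop/spike triggers are treated as normal (no action), which the drill leaves implicit.

```python
def classify_kpi(recent_avg: float, baseline_avg: float, flat_weeks: int) -> str:
    """Classify a KPI per the drill's thresholds.

    recent_avg   -- mean of the last 2 weeks
    baseline_avg -- 4-week rolling average
    flat_weeks   -- consecutive weeks the metric stayed within +/-2%
    """
    change = (recent_avg - baseline_avg) / baseline_avg
    if change > 0.50:
        return "spike"      # >50% increase
    if change < -0.20:
        return "drop"       # >20% decline
    if abs(change) <= 0.02 and flat_weeks >= 3:
        return "plateau"    # +/-2% for 3+ weeks
    return "normal"         # within +/-10%, or in the untriggered gap
```

A 35% week-over-week decline classifies as a drop and triggers Phase 2; a metric flat for four straight weeks classifies as a plateau even though it is also "normal" by the ±10% test.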
Phase 2: Diagnose (triggered by anomaly detection)
- Gather context: pull the play's current configuration from Attio (targeting, messaging, cadence, channel mix)
- Pull 8-week metric history from PostHog using `posthog-dashboards`
- Run `hypothesis-generation` with the anomaly data + context
- Receive 3 ranked hypotheses with expected impact and risk levels
- Store hypotheses in Attio as notes on the play's campaign record
- If the top hypothesis has risk = "high" → send Slack alert for human review and STOP
- If risk = "low" or "medium" → proceed to Phase 3
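The risk gate at the end of Phase 2 reduces to a simple routing decision. A minimal sketch, assuming each hypothesis is stored as a dict with a `risk` field ("low" | "medium" | "high"):

```python
def route_top_hypothesis(hypotheses: list[dict]) -> str:
    """Gate the optimization loop on the top-ranked hypothesis's risk level."""
    top = hypotheses[0]  # hypotheses arrive ranked, best first
    if top["risk"] == "high":
        return "alert_and_stop"   # Slack alert for human review; loop halts
    return "experiment"           # low/medium risk proceeds to Phase 3
```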
Phase 3: Experiment (triggered by hypothesis acceptance)
- Take the top-ranked hypothesis
- Design the experiment: use `posthog-experiments` to create a feature flag that splits traffic between control (current) and variant (hypothesis change)
- Implement the variant using the appropriate fundamental (e.g., if the hypothesis is "change email subject line," use `loops-sequences` or `instantly-campaign` to create the B variant)
- Set the experiment duration: minimum 7 days or until 100+ samples per variant, whichever is longer
- Log the experiment start in Attio with: hypothesis, start date, expected duration, success criteria
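The "whichever is longer" duration rule means both floors must be met before the experiment can end. A minimal completion check (sample counts would come from PostHog in practice):

```python
def experiment_done(days_running: int, samples_per_variant: dict) -> bool:
    """End only when BOTH floors are met: >=7 days AND 100+ samples
    in every variant."""
    return days_running >= 7 and all(n >= 100 for n in samples_per_variant.values())
```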
Phase 4: Evaluate (triggered by experiment completion)
- Pull experiment results from PostHog
- Run `experiment-evaluation` with control vs variant data
- Decision:
- Adopt: Update the live configuration to use the winning variant. Log the change. Move to Phase 5.
- Iterate: Generate a new hypothesis building on this result. Return to Phase 2.
- Revert: Disable the variant, restore control. Log the failure. Return to Phase 1 monitoring.
- Extend: Keep the experiment running for another period. Set a reminder.
- Store the full evaluation (decision, confidence, reasoning) in Attio
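The four-way decision can be sketched as a function. The drill does not prescribe statistical criteria, so the significance cutoff (p < 0.05) and the mapping below are illustrative assumptions, not the `experiment-evaluation` skill's actual logic:

```python
def decide(p_value: float, lift: float, reached_min_samples: bool) -> str:
    """Map experiment results to one of the four Phase 4 decisions.
    The 0.05 cutoff is an assumption for illustration only."""
    if not reached_min_samples:
        return "extend"    # underpowered: keep the experiment running
    if p_value < 0.05:
        # Significant result: adopt a winning variant, revert a losing one
        return "adopt" if lift > 0 else "revert"
    return "iterate"       # inconclusive at full duration: new hypothesis
```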
Phase 5: Report (runs weekly via n8n cron)
- Aggregate all optimization activity for the week: anomalies detected, hypotheses generated, experiments run, decisions made
- Calculate: net metric change from all adopted changes this week
- Generate a weekly optimization brief using Claude:
- What changed and why
- Net impact on primary KPIs
- Current distance from estimated local maximum
- Recommended focus for next week
- Post the brief to Slack and store in Attio
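The weekly aggregation can be sketched as follows. The event-log field names are illustrative (your Attio schema will differ), and the net change compounds adopted lifts multiplicatively on the assumption that changes stack:

```python
def weekly_summary(events: list[dict]) -> dict:
    """Aggregate one week of optimization activity for the brief.

    events -- log entries such as {"type": "anomaly"} or
              {"type": "decision", "decision": "adopt", "lift": 0.08}
    """
    adopted = [e for e in events if e.get("decision") == "adopt"]
    net = 1.0
    for e in adopted:
        net *= 1 + e["lift"]  # compound lifts of adopted changes
    return {
        "anomalies": sum(e["type"] == "anomaly" for e in events),
        "experiments": sum(e["type"] == "experiment" for e in events),
        "adopted": len(adopted),
        "net_change": round(net - 1, 4),
    }
```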
Guardrails (CRITICAL)
- Rate limit: Maximum 1 active experiment per play at a time. Never stack experiments.
- Revert threshold: If primary metric drops >30% at any point during an experiment, auto-revert immediately.
- Human approval required for:
- Budget changes >20%
- Audience/targeting changes that affect >50% of traffic
- Any change the hypothesis generator flags as "high risk"
- Cooldown: After a failed experiment (revert), wait 7 days before testing a new hypothesis on the same variable.
- Maximum experiments per month: 4 per play. If all 4 fail, pause optimization and flag for human strategic review.
- Never optimize what isn't measured: If a KPI doesn't have PostHog tracking, fix tracking first (use the `posthog-gtm-events` drill) before running experiments on it.
Output
- Continuous metric monitoring with anomaly alerts
- Automated hypothesis → experiment → evaluation → implementation cycle
- Weekly optimization briefs
- Audit trail of every change, why it was made, and what happened
When to Stop
The optimization loop runs indefinitely at Durable level. However, it should detect convergence — when successive experiments produce diminishing returns (<2% improvement for 3 consecutive experiments). At convergence:
- The play has reached its local maximum
- Reduce monitoring frequency from daily to weekly
- Report to the team: "This play is optimized. Current performance is [metrics]. Further gains require strategic changes (new channels, new audience, product changes) rather than tactical optimization."
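The convergence rule stated above is mechanical enough to encode directly. A minimal sketch, assuming each completed experiment's adopted lift is logged as a fraction (reverted or inconclusive experiments would log 0):

```python
def converged(recent_lifts: list[float]) -> bool:
    """Convergence: the last 3 experiments each improved the metric by <2%."""
    return len(recent_lifts) >= 3 and all(l < 0.02 for l in recent_lifts[-3:])
```

For example, lifts of 1%, 1.5%, and 0.5% in a row signal convergence and trigger the weekly-monitoring handoff, while a single 5% win in that window keeps the loop at daily cadence.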