Holdout Group Analysis

Run the holdout-group-setup drill to calculate holdout size (default 10%) and create the global-holdout PostHog feature flag with ensure_experience_continuity enabled.
Create PostHog cohorts for Holdout Group and Treatment Group.
Verify assignment stability: call the feature flag evaluation API 5 times for each of 10 test users; confirm that all 5 calls for a given user return the same group.
Query PostHog for 4 weeks of baseline data, split by holdout vs treatment: weekly active users, retention (Weeks 1, 2, and 4), and engagement frequency.
Verify baseline parity: no primary metric may differ by more than 5% between groups. If parity fails, dissolve the holdout and recreate it with a fresh flag key.
Store the baseline snapshot in Attio. Communicate the holdout policy to the team: all future experiments must include the holdout exclusion filter.
Run the threshold-engine drill to check four criteria: holdout flag active, cohort size within +/-2% of target, assignment stability at 100%, and baseline parity within 5%.
If PASS, proceed to Baseline. If FAIL, diagnose flag configuration or PostHog identification issues.
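
The stability, parity, and threshold checks above can be sketched in a few lines of Python. This is an illustration only, not the threshold-engine drill itself. All function names are hypothetical. The evaluate argument stands in for whatever call returns a user's group from the feature flag evaluation API. The sketch assumes that the baseline metrics are rates or per-user averages (so groups of different sizes are comparable), that "differs by more than 5%" means relative to the treatment value, and that "+/-2% of target" means relative to the target share.

```python
from typing import Callable, Dict, Iterable


def assignment_is_stable(evaluate: Callable[[str], str],
                         user_ids: Iterable[str],
                         calls_per_user: int = 5) -> bool:
    """Return True if each user gets the same group on every repeated call.

    `evaluate` is a hypothetical stand-in for the flag evaluation call.
    """
    for user_id in user_ids:
        groups = {evaluate(user_id) for _ in range(calls_per_user)}
        if len(groups) != 1:
            return False
    return True


def relative_diff(holdout: float, treatment: float) -> float:
    """Absolute difference between groups, relative to the treatment value."""
    if treatment == 0:
        return 0.0 if holdout == 0 else float("inf")
    return abs(holdout - treatment) / abs(treatment)


def parity_failures(holdout: Dict[str, float],
                    treatment: Dict[str, float],
                    tolerance: float = 0.05) -> Dict[str, float]:
    """Return the metrics whose relative difference exceeds the tolerance."""
    failures = {}
    for name, value in holdout.items():
        diff = relative_diff(value, treatment[name])
        if diff > tolerance:
            failures[name] = diff
    return failures


def gate(flag_active: bool,
         actual_share: float,
         target_share: float,
         stable: bool,
         failures: Dict[str, float],
         share_tolerance: float = 0.02) -> str:
    """Combine the four criteria into a PASS/FAIL result.

    Assumption: the cohort-size tolerance is relative to the target share,
    so a 10% target with a 2% tolerance allows 9.8% to 10.2%.
    """
    size_ok = abs(actual_share - target_share) <= share_tolerance * target_share
    passed = flag_active and size_ok and stable and not failures
    return "PASS" if passed else "FAIL"
```

In use, parity_failures would receive one dictionary of baseline metrics per group (for example Week 1 retention and engagement frequency), and its result would be passed to gate together with the flag status, the measured holdout share, and the stability result. A non-empty failures dictionary names the metrics to inspect before recreating the holdout.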