Analyze A/B Test Results and Generate Statistical Recommendations

Get a complete A/B test analysis with statistical significance, power analysis, sanity checks, and ship/no-ship recommendations.

📝 The Prompt

You are an expert data analyst and statistician specializing in experimentation and A/B testing. Analyze the following A/B test results and provide a thorough, actionable report.

**Experiment Details:**
- Experiment name: [EXPERIMENT_NAME, e.g., 'Checkout Button Color Change']
- Hypothesis: [HYPOTHESIS, e.g., 'Changing the checkout button from gray to green will increase conversion rate by at least 5%']
- Primary metric: [PRIMARY_METRIC, e.g., conversion rate, click-through rate, revenue per user]
- Secondary metrics: [SECONDARY_METRICS, e.g., bounce rate, average order value, time on page]
- Test duration: [DURATION, e.g., 14 days]
- Traffic split: [SPLIT, e.g., 50/50]

**Observed Data:**
- Control group: [CONTROL_SAMPLE_SIZE] users, [CONTROL_CONVERSIONS] conversions
- Variant group: [VARIANT_SAMPLE_SIZE] users, [VARIANT_CONVERSIONS] conversions
- Any secondary metric observations: [SECONDARY_METRIC_DATA]

**Please deliver the following analysis:**
1. **Statistical Significance Test**: Calculate the p-value, confidence interval (95%), and relative lift. State whether the result is statistically significant and explain what that means in plain language for non-technical stakeholders.
2. **Power Analysis**: Assess whether the sample size was sufficient to detect the hypothesized minimum detectable effect (MDE). If underpowered, calculate the required sample size and additional runtime needed.
3. **Sanity Checks**: List and evaluate at least 4 sanity checks (e.g., sample ratio mismatch, novelty effect, day-of-week bias, segment imbalance) and flag any concerns.
4. **Segment Analysis**: Suggest 5 meaningful segments to break down results by (e.g., device type, new vs. returning users, geography) and explain what to look for in each.
5. **Practical Significance Assessment**: Beyond statistical significance, evaluate whether the observed effect size is practically meaningful for the business. Consider [BUSINESS_CONTEXT, e.g., current revenue, traffic levels].
6. **Recommendation**: Provide a clear ship/don't ship/continue testing recommendation with supporting rationale.
7. **Executive Summary**: Write a 3-4 sentence summary suitable for sharing with leadership.

Format the output with clear sections, use tables for numerical comparisons, and highlight key numbers in bold.
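If you want to cross-check what the model reports for the power analysis and the sample ratio mismatch check, the arithmetic can be reproduced locally. The sketch below is one common approach, not the only one: it assumes a two-sided two-proportion test, treats the MDE as a relative lift over the baseline rate, and uses a chi-square goodness-of-fit test with one degree of freedom for the traffic split. The function names and the example numbers are hypothetical.

```python
import math
from statistics import NormalDist


def required_sample_size_per_arm(baseline_rate, relative_mde, alpha=0.05, power=0.80):
    """Approximate users needed per arm for a two-sided two-proportion test.

    Assumes the MDE is relative (0.05 means a 5% lift over the baseline rate).
    """
    p1 = baseline_rate
    p2 = baseline_rate * (1 + relative_mde)
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return math.ceil((z_alpha + z_beta) ** 2 * variance / (p2 - p1) ** 2)


def additional_days(required_per_arm, current_per_arm, daily_users_per_arm):
    """Extra runtime in days, assuming traffic stays at its current daily level."""
    shortfall = max(0, required_per_arm - current_per_arm)
    return math.ceil(shortfall / daily_users_per_arm)


def srm_p_value(control_n, variant_n, expected_control_share=0.5):
    """P-value of a chi-square goodness-of-fit test (1 df) on the traffic split."""
    total = control_n + variant_n
    expected_control = total * expected_control_share
    expected_variant = total - expected_control
    chi2 = ((control_n - expected_control) ** 2 / expected_control
            + (variant_n - expected_variant) ** 2 / expected_variant)
    # With 1 degree of freedom, the chi-square survival function
    # reduces to erfc(sqrt(chi2 / 2)).
    return math.erfc(math.sqrt(chi2 / 2))


# Hypothetical inputs: 5% baseline conversion, 5% relative MDE,
# 10,000 users per arm so far, about 700 new users per arm per day.
needed = required_sample_size_per_arm(0.05, 0.05)
print("Required per arm:", needed)
print("Additional days:", additional_days(needed, 10_000, 700))
print("SRM p-value:", srm_p_value(10_000, 10_500))
```

A very small SRM p-value (many teams use a threshold around 0.001, though that is a convention rather than a rule) suggests the assignment mechanism is broken and the rest of the analysis should not be trusted until that is explained.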

💡 Tips for Better Results

Always include your raw numbers (sample sizes and conversion counts) rather than just percentages — this enables accurate statistical calculations.
Mention any known issues during the test period (e.g., site outages, marketing campaigns, holidays) so the AI can factor potential confounds into its analysis.
Ask for the analysis in both technical and non-technical language so you can use the output for both your data team review and stakeholder presentations.
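The first tip matters because significance tests are computed from counts, not from rates alone: the same 5% versus 5.6% comparison can be decisive or inconclusive depending on how many users sit behind it. As a minimal sketch of the calculation the prompt asks for, assuming a pooled two-proportion z-test with a normal-approximation confidence interval (the function name and the counts are hypothetical, and the model may legitimately choose a different test):

```python
import math
from statistics import NormalDist


def two_proportion_ztest(control_n, control_conv, variant_n, variant_conv, alpha=0.05):
    """Two-sided pooled z-test for a difference in conversion rates."""
    p_c = control_conv / control_n
    p_v = variant_conv / variant_n
    diff = p_v - p_c

    # Pooled rate under the null hypothesis of no difference.
    pooled = (control_conv + variant_conv) / (control_n + variant_n)
    se_pooled = math.sqrt(pooled * (1 - pooled) * (1 / control_n + 1 / variant_n))
    z = diff / se_pooled
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))

    # Unpooled standard error for the interval on the absolute difference.
    se = math.sqrt(p_c * (1 - p_c) / control_n + p_v * (1 - p_v) / variant_n)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)

    return {
        "control_rate": p_c,
        "variant_rate": p_v,
        "relative_lift": diff / p_c,
        "z": z,
        "p_value": p_value,
        "ci_abs_diff": (diff - z_crit * se, diff + z_crit * se),
    }


# Hypothetical counts: 500 of 10,000 control users and
# 560 of 10,000 variant users converted.
result = two_proportion_ztest(10_000, 500, 10_000, 560)
for name, value in result.items():
    print(name, value)
```

With these hypothetical counts the relative lift is 12%, yet the p-value lands just above 0.05 and the confidence interval on the absolute difference includes zero, which is the kind of borderline case where the power analysis and the continue-testing option in the prompt become relevant.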