A/B Test Significance Calculator

Enter visitors and conversions for variants A (control) and B (treatment) to estimate whether B is a statistically significant winner.

Experiment Inputs

How to use

Enter visitors and conversions for variants A and B. Choose a significance level (α), a test type, and whether to check a basic sample ratio mismatch (SRM). Click Calculate to see the p-value, uplift, and interpretation.

Methodology

This tool uses the classic two-sample z-test for proportions with a pooled variance estimate. It computes conversion rates, uplift, z-score, and p-values for the chosen test type.

  • Visitors and conversions must be non-negative integer counts.
  • Conversions must be between 0 and the number of visitors for each variant.
  • A simple SRM warning triggers when the observed traffic split is far from 50 / 50.

How this A/B test significance calculator works

We model each visitor as an independent Bernoulli trial. The calculator computes conversion rates for A and B, uses the pooled rate to estimate the standard error, and derives a z-score and p-value from the standard normal distribution. See the formulas section for the exact equations.
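
The pooled z-test described above can be sketched in a few lines of Python (the function name and the `erf`-based normal CDF are illustrative choices, not the tool's actual code):

```python
import math

def ab_ztest(conv_a, n_a, conv_b, n_b):
    """Pooled two-sample z-test for conversion proportions (sketch)."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided p-value via the standard normal CDF, Phi(x) = 0.5 * (1 + erf(x / sqrt(2)))
    p_two_sided = 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))
    return z, p_two_sided
```

For example, `ab_ztest(200, 1000, 250, 1000)` gives z ≈ 2.68 and a two-sided p-value of roughly 0.007.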

When is variant B “statistically significant”?

For a chosen significance level α (commonly 0.05), the calculator compares the p-value to α and highlights whether B appears to be a winner, a loser, or inconclusive.
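
A minimal decision rule matching this description might look like the following sketch (the labels are illustrative, not the tool's exact output):

```python
def classify_result(p_value, absolute_uplift, alpha=0.05):
    """Illustrative decision rule: compare the two-sided p-value to alpha."""
    if p_value >= alpha:
        return "inconclusive"
    return "B is a winner" if absolute_uplift > 0 else "B is a loser"
```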

Uplift and 95% confidence interval

The calculator reports both absolute uplift and relative uplift (when variant A has at least one conversion). It also estimates a 95% confidence interval for absolute uplift using the same standard error as the z-test.
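
Under the stated approach (pooled standard error, 1.96 critical value for 95%), the uplift and interval could be computed like this sketch:

```python
import math

def uplift_with_ci(conv_a, n_a, conv_b, n_b, z_crit=1.96):
    """Absolute/relative uplift plus a 95% CI on absolute uplift (sketch)."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    delta = p_b - p_a                                  # absolute uplift
    relative = 100 * delta / p_a if p_a > 0 else None  # relative uplift, in %
    return delta, relative, (delta - z_crit * se, delta + z_crit * se)
```

With 200/1000 conversions for A and 250/1000 for B, this gives an absolute uplift of 0.05, a relative uplift of 25%, and a 95% CI of roughly (0.013, 0.087).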

Sample ratio mismatch (SRM) warning

If you choose the “roughly 50 / 50” option, the calculator checks whether the observed traffic split is far from 50 / 50. Large discrepancies can indicate assignment or tracking problems.
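
One common way to implement such a check is a z-test of variant A's visitor count against an intended 50 / 50 split; the threshold below is an assumed value, not necessarily the one this tool uses:

```python
import math

def srm_suspect(n_a, n_b, z_threshold=3.0):
    """Flag a suspected sample ratio mismatch for an intended 50/50 split."""
    n = n_a + n_b
    # z-score of variant A's visitor count under Binomial(n, 0.5)
    z = (n_a - n / 2) / math.sqrt(n * 0.25)
    return abs(z) > z_threshold
```

For example, a 5000 / 5100 split is a normal fluctuation (no warning), while a 5000 / 6000 split is flagged as a likely assignment or tracking problem.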

Best practices for A/B testing

  • Define hypotheses, metrics, and decision rules before launching the test.
  • Keep random assignment clean; avoid overlapping experiments on the same users if possible.
  • Use fixed sample sizes or proper sequential methods instead of ad-hoc peeking.
  • Look at both statistical significance and business impact (uplift × volume).
  • Report uncertainty: p-values together with confidence intervals.


Limitations and caveats

This tool uses a normal approximation for the difference in proportions, with a pooled variance estimate. For very small samples, extreme conversion rates, sequential peeking, or complex experiment designs (multi-armed bandits, overlapping tests), consider consulting a statistician or using specialised sequential testing methods.

Sanity checks before shipping a winner

  • Sample sizes are large enough (hundreds or thousands of users per variant).
  • Conversion tracking is working correctly for both A and B.
  • Traffic split is close to the intended allocation (for example 50 / 50).
  • The uplift is meaningful in absolute terms, not only in percentage terms.
  • Your decision rule and test horizon were defined in advance.

Formulas

Conversion rates:

\[\hat{p}_A = \frac{c_A}{n_A}, \quad \hat{p}_B = \frac{c_B}{n_B}\]

Pooled rate:

\[\hat{p} = \frac{c_A + c_B}{n_A + n_B}\]

Standard error:

\[\text{SE}(\hat{p}_B - \hat{p}_A) = \sqrt{\hat{p} (1 - \hat{p}) \left(\frac{1}{n_A} + \frac{1}{n_B}\right)}\]

z-score:

\[z = \frac{\hat{p}_B - \hat{p}_A}{\text{SE}(\hat{p}_B - \hat{p}_A)}\]

Absolute uplift:

\[\Delta = \hat{p}_B - \hat{p}_A\]

Relative uplift:

\[\text{uplift}_\% = 100 \times \frac{\hat{p}_B - \hat{p}_A}{\hat{p}_A}\]
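
Plugging hypothetical counts into the formulas above, in order (the numbers are purely illustrative):

```python
import math

# Hypothetical inputs: c_A = 40, n_A = 1000, c_B = 55, n_B = 1000
c_a, n_a, c_b, n_b = 40, 1000, 55, 1000

p_hat_a = c_a / n_a                    # conversion rate A: 0.04
p_hat_b = c_b / n_b                    # conversion rate B: 0.055
p_pooled = (c_a + c_b) / (n_a + n_b)   # pooled rate: 0.0475
se = math.sqrt(p_pooled * (1 - p_pooled) * (1 / n_a + 1 / n_b))
z = (p_hat_b - p_hat_a) / se           # z-score, about 1.58
delta = p_hat_b - p_hat_a              # absolute uplift: 0.015
uplift_pct = 100 * delta / p_hat_a     # relative uplift: 37.5 %
```
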

Changelog

0.1.0-draft · 2026-01-19
  • Initial audit spec draft generated from HTML extraction (review required).
  • Verify formulas match the calculator engine and convert any text-only formulas to LaTeX.
  • Confirm sources are authoritative and relevant to the calculator methodology.

Verified by Ugo Candido on 2026-01-19
Last updated: 2026-01-19 · Version 0.1.0-draft