A/B Testing
A randomized experiment comparing two or more versions of a page or feature to determine which performs better on a predefined metric.
A/B testing (also called split testing) is the gold standard for making data-driven product and marketing decisions. By randomly assigning visitors to a control (A) or variation (B), you can isolate the causal impact of a specific change.
How A/B Testing Works
- Define a hypothesis and primary metric
- Create a variation with one specific change
- Randomly split traffic between control and variation
- Run until you reach your pre-calculated sample size
- Analyze results with appropriate statistical methods
- Ship the winner (or learn from the loss)
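Step four above, pre-calculating the sample size, is where many tests go wrong. As a rough sketch using only the Python standard library, the standard two-proportion power formula can estimate the required visitors per arm (the function name and the 80% power default are illustrative, not from any particular tool):

```python
import math
from statistics import NormalDist

def sample_size_per_arm(p1, p2, alpha=0.05, power=0.8):
    """Approximate per-arm sample size for a two-proportion z-test
    detecting a change from baseline rate p1 to rate p2."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # two-sided significance threshold
    z_beta = z.inv_cdf(power)           # desired statistical power
    p_bar = (p1 + p2) / 2
    numerator = (z_alpha * (2 * p_bar * (1 - p_bar)) ** 0.5
                 + z_beta * (p1 * (1 - p1) + p2 * (1 - p2)) ** 0.5) ** 2
    return math.ceil(numerator / (p1 - p2) ** 2)
```

For example, detecting a move from a 10% to a 12% conversion rate at the defaults requires a few thousand visitors per arm; note that halving the detectable effect roughly quadruples the required sample.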
What Makes A/B Testing Powerful
Unlike analytics, which shows correlation, A/B testing establishes causation. You can say with confidence: "This change caused a 7% lift in conversions" — not just "Conversions went up 7% after we made this change."
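To make the "7% lift" claim concrete, a minimal two-proportion z-test (function name and inputs are hypothetical) computes both the relative lift and a two-sided p-value from raw conversion counts:

```python
from statistics import NormalDist

def ab_ztest(conv_a, n_a, conv_b, n_b):
    """Two-proportion z-test. Returns (relative lift of B over A,
    two-sided p-value under the null of no difference)."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    p_pool = (conv_a + conv_b) / (n_a + n_b)          # pooled rate under the null
    se = (p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b)) ** 0.5
    z = (p_b - p_a) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    lift = (p_b - p_a) / p_a
    return lift, p_value
```

A useful reminder hidden in this sketch: a 7% relative lift on a 5% baseline (e.g. 500 vs. 535 conversions out of 10,000 each) is nowhere near significant, which is exactly why the sample-size step matters.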
Common A/B Testing Mistakes
- No pre-registered hypothesis: Testing random changes without a reason leads to wasted resources
- Peeking at results: Checking daily and stopping early inflates false positive rates
- Testing too many things at once: Can't attribute the effect to any specific change
- Ignoring practical significance: A statistically significant 0.1% lift may not justify the cost of building and maintaining the change
- Not documenting learnings: A well-documented losing test often teaches more than an undocumented win
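The peeking mistake above is easy to demonstrate with a quick Monte Carlo sketch: simulate A/A tests (both arms identical, so any "winner" is a false positive), run a z-test at several interim checkpoints, and count how often the test ever looks significant. All parameters below are illustrative:

```python
import random
from statistics import NormalDist

def aa_false_positive_rate(n_sims=1000, n=2000, checks=10, alpha=0.05, seed=1):
    """Simulate A/A tests with a 10% conversion rate in both arms.
    'Peek' at `checks` evenly spaced points and stop at the first
    p < alpha. Returns the fraction of simulations that ever stopped."""
    rng = random.Random(seed)
    norm = NormalDist()
    hits = 0
    step = n // checks
    for _ in range(n_sims):
        a = b = 0
        for i in range(1, checks + 1):
            # accrue `step` new visitors per arm, conversion rate 0.10 in both
            a += sum(rng.random() < 0.10 for _ in range(step))
            b += sum(rng.random() < 0.10 for _ in range(step))
            m = i * step
            p_pool = (a + b) / (2 * m)
            se = (p_pool * (1 - p_pool) * (2 / m)) ** 0.5
            if se > 0 and 2 * (1 - norm.cdf(abs(a - b) / m / se)) < alpha:
                hits += 1  # declared a "winner" that cannot exist
                break
    return hits / n_sims
```

With ten peeks the realized false positive rate lands well above the nominal 5%, in the neighborhood of the classic ~19% figure for ten interim looks, which is why a fixed stopping rule (or a sequential testing method designed for peeking) matters.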
Beyond Simple A/B Tests
As experimentation programs mature, they often expand into:
- Multivariate testing: Testing combinations of multiple elements simultaneously
- Multi-armed bandits: Algorithms that dynamically allocate traffic to better-performing variations
- Holdout testing: Measuring the cumulative impact of all shipped changes