Mobile and desktop are different products. CTA tests that ignore device segmentation make ship/revert decisions on aggregates that hide opposite-direction effects on each device class.

TL;DR

  • The same CTA change can produce opposite-direction results on mobile vs desktop. Aggregates blend the two, often producing a noisy "directional" result that doesn't actually exist on either device.
  • In a 200+ test portfolio, single-device tests (mobile-only or desktop-only) won roughly 22-28% of the time, against under 10% for "combined" tests with no device segmentation. Combined tests had the lowest win rate of any platform category.
  • Some change types are inherently asymmetric: sticky positioning, hero size, modal-mediated routing, form field changes. These should never be read on aggregate alone.
  • The decision matrix below maps eight aggregate × segmented combinations to the right ship/revert action — including the cases where the aggregate is misleading.

Why aggregates lie

Aggregate results combine device classes in proportion to their traffic share. If mobile is 70% of traffic, the aggregate is weighted toward the mobile result, even when that result points the wrong way for desktop users.

Mobile result | Desktop result | Aggregate (70% mobile, 30% desktop) | What the aggregate tells you
+5% | +5% | +5% | Both devices win
+5% | -3% | +2.6% | "Directional positive", but desktop is hurt
-2% | +8% | +1% | "Directional positive", but mobile is hurt
-10% | +20% | -1% | "Directional negative", but it hides a severe mobile regression and a large desktop win

The last row is the warning sign. An aggregate of about -1% reads as a noisy, slightly negative result. Segmented by device, mobile is regressing 10% while desktop gains 20%. The right call is "ship on desktop, revert on mobile", and the aggregate cannot show it.
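The blending arithmetic behind the table can be reproduced in a few lines of Python. This is a minimal sketch rather than part of the original analysis: the function name `blended_lift` is hypothetical, and it assumes both device classes share the same baseline conversion rate, so that relative lifts can be averaged by traffic share. With different baselines, each device's lift would be weighted by its share of conversions rather than its share of traffic.

```python
def blended_lift(mobile_lift, desktop_lift, mobile_share=0.70):
    """Traffic-weighted blend of two relative lifts, both in percent.

    Assumption: mobile and desktop have the same baseline conversion
    rate, so the lifts can be averaged by traffic share.
    """
    if not 0.0 <= mobile_share <= 1.0:
        raise ValueError("mobile_share must be between 0 and 1")
    return mobile_share * mobile_lift + (1.0 - mobile_share) * desktop_lift


# (mobile lift %, desktop lift %) pairs taken from the table rows
scenarios = [(5, 5), (5, -3), (-2, 8), (-10, 20)]

for mobile, desktop in scenarios:
    aggregate = blended_lift(mobile, desktop)
    print(f"mobile {mobile:+}%  desktop {desktop:+}%  aggregate {aggregate:+.1f}%")
```

Changing `mobile_share` shows how far the blended number moves with traffic mix alone, while the per-device lifts stay fixed.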

Pattern data: segmented vs combined wins

Looking at 200+ tests across two years of an enterprise CRO portfolio, win rates differ meaningfully by how the test was scoped:

Test platform | Win rate (range)
Desktop-only | ~22-26%
Mobile-only | ~24-28%
Both desktop and mobile (segmented results required) | ~30-34%
Combined (no device segmentation) | <10%

The "Combined" category (tests that did not require segmented results) had the lowest win rate by a wide margin. The likely mechanism: combined tests bury opposite-direction effects, so even when a change genuinely won on one device, the aggregate often read as inconclusive.

When device asymmetry is most likely

Some CTA changes are device-agnostic; others have device-specific mechanisms. Anticipate which is which before launching the test.

CTA change type | Asymmetry likelihood | Mechanism
Sticky positioning | High | Mobile viewport real estate is tighter; sticky impact differs
Hero size / above-fold restructure | High | Mobile viewport changes what's "above the fold"; desktop has more horizontal space
Modal-mediated routing | High | Mobile modal UX is more disruptive than desktop
Form field reduction | High | Mobile typing friction is higher; desktop users tolerate more fields
CTA copy change | Low | Copy semantics travel across devices
Visual hierarchy / color | Low | Same visual logic on both
Button placement (within-section) | Medium | Depends on whether the section's layout differs by device

For high-asymmetry change types, segment by device before reading the aggregate.

The asymmetry signatures

Five patterns recur across device-segmented CTA tests:

Pattern | Mobile | Desktop | Action
Mobile-friendly only | Positive | Flat or slight negative | Ship on mobile only
Desktop-friendly only | Flat or slight negative | Positive | Ship on desktop only
Mobile-hostile | Strong negative | Positive | Revert on mobile (even if aggregate says ship)
Desktop-hostile | Positive | Strong negative | Revert on desktop
Universal | Same direction | Same direction | Ship/revert sitewide

The first four signatures all imply device-conditional shipping. The infrastructure cost of conditional rollout is small relative to the funnel cost of shipping a regression on the wrong device class.

Decision matrix: device-segmented shipping

Aggregate | Mobile segment | Desktop segment | Decision
Positive | Positive | Positive | Ship sitewide
Positive | Positive | Flat | Ship on mobile; hold on desktop
Positive | Flat | Positive | Ship on desktop; hold on mobile
Positive | Negative | Positive | Ship on desktop, revert on mobile (aggregate is misleading)
Flat | Positive | Negative | Ship on mobile, revert on desktop
Flat | Negative | Positive | Ship on desktop, revert on mobile
Negative | Negative | Negative | Revert sitewide
Negative | Positive | Negative | Ship on mobile, revert on desktop (aggregate hides mobile win)

The aggregate is informational only. The decision is determined by the per-segment columns.
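The matrix reduces to a per-device rule in which the aggregate never appears as an input. The following is a minimal sketch under stated assumptions: the names `classify` and `device_decision` and the `flat_band` cutoff are hypothetical, and a real read would label each segment from its own significance test rather than a fixed threshold. Combinations that the matrix does not list (for example flat on both devices) simply follow the same per-device rule here.

```python
def classify(lift, flat_band=1.0):
    """Label a relative lift (percent) as positive, flat, or negative.

    flat_band is a hypothetical tolerance used only for illustration;
    in practice the label should come from the segment's own
    statistical read.
    """
    if lift > flat_band:
        return "positive"
    if lift < -flat_band:
        return "negative"
    return "flat"


# Per-device action for each segment label, as in the matrix.
_ACTION = {"positive": "ship", "flat": "hold", "negative": "revert"}


def device_decision(mobile_lift, desktop_lift, flat_band=1.0):
    """Return a rollout decision from the two segment lifts only."""
    mobile = _ACTION[classify(mobile_lift, flat_band)]
    desktop = _ACTION[classify(desktop_lift, flat_band)]
    if mobile == desktop:
        return f"{mobile} sitewide"
    return f"{mobile} on mobile; {desktop} on desktop"


print(device_decision(5.0, 5.0))
print(device_decision(5.0, 0.2))
print(device_decision(-10.0, 20.0))
```

Leaving the aggregate out of the function signature is deliberate: it keeps a misleading blended number from overriding what the segments show.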

Worked example: a homepage iteration with strong asymmetry

A homepage hierarchy + offer-placement test produced positive aggregate results, but the device segmentation revealed the win was concentrated on desktop:

Funnel metric | All devices | Desktop | Mobile
Page-entry rate | +2.4% | +7.4% | -0.7%
Mid-funnel completion | +7.0% | +9.4% | +5.6%
Downstream conversion | +11.8% | +23.9% | +4.2%

The desktop segment carried most of the lift. Mobile was directionally positive on mid-funnel but negative on the upstream metric — a signature of a layout change that worked better on the desktop viewport. Decision: ship the change but plan a mobile-specific iteration to recover the upstream metric on mobile.

The suspected mobile cause: leading with form input above the hero on mobile rather than letting the message appear first. The desktop variant didn't have that problem because the wider viewport let both elements coexist above the fold.

Pre-test instrumentation requirements

For high-asymmetry change types, the test needs to be set up to read by device from day one:

Requirement | Why
Device class as a primary segmentation dimension | Standard segment, not custom-cut at analysis time
Per-device sample size targets | Power mobile and desktop separately; the total may be powered while the segments are not
Per-device MDE acceptance | The smaller segment often needs a larger MDE
Pre-committed device-conditional shipping plan | Decide before launch whether asymmetric results would ship on one device only

Without these, a test producing strong asymmetry will be hard to interpret and harder to ship correctly.
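Per-device sample size targets can be estimated before launch. The sketch below uses the standard normal-approximation formula for a two-sided two-proportion test; it is an illustration, not the portfolio's actual method. The function name `visitors_per_arm`, the baseline rates, and the daily traffic figures are all hypothetical placeholders.

```python
import math
from statistics import NormalDist


def visitors_per_arm(baseline_rate, relative_mde, alpha=0.05, power=0.80):
    """Approximate visitors needed per arm to detect a relative lift.

    Normal-approximation formula for a two-sided two-proportion test.
    """
    p1 = baseline_rate
    p2 = baseline_rate * (1.0 + relative_mde)
    z_alpha = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    z_power = NormalDist().inv_cdf(power)
    variance = p1 * (1.0 - p1) + p2 * (1.0 - p2)
    return math.ceil((z_alpha + z_power) ** 2 * variance / (p2 - p1) ** 2)


# Hypothetical inputs: baseline conversion rate and daily visitors per device.
segments = {
    "mobile": {"baseline": 0.03, "daily_visitors": 7000},
    "desktop": {"baseline": 0.05, "daily_visitors": 3000},
}

for device, cfg in segments.items():
    n = visitors_per_arm(cfg["baseline"], relative_mde=0.10)
    days = math.ceil(2 * n / cfg["daily_visitors"])  # two arms, even split
    print(f"{device}: {n} visitors per arm, about {days} days at current traffic")
```

Running the calculation once per device makes the trade-off explicit: a segment that cannot reach its target in a reasonable window needs either a larger accepted MDE or a longer test.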

When to NOT segment by device

A few contexts where aggregate reading is sufficient:

Context | Why aggregate is OK
Truly device-agnostic change (copy, color, semantic) | Mechanism doesn't differ by viewport
Single-device test (mobile-only or desktop-only) | One segment, no asymmetry possible
Test powered only for the aggregate | Segment-level reads will be noise

For most other CTA tests, device segmentation should be a default report column.

Bottom line

Mobile and desktop are different products on the same site. CTA tests routinely produce opposite-direction effects on the two device classes. Aggregates hide this. The portfolio data shows combined-platform tests (no device segmentation) have the lowest win rate by a wide margin — because they bury the wins inside aggregates that look like noise.

Segment CTA tests by device class by default, especially for high-asymmetry change types (sticky, hero, modal, form-field). Use the per-device segment results, not the aggregate, to make ship/revert decisions. Conditional rollout is cheap to implement and saves the funnel from device-class regressions disguised as flat aggregates.

Atticus Li

Experimentation and growth leader. CXL-certified CRO practitioner, Mindworx-certified behavioral economist (1 of ~1,000 worldwide). 200+ A/B tests across energy, SaaS, fintech, e-commerce, and marketplace verticals.