If you think large companies with a massive userbase (Amazon, Google) have an easy time detecting tiny changes in A/B tests, you're wrong!
The number of users needed to detect a change grows quadratically as the change you want to detect shrinks, so even very large companies cannot detect changes that would lose them $10M/year.
Let's take a realistic example. You're an e-commerce site selling widgets and 5% of users who visit during the experiment period end up purchasing. Those purchasing spend about $25. The average user therefore spends $1.25 (95% spend $0). Assume the standard deviation is $3. If you are running an A/B test and want to detect a 5% change to revenue, you plug this into the power formula:
n=16 sigma^2/delta^2
as 16*3^2/(1.25*0.05)^2 = 36,864 users in each variant. Reasonable for a small company.
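For readers who want to reproduce the arithmetic, here is a minimal Python sketch of the rule of thumb above. The function name `users_per_variant` and its parameter names are my own for illustration, not from any library. The constant 16 is the usual approximation for roughly 80% power at a 5% significance level, and delta is the absolute change to detect (baseline mean times relative change).

```python
def users_per_variant(sigma: float, baseline_mean: float, relative_change: float) -> float:
    """Rule-of-thumb sample size per variant: n = 16 * sigma^2 / delta^2.

    Hypothetical helper for illustration only.
    sigma           -- standard deviation of the metric (here, revenue per user)
    baseline_mean   -- average value of the metric (here, $1.25 per user)
    relative_change -- relative change to detect (0.05 means 5%)
    """
    delta = baseline_mean * relative_change  # absolute change to detect, in dollars
    return 16 * sigma**2 / delta**2


# Widget-shop example from the text: sigma = $3, mean = $1.25, detect a 5% change.
n_small = users_per_variant(sigma=3.0, baseline_mean=1.25, relative_change=0.05)
print(f"{n_small:,.0f} users per variant")
```

Halving the relative change you want to detect quadruples the required users, which is the quadratic growth in action.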
Now you're Amazon retail, or Google search, with more than $100B in annual revenue. Your CFO says that $10M is material enough that s/he wants to know if you're gaining or losing that much in your experiments (which may be optimizing for something else altogether, so this is a guardrail metric).
Your delta is now 10M/100B = 0.01%. Because it's squared in the denominator, you're in trouble.
16*3^2/(1.25*0.0001)^2 > 9.2B users in each variant, a problem when the earth's population is under 8B.
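The same calculation can be sketched in Python for the large-company case. This is an illustration under the post's assumptions (sigma = $3, mean = $1.25). The helper names and the `WORLD_POPULATION` constant (set to the 8B upper bound used above) are hypothetical. The second helper simply rearranges n = 16 sigma^2/delta^2 to delta = 4 sigma/sqrt(n), so you can ask what change a given number of users could detect.

```python
import math

WORLD_POPULATION = 8e9  # upper bound used in the text; an assumption, not a precise figure


def users_per_variant(sigma: float, baseline_mean: float, relative_change: float) -> float:
    """Rule-of-thumb sample size per variant: n = 16 * sigma^2 / delta^2."""
    delta = baseline_mean * relative_change  # absolute change to detect
    return 16 * sigma**2 / delta**2


def min_detectable_relative_change(n: float, sigma: float, baseline_mean: float) -> float:
    """Invert the rule of thumb: delta = 4 * sigma / sqrt(n), as a fraction of the mean."""
    return 4 * sigma / math.sqrt(n) / baseline_mean


# $10M out of $100B is a 0.01% relative change.
n_large = users_per_variant(sigma=3.0, baseline_mean=1.25, relative_change=10e6 / 100e9)
print(f"{n_large / 1e9:.2f}B users per variant")
print("More than the world population:", n_large > WORLD_POPULATION)

# Round trip: with n_large users per variant, the detectable change is 0.01% again.
print(f"{min_detectable_relative_change(n_large, 3.0, 1.25):.4%}")
```

Plugging your own per-variant traffic into `min_detectable_relative_change` gives a rough sense of the smallest revenue change an experiment of that size could detect under these assumptions.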
The largest companies cannot power experiments with enough users to detect a $10M loss.
QED.
To learn more, see my book https://lnkd.in/eWuqBVw and 10-hour interactive Zoom class: https://bit.ly/ABClassRKLI .
For other intuition busters in #ABTesting, see our upcoming KDD 2022 paper: https://lnkd.in/gqbtzZDg .
[Edit: this example is now the motivating example for Section 2 in the paper https://lnkd.in/ePigK_cZ: Larsen, N., Stallrich, J., Sengupta, S., Deng, A., Kohavi, R., & Stevens, N. T. (2024). Statistical Challenges in Online Controlled Experiments: A Review of A/B Testing Methodology. The American Statistician, 78(2), 135–149. https://lnkd.in/gQ_APj4W]
Lukas Vermeer Alex (Shaojie) Deng
#abtesting #statisticalpower #experimentguide