# Sample Size Calculation

How long should a test run, or in our terms, how many observations do we need per group? The question matters because you should normally decide on a sample size before you start an experiment. Many A/B testing guides attempt to provide general advice, but in reality the answer varies case by case. A common approach to this problem is power analysis.

## Power Analysis

We perform a power analysis to determine the required sample size. It takes the following inputs:

1. Effect size (calculated via lift): the minimum size of the effect that we want to detect in a test; for example, a 5% increase in conversion rates.

   1. For testing differences in `means`, after selecting a suitable minimum detectable effect (MDE), we convert it into a standardized effect size known as `Cohen's d`, defined as the difference between the two means divided by the pooled standard deviation:

      $$d = \frac{\mu_B - \mu_A}{s_{\text{pooled}}}$$
   2. For differences in proportions, a common effect size is `Cohen's h`, calculated using the formula:

      $$h = 2\arcsin\left(\sqrt{p_1}\right) - 2\arcsin\left(\sqrt{p_2}\right)$$

   A general rule of thumb for interpreting a standardized effect size:

   * 0.2 corresponds to a small effect,
   * 0.5 is a medium effect,
   * 0.8 is a large effect.

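2. Significance level (predetermined): the alpha value, i.e. the probability of rejecting the null hypothesis when it is actually true; 5% is typical.

3. Power (predetermined): the probability of detecting an effect when one really exists; 80% is a common choice.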
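
The two effect-size formulas in the list above can be sketched directly with the standard library. This is a minimal illustration, not a reference implementation: the helper names `cohens_d` and `cohens_h` and all input numbers are hypothetical, and the pooled standard deviation is assumed to be the usual degrees-of-freedom-weighted one.

```python
import math


def cohens_d(mean_a, mean_b, sd_a, sd_b, n_a, n_b):
    """Difference in means divided by the pooled standard deviation."""
    # Assumption: pooled variance weighted by each group's degrees of freedom.
    pooled_var = ((n_a - 1) * sd_a**2 + (n_b - 1) * sd_b**2) / (n_a + n_b - 2)
    return (mean_b - mean_a) / math.sqrt(pooled_var)


def cohens_h(p1, p2):
    """Difference between arcsine-transformed proportions."""
    return 2 * math.asin(math.sqrt(p1)) - 2 * math.asin(math.sqrt(p2))


# Hypothetical inputs: group means of 20 vs. 21 with a standard deviation of 5,
# and conversion rates of 15% vs. 10%.
d = cohens_d(mean_a=20.0, mean_b=21.0, sd_a=5.0, sd_b=5.0, n_a=100, n_b=100)
h = cohens_h(0.15, 0.10)

print(f"Cohen's d: {d:.3f}")  # 0.200
print(f"Cohen's h: {h:.3f}")  # 0.152
```

With these made-up numbers both effects would count as small under the rule of thumb above.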

Keep in mind that changing any of these inputs changes the required sample size: more power, a smaller significance level, or a smaller effect to detect all lead to a larger sample size.

```python
"""
The power functions require standardized minimum effect difference. 
To get this, we can use the proportion_effectsize function by inputting our baseline 
and desired minimum conversion rates.
"""
from statsmodels.stats.proportion import proportion_effectsize
from statsmodels.stats.power import zt_ind_solve_power

# Calculate the standardized minimum effect (Cohen's h):
# desired conversion rate of 15% vs. a baseline of 10%.
std_effect = proportion_effectsize(0.15, 0.1)

# calculate sample size with alpha=.05, and varying powers
sz1 = zt_ind_solve_power(effect_size=std_effect, nobs1=None, alpha=.05, power=.80)
sz2 = zt_ind_solve_power(effect_size=std_effect, nobs1=None, alpha=.05, power=.90)

print(f"{sz1:.2f}")
print(f"{sz2:.2f}")

"""
680.35
910.80

Note that increasing power requires more samples.
"""
```
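
To see where numbers of this size come from, the same calculation can be approximated in closed form. The sketch below assumes a two-sided z-test with equal group sizes and uses the textbook normal approximation n ≈ 2((z₁₋α/₂ + z_power) / h)², which ignores the negligible contribution of the opposite tail; the function name `n_per_group` is hypothetical and is not part of statsmodels.

```python
import math
from statistics import NormalDist


def n_per_group(effect_size, alpha=0.05, power=0.80):
    """Approximate observations per group for a two-sided, two-sample z-test."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for alpha
    z_power = NormalDist().inv_cdf(power)          # quantile for the target power
    return 2 * ((z_alpha + z_power) / effect_size) ** 2


# Same standardized effect as above: Cohen's h for 15% vs. 10%.
h = 2 * math.asin(math.sqrt(0.15)) - 2 * math.asin(math.sqrt(0.10))

for power in (0.80, 0.90):
    n = n_per_group(h, power=power)
    print(f"power={power:.2f}: about {math.ceil(n)} observations per group")
```

The results should land very close to the statsmodels output above. In practice the value is rounded up, since a fraction of an observation cannot be collected.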

## Effect Size, Sample Size and the Power

The graph below shows how the power of a test varies with sample size and effect size:

<figure><img src="/files/u8cNGUUoGwzFEmqZDZNM" alt=""><figcaption></figcaption></figure>

Code to produce the image above:

```python
import numpy as np
import matplotlib.pyplot as plt
from statsmodels.stats.power import TTestIndPower

# Sample Size and Effect Size
sample_sizes = np.arange(5, 100)
effect_sizes = np.array([0.2, 0.5, 0.8])

# Create results object for t-test analysis
res = TTestIndPower()

# Plot the power analysis
res.plot_power(dep_var='nobs', nobs=sample_sizes, effect_size=effect_sizes)
plt.show()
```
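
The relationship shown in the plot can also be checked numerically. The sketch below is a rough stand-in, not what `TTestIndPower` computes: it assumes a normal approximation to the two-sample test (rather than the noncentral t distribution) and ignores the opposite tail, so it will differ slightly from the plotted curves, especially at small sample sizes. The function name `approx_power` is hypothetical.

```python
from statistics import NormalDist


def approx_power(effect_size, n_per_group, alpha=0.05):
    """Approximate power of a two-sided, two-sample test with equal group sizes."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)
    # The standardized effect grows with the square root of the per-group sample size.
    return z.cdf(effect_size * (n_per_group / 2) ** 0.5 - z_alpha)


# Tabulate approximate power for the small, medium and large effect sizes.
for d in (0.2, 0.5, 0.8):
    row = ", ".join(f"n={n}: {approx_power(d, n):.2f}" for n in (20, 50, 100))
    print(f"d={d}: {row}")
```

For a fixed effect size the power rises with the sample size, and for a fixed sample size it rises with the effect size, which matches the shape of the curves in the graph.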

