Small Telescopes for Higher-Power Replications

TL;DR. In Simonsohn’s Small‑Telescopes framework you test whether the true effect is smaller than the effect that would have given the original study 33% power (call this d33%). Simonsohn (2015) shows that a replication needs about 2.5× the original sample size to have 80% power to show that the effect is smaller than d33% when the true effect is 0. In this post, I provide alternative multiples for 90% and 95% power (3.5× and 4.5× respectively).

Small-Telescopes in Short: What is being tested (and why “one‑sided”)?

Let d33% be the effect size that would have given the original two‑sample t test 33% power at two‑sided α0=.05. The replication runs a one‑sided test

$$H_0: d = d_{33\%} \quad \text{vs.} \quad H_1: d < d_{33\%}$$

at α=.05, and we plan the replication to have (e.g.) 80% power to reject H0 if the true effect is 0. Intuition: Rather than asking whether the effect is exactly zero, the small telescopes test asks whether it is smaller than the minimal effect the original study had a reasonable chance to detect. If the replication can rule out effects of that size, the onus shifts back to the original proponents, who again bear the “burden of proof”, because even the smallest effect they could plausibly have detected is inconsistent with the replication data.
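
As a minimal sketch of the test described above, the following base-R code finds d33% for a hypothetical original study with 20 participants per cell and then computes a one-sided p-value for a made-up replication result. The helper names (`power_two_sample`, `d33_sketch`, `small_telescopes_p`) and all example numbers are illustrative assumptions. The p-value assumes that the replication's observed t statistic is compared against a noncentral t distribution centred on d33%, which is one way to implement the test and not necessarily the only one.

```r
# Two-sided power of a two-sample t test with n per cell and true effect d
power_two_sample <- function(d, n, alpha0 = 0.05) {
  df    <- 2 * n - 2
  ncp   <- d * sqrt(n / 2)
  tcrit <- qt(1 - alpha0 / 2, df)
  pt(tcrit, df, ncp = ncp, lower.tail = FALSE) + pt(-tcrit, df, ncp = ncp)
}

# d33%: the effect size that gives the original study power p0
# (hypothetical helper; searches d between 0 and the value where ncp = 10)
d33_sketch <- function(n0, p0 = 1/3, alpha0 = 0.05) {
  uniroot(function(d) power_two_sample(d, n0, alpha0) - p0,
          interval = c(0, 10 / sqrt(n0 / 2)), tol = 1e-10)$root
}

# One-sided p-value for H0: d = d33 vs H1: d < d33, given the replication's
# observed t statistic (assumption: noncentral-t reference distribution)
small_telescopes_p <- function(t_rep, n_rep, d33) {
  pt(t_rep, df = 2 * n_rep - 2, ncp = d33 * sqrt(n_rep / 2))
}

d33 <- d33_sketch(n0 = 20)   # hypothetical original study: 20 per cell
p   <- small_telescopes_p(t_rep = 0.2, n_rep = 50, d33 = d33)  # made-up result
print(round(d33, 2))
print(p)
```

A small p-value here means the replication estimate is significantly below d33%, so effects large enough for the original study to have plausibly detected them can be rejected.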

When might higher power be worth it?

The original small telescopes rule-of-thumb — about 2.5× the original sample size for 80% power — works well for many replications. But there are situations where aiming higher is justified. For example:

  • Policy- or practice-relevant findings where acting on a false positive would have high costs.
  • Expensive or rare data where you won’t get a second chance to replicate, so making the most of the opportunity matters.
  • Adversarial collaborations where proponents and critics may want to aim for a definitive answer.

In these cases, the (substantial) cost of increasing power from 80% to 90% or 95% may be outweighed by the clarity it brings. A higher-powered replication yields a more precise effect size estimate and is more likely to show that the effect is significantly smaller than the small telescopes threshold when the true effect is zero or close to it.

Analytic multipliers

The multipliers for the original per‑cell n to achieve higher power in the small telescopes test can be derived from the normal approximation of the two‑sample t test. Under a normal approximation, the replication‑to‑original per‑cell‑n ratio is

$$m = \left(\frac{z_{1-\alpha} + z_{\text{power}}}{z_{1-\alpha_0/2} + z_{p_0}}\right)^2,$$

with p0=1/3 (see the supplementary materials in Simonsohn, 2015).[1] This gives: 80% → 2.64×, 90% → 3.66×, 95% → 4.63×. With finite‑sample t tests, the exact multipliers are slightly lower for small samples and approach these limits as n0 grows. In line with the original paper, which proposed 2.5× for 80% power, I propose using 3.5× and 4.5× for 90% and 95% power respectively.[2]
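
As a worked check of the formula above for 80% power, plugging in standard normal quantiles (rounded to four decimals, with α = α0 = .05 and p0 = 1/3) gives:

$$
m = \left(\frac{z_{0.95} + z_{0.80}}{z_{0.975} + z_{1/3}}\right)^2
  = \left(\frac{1.6449 + 0.8416}{1.9600 - 0.4307}\right)^2
  = \left(\frac{2.4865}{1.5293}\right)^2
  \approx 2.64
$$

Replacing z_{0.80} with z_{0.90} = 1.2816 or z_{0.95} = 1.6449 in the numerator yields approximately 3.66 and 4.63 in the same way.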

Exact calculation and validation of the heuristic multipliers

Using the t-distribution for specific sample sizes, we can calculate more specific multipliers for the small telescopes test. The code below computes these “exact” multipliers for a given original per‑cell n and target power, and compares them to the normal approximation. The results show that the heuristic multiples achieve power within about 2 percentage points of the target power for n0 up to 1,000 participants per cell.

Code to calculate the exact multiplier:
library(dplyr)
library(purrr)
library(pwr)

# d_33% for the original two-sample t (per-cell n), p0 = 1/3 per supplement
d33_two_sample_t <- function(n_per_cell, p0 = 1/3, alpha0 = 0.05) {
  pwr.t.test(n = n_per_cell,
             power = p0,
             sig.level = alpha0,
             type = "two.sample",
             alternative = "two.sided")$d
}

# Power to reject H0: d = d33 with a one-sided (less) t-test at alpha, when true d = 0
# Test is one-sided because the replication has a directional hypothesis
power_reject_d33_under_null0 <- function(n_per_cell_rep, d33, alpha = 0.05) {
  df    <- 2 * n_per_cell_rep - 2
  tcrit <- qt(alpha, df = df, lower.tail = TRUE)
  ncp   <- -d33 * sqrt(n_per_cell_rep / 2)
  pt(tcrit, df = df, ncp = ncp, lower.tail = TRUE)
}

# Exact multiplier via t; returns integer per-cell n for replication and achieved power
small_telescopes_multiplier_exact <- function(n_original_per_cell,
                                              target_power = 0.80,
                                              alpha = 0.05,
                                              p0 = 1/3,
                                              alpha0 = 0.05,
                                              max_mult = 100,
                                              tol = 1e-7) {
  stopifnot(n_original_per_cell >= 3)
  d33 <- d33_two_sample_t(n_original_per_cell, p0 = p0, alpha0 = alpha0)
  f <- function(n_rep) power_reject_d33_under_null0(n_rep, d33, alpha) - target_power

  lower <- max(3, n_original_per_cell) * 1.0
  upper <- min(max_mult * n_original_per_cell, 1e7)
  f_lower <- f(lower); f_upper <- f(upper)
  while (f_lower * f_upper > 0 && upper < 1e7) {
    upper <- min(upper * 2, 1e7); f_upper <- f(upper)
  }

  n_rep_cont <- if (f_lower >= 0) lower else if (f_upper <= 0) upper else
    uniroot(f, interval = c(lower, upper), tol = tol)$root

  n_rep_int <- ceiling(n_rep_cont)
  tibble(
    n_original_per_cell = n_original_per_cell,
    d33 = d33,
    n_replication_per_cell_exact = n_rep_int,
    exact_multiplier = n_rep_int / n_original_per_cell,
    achieved_power_exact = power_reject_d33_under_null0(n_rep_int, d33, alpha)
  )
}

# Normal-approx multiplier (independent of n0) for p0 = 1/3
small_telescopes_multiplier_normal <- function(target_power = 0.80,
                                               alpha = 0.05,
                                               p0 = 1/3,
                                               alpha0 = 0.05) {
  z_alpha0_2      <- qnorm(1 - alpha0/2)
  z_p0            <- qnorm(p0)
  z_1_minus_alpha <- qnorm(1 - alpha)
  z_target_power  <- qnorm(target_power)
  mu_orig <- z_alpha0_2 + z_p0
  mu_rep  <- z_1_minus_alpha + z_target_power
  (mu_rep / mu_orig)^2
}

Target power = 80%
Heuristic multiplier: 2.5×

| n0 per cell | d33% (orig) | exact m | achieved power (exact) | achieved power (heuristic) |
|---|---|---|---|---|
| 10 | 0.72 | 2.50 | 0.81 | 0.81 |
| 20 | 0.50 | 2.55 | 0.80 | 0.79 |
| 50 | 0.31 | 2.62 | 0.80 | 0.79 |
| 100 | 0.22 | 2.63 | 0.80 | 0.78 |
| 500 | 0.10 | 2.64 | 0.80 | 0.78 |
| 1,000 | 0.07 | 2.64 | 0.80 | 0.78 |
| 100,000 | 0.01 | 2.65 | 0.80 | 0.78 |

Target power = 90%
Heuristic multiplier: 3.5×

| n0 per cell | d33% (orig) | exact m | achieved power (exact) | achieved power (heuristic) |
|---|---|---|---|---|
| 10 | 0.72 | 3.40 | 0.90 | 0.91 |
| 20 | 0.50 | 3.55 | 0.90 | 0.90 |
| 50 | 0.31 | 3.62 | 0.90 | 0.89 |
| 100 | 0.22 | 3.64 | 0.90 | 0.89 |
| 500 | 0.10 | 3.66 | 0.90 | 0.89 |
| 1,000 | 0.07 | 3.66 | 0.90 | 0.89 |
| 100,000 | 0.01 | 3.67 | 0.90 | 0.89 |

Target power = 95%
Heuristic multiplier: 4.5×

| n0 per cell | d33% (orig) | exact m | achieved power (exact) | achieved power (heuristic) |
|---|---|---|---|---|
| 10 | 0.72 | 4.30 | 0.95 | 0.96 |
| 20 | 0.50 | 4.45 | 0.95 | 0.95 |
| 50 | 0.31 | 4.56 | 0.95 | 0.95 |
| 100 | 0.22 | 4.60 | 0.95 | 0.95 |
| 500 | 0.10 | 4.62 | 0.95 | 0.94 |
| 1,000 | 0.07 | 4.63 | 0.95 | 0.94 |
| 100,000 | 0.01 | 4.63 | 0.95 | 0.94 |
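
The code shown above does not include the step that produces the “achieved power (heuristic)” column. A minimal base-R sketch of how that column could be reproduced is given below. It re-implements d33% with `uniroot()` instead of `pwr.t.test()` so that it runs without extra packages, and it uses the same shifted-t power calculation as `power_reject_d33_under_null0()`. The function names (`power_orig`, `d33_base`, `power_st`, `heuristic_power`) are illustrative assumptions, and the results should only be expected to match the tables up to rounding.

```r
# Two-sided power of the original two-sample t test (n per cell, true effect d)
power_orig <- function(d, n, alpha0 = 0.05) {
  df    <- 2 * n - 2
  ncp   <- d * sqrt(n / 2)
  tcrit <- qt(1 - alpha0 / 2, df)
  pt(tcrit, df, ncp = ncp, lower.tail = FALSE) + pt(-tcrit, df, ncp = ncp)
}

# d33% by root finding (hypothetical stand-in for the pwr-based helper)
d33_base <- function(n0, p0 = 1/3, alpha0 = 0.05) {
  uniroot(function(d) power_orig(d, n0, alpha0) - p0,
          interval = c(0, 10 / sqrt(n0 / 2)), tol = 1e-10)$root
}

# Power of the one-sided small telescopes test when the true effect is 0
power_st <- function(n_rep, d33, alpha = 0.05) {
  df <- 2 * n_rep - 2
  pt(qt(alpha, df), df, ncp = -d33 * sqrt(n_rep / 2))
}

# Power achieved when the replication uses multiplier * n0 per cell
heuristic_power <- function(n0, multiplier) {
  n_rep <- ceiling(multiplier * n0)
  power_st(n_rep, d33_base(n0))
}

n0  <- c(20, 100, 1000)
res <- data.frame(
  n0         = n0,
  power_2_5x = sapply(n0, heuristic_power, multiplier = 2.5),
  power_3_5x = sapply(n0, heuristic_power, multiplier = 3.5),
  power_4_5x = sapply(n0, heuristic_power, multiplier = 4.5)
)
print(round(res, 2))
```

Swapping in other multipliers or original sample sizes shows how quickly the achieved power drifts from the target for a given design.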

Practical takeaway

  • The original Simonsohn (2015) heuristic of 2.5× the original per‑cell n for 80% power slightly underestimates the required sample size once the original n is about 20 or more per cell, but it still achieves at least 78% power even in large samples.
  • If you need 90% or 95% power, 3.5× and 4.5× can serve as comparable heuristics. They also slightly underestimate the sample size, but still achieve 89% and 94% power respectively in large samples.
  • If you want to power your replication study more precisely, you can use the small_telescopes_multiplier_exact() function in the code above.

Reference

Simonsohn, U. (2015). Small telescopes: Detectability and the evaluation of replication results. Psychological Science, 26(5), 559-569.


  1. In prose we say “33% power,” but numerically we follow the Simonsohn (2015) supplement and compute with p0=1/3. Using p0=.33 instead shifts the normal multipliers slightly (e.g., 2.68 vs 2.64).

  2. All n here are per‑cell for two‑sample t with equal variances. Simonsohn (2015) shows that χ² tests behave similarly, and various other tests directly depend on the t-distribution (e.g., correlations, regression coefficients). Nevertheless, the stability of the rules-of-thumb across designs could be investigated further.

Dr Lukas Wallrich
Senior Lecturer

Researcher and educator with a focus on Open Science and intergroup relations.