TY - JOUR
T1 - Another Look at the Lady Tasting Tea and Differences Between Permutation Tests and Randomisation Tests
AU - Hemerik, Jesse
AU - Goeman, Jelle J.
PY - 2021/8
Y1 - 2021/8
N2 - The statistical literature is known to be inconsistent in the use of the terms 'permutation test' and 'randomisation test'. Several authors successfully argue that these terms should be used to refer to two distinct classes of tests and that there are major conceptual differences between these classes. The present paper explains an important difference in mathematical reasoning between these classes: a permutation test fundamentally requires that the set of permutations has a group structure, in the algebraic sense; the reasoning behind a randomisation test is not based on such a group structure, and it is possible to use an experimental design that does not correspond to a group. In particular, we can use a randomisation scheme where the number of possible treatment patterns is larger than in standard experimental designs. This leads to exact p values of improved resolution, providing increased power for very small significance levels, at the cost of decreased power for larger significance levels. We discuss applications in randomised trials and elsewhere. Further, we explain that Fisher's famous Lady Tasting Tea experiment, which is commonly referred to as the first permutation test, is in fact a randomisation test. This distinction is important to avoid confusion and invalid tests.
AB - The statistical literature is known to be inconsistent in the use of the terms 'permutation test' and 'randomisation test'. Several authors successfully argue that these terms should be used to refer to two distinct classes of tests and that there are major conceptual differences between these classes. The present paper explains an important difference in mathematical reasoning between these classes: a permutation test fundamentally requires that the set of permutations has a group structure, in the algebraic sense; the reasoning behind a randomisation test is not based on such a group structure, and it is possible to use an experimental design that does not correspond to a group. In particular, we can use a randomisation scheme where the number of possible treatment patterns is larger than in standard experimental designs. This leads to exact p values of improved resolution, providing increased power for very small significance levels, at the cost of decreased power for larger significance levels. We discuss applications in randomised trials and elsewhere. Further, we explain that Fisher's famous Lady Tasting Tea experiment, which is commonly referred to as the first permutation test, is in fact a randomisation test. This distinction is important to avoid confusion and invalid tests.
UR - https://www.webofscience.com/api/gateway?GWVersion=2&SrcApp=eur_pure&SrcAuth=WosAPI&KeyUT=WOS:000596679800001&DestLinkType=FullRecord&DestApp=WOS_CPL
U2 - 10.1111/insr.12431
DO - 10.1111/insr.12431
M3 - Article
SN - 0306-7734
VL - 89
SP - 367
EP - 381
JO - International Statistical Review
JF - International Statistical Review
IS - 2
ER -