A comprehensive open-source library for exact required sample size in binary clinical trials
Ripamonti E.
2021-01-01
Abstract
We describe the ongoing development of a new, comprehensive R library for exact sample size determination in randomized controlled trials (RCTs). A crucial prerequisite for a trial protocol is an a priori sample size that bounds the test size below a target (often 5%) and the test power above a target (often 80%). Approximate formulas are available for binary trials, but standard methods often violate the target size and power, even for quite large sample sizes. Moreover, adjusting standard tests to account for their size bias can reduce power substantially. This has been well known for several decades. Exact and quasi-exact tests are now available and can be computed in a few seconds for a single data set. However, calculating the exact power and size of such tests requires computing them for all possible outcomes, and searching for the minimum sample size that achieves a given target requires doing this for a wide range of sample sizes. This quickly becomes computationally infeasible: computing the required sample sizes for a target size of 5% and power of 80% would take several months on a standard computer. Computation time increases as the target size and the clinically relevant difference decrease. After presenting the main operational challenges in creating this library, which stem mainly from the need to summarize a very large amount of information, we put forward our solutions to this complex problem from a statistical viewpoint. The library will be released as open source.
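To make the computational burden concrete, the following is a minimal Python sketch of exact size and power by full enumeration for a two-arm binary trial with equal arms. It is not the library described above: the function names (`rejection_probability`, `exact_size`, `min_sample_size`), the use of a pooled two-sided z-test as a stand-in for an exact or quasi-exact test, the grid over the nuisance parameter, and the search bounds are all assumptions made for illustration only.

```python
from math import comb, sqrt


def binom_pmf(n, p):
    """Binomial(n, p) probabilities for k = 0..n."""
    return [comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)]


def rejects(x1, x2, n, z_crit=1.959964):
    """Pooled two-sided z-test at nominal 5% (a stand-in test, assumed here)."""
    total = x1 + x2
    if total == 0 or total == 2 * n:
        return False  # no variability in the data: never reject
    pooled = total / (2 * n)
    se = sqrt(pooled * (1 - pooled) * 2 / n)
    return abs(x1 - x2) / n / se > z_crit


def rejection_probability(n, p1, p2):
    """Exact probability of rejection: sum over all (n + 1)^2 outcomes."""
    pmf1, pmf2 = binom_pmf(n, p1), binom_pmf(n, p2)
    return sum(
        pmf1[x1] * pmf2[x2]
        for x1 in range(n + 1)
        for x2 in range(n + 1)
        if rejects(x1, x2, n)
    )


def exact_size(n, grid=99):
    """Largest rejection probability under p1 == p2, over a grid of values.

    The grid only approximates the supremum over the nuisance parameter.
    """
    points = (i / (grid + 1) for i in range(1, grid + 1))
    return max(rejection_probability(n, p, p) for p in points)


def min_sample_size(p1, p2, power=0.80, n_max=200):
    """Smallest per-arm n in 2..n_max whose exact power reaches the target.

    Only the power target is checked; a full search would also have to
    verify the size constraint at every candidate n.
    """
    for n in range(2, n_max + 1):
        if rejection_probability(n, p1, p2) >= power:
            return n
    return None
```

Each call to `rejection_probability` visits all (n + 1)^2 outcomes, `exact_size` repeats that for every grid point, and a sample size search repeats both for every candidate n. With a genuinely exact test, where evaluating a single outcome already takes seconds, this nesting is what drives the run times quoted above.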