Title stata.com
power oneproportion, cluster Power analysis for a one-sample proportion test, CRD
Description Quick start Menu Syntax
Options Remarks and examples Stored results Methods and formulas
References Also see
Description
power oneproportion, cluster computes the number of clusters, cluster size, power, or target
proportion for a one-sample proportion test in a cluster randomized design (CRD). It computes the
number of clusters given cluster size, power, and the values of the null and alternative proportions.
It also computes cluster size given the number of clusters, power, and the values of the null and
alternative proportions. Alternatively, it computes power given the number of clusters, cluster size,
and the values of the null and alternative proportions, or it computes the target proportion given the
number of clusters, cluster size, power, and the null proportion. See [PSS-2] power oneproportion
for a general discussion of power and sample-size analysis for a one-sample proportion test. Also see
[PSS-2] power for a general introduction to the power command using hypothesis tests.
Quick start
Compute number of clusters for a two-sided test of H0: π = 0.2 versus Ha: π ≠ 0.2 with null
proportion p0 = 0.2, alternative proportion pa = 0.1, and cluster size of 5, using default intraclass
correlation of 0.5, power of 0.8, and significance level α = 0.05
power oneproportion 0.2 0.1, m(5)
Same as above, but with an intraclass correlation of 0.7
power oneproportion 0.2 0.1, m(5) rho(0.7)
Same as above, but the cluster size varies with a coefficient of variation of 0.6
power oneproportion 0.2 0.1, m(5) rho(0.7) cvcluster(0.6)
Compute cluster size for 52 clusters
power oneproportion 0.2 0.1, k(52)
Power for 52 clusters with cluster size of 5
power oneproportion 0.2 0.1, k(52) m(5)
Power for 20, 30, 40, and 50 clusters
power oneproportion 0.2 0.1, k(20(10)50) m(5)
Same as above, but display results in a graph of power versus number of clusters
power oneproportion 0.2 0.1, k(20(10)50) m(5) graph
Effect size and target proportion for p0 = 0.2 with 40 clusters of size 5, power of 0.9, and α = 0.01,
and default direction upper
power oneproportion 0.2, k(40) m(5) power(0.9) alpha(0.01)
Menu
Statistics > Power, precision, and sample size
Syntax
Compute number of clusters
power oneproportion p0 pa, {m(numlist) | n(numlist) cluster} [options]
Compute cluster size
power oneproportion p0 pa, k(numlist) [options]
Compute power
power oneproportion p0 pa, k(numlist) {m(numlist) | n(numlist)} [options]
Compute effect size and target proportion
power oneproportion p0, k(numlist) {m(numlist) | n(numlist)} power(numlist) [options]
where p0 is the null (hypothesized) proportion or the value of the proportion under the null hypothesis
and pa is the alternative (target) proportion or the value of the proportion under the alternative
hypothesis. p0 and pa may each be specified either as one number or as a list of values in
parentheses (see [U] 11.1.8 numlist).
options Description
Main
cluster perform computations for a CRD; implied by k() or m()
alpha(numlist) significance level; default is alpha(0.05)
power(numlist) power; default is power(0.8)
beta(numlist) probability of type II error; default is beta(0.2)
k(numlist) number of clusters
m(numlist) cluster size
n(numlist) number of observations
nfractional allow fractional number of clusters, cluster size, and sample size
diff(numlist) difference between the alternative proportion and the null
proportion, pa − p0; specify instead of the
alternative proportion pa
rho(numlist) intraclass correlation; default is rho(0.5)
cvcluster(numlist) coefficient of variation for cluster sizes
direction(upper|lower) direction of the effect for effect-size determination; default is
direction(upper), which means that the postulated value
of the parameter is larger than the hypothesized value
onesided one-sided test; default is two sided
parallel treat number lists in starred options or in command arguments as
parallel when multiple values per option or argument are
specified (do not enumerate all possible combinations of values)
Table
[no]table[(tablespec)] suppress table or display results as a table;
see [PSS-2] power, table
saving(filename[, replace]) save the table data to filename; use replace to overwrite
existing filename
Graph
graph[(graphopts)] graph results; see [PSS-2] power, graph
Iteration
init(#) initial value for number of clusters, cluster size, or proportion
iterate(#) maximum number of iterations; default is iterate(500)
tolerance(#) parameter tolerance; default is tolerance(1e-12)
ftolerance(#) function tolerance; default is ftolerance(1e-12)
[no]log suppress or display iteration log
[no]dots suppress or display iterations as dots
notitle suppress the title
Specifying a list of values in at least two starred options, or at least two command arguments, or at least one
starred option and one argument results in computations for all possible combinations of the values; see
[U] 11.1.8 numlist. Also see the parallel option.
collect is allowed; see [U] 11.1.10 Prefix commands.
notitle does not appear in the dialog box.
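The difference between the default behavior and the parallel option can be pictured as a Cartesian product versus a position-by-position pairing. The Python sketch below only illustrates that idea; the two value lists are hypothetical, and the code does not call Stata or reproduce its internals.

```python
from itertools import product

k_values = [20, 40, 60]        # hypothetical values for k()
rho_values = [0.2, 0.5, 0.7]   # hypothetical values for rho()

# Default: one computation for every combination of the listed values
all_combinations = list(product(k_values, rho_values))

# parallel: values are matched by position instead of being crossed
parallel_pairs = list(zip(k_values, rho_values))

print(len(all_combinations))   # number of crossed scenarios
print(parallel_pairs)          # scenarios under position-wise pairing
```

With three values in each list, crossing yields nine scenarios, whereas pairing yields three.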
where tablespec is
column[:label] [column[:label] [...]] [, tableopts]
column is one of the columns defined below, and label is a column label (may contain quotes and
compound quotes).
column        Description                                     Symbol
alpha         significance level                              α
power         power                                           1 − β
beta          type II error probability                       β
K             number of clusters                              K
M             cluster size                                    M
N             number of observations                          N
delta         effect size                                     δ
p0            null proportion                                 p0
pa            alternative proportion                          pa
diff          difference between the alternative and null     pa − p0
                proportions
rho           intraclass correlation                          ρ
CV_cluster    coefficient of variation for cluster sizes      CV_cl
target        target parameter; synonym for pa
_all          display all supported columns
Column beta is shown in the default table in place of column power if specified.
Columns diff and CV_cluster are shown in the default table if specified.
Options
Main
cluster specifies that computations should be performed for a CRD. This option is implied when
either the k() or m() option is specified. It is required if the n() option is used to compute the
number of clusters.
alpha(), power(), beta(); see [PSS-2] power.
k(numlist) specifies the number of clusters. This option is required to compute the cluster size,
power, or effect size.
m(numlist) specifies the cluster size. This option or the n() option is required to compute the
number of clusters, power, or effect size. m() may contain noninteger values. In this case or if
the cvcluster() option is specified, m() represents the average cluster size.
n(numlist) specifies the number of observations. This option or the m() option is required to compute
the number of clusters, power, or effect size.
nfractional; see [PSS-2] power. The nfractional option is allowed when computing the number
of clusters and cluster size to display fractional (without rounding) values of the number of clusters,
cluster size, and sample size.
diff(numlist) specifies the difference between the alternative proportion and the null proportion,
pa − p0. You can specify either the alternative proportion pa as a command argument or the difference
between the two proportions in diff(). If you specify diff(#), the alternative proportion is
computed as pa = p0 + #. This option is not allowed with the effect-size determination.
rho(numlist) specifies the intraclass correlation. The default is rho(0.5).
cvcluster(numlist) specifies the coefficient of variation for cluster sizes. This option is used with
varying cluster sizes.
direction(), onesided, parallel; see [PSS-2] power.
Table
table, table(), notable; see [PSS-2] power, table.
saving(); see [PSS-2] power.
Graph
graph, graph(); see [PSS-2] power, graph. Also see the column table for a list of symbols used by
the graphs.
Iteration
init(#) specifies the initial value for the number of clusters or cluster size for sample-size
determination or the initial value for the proportion for the effect-size determination. The default is to
use a closed-form normal approximation to compute an initial value for the estimated parameter.
iterate(), tolerance(), ftolerance(), log, nolog, dots, nodots; see [PSS-2] power.
The following option is available with power oneproportion, cluster but is not shown in the
dialog box:
notitle; see [PSS-2] power.
Remarks and examples
Remarks are presented under the following headings:
Using power oneproportion, cluster
Computing number of clusters
Computing cluster size
Computing power
Computing effect size and target proportion
Performing hypothesis tests on proportion in a CRD
power oneproportion, cluster requests that computations for the power oneproportion
command be done for a CRD. In a CRD, groups of subjects or clusters are randomized instead of
individual subjects, so the sample size is determined by the number of clusters and the cluster size.
The sample-size determination thus consists of the determination of the number of clusters given
cluster size or the determination of cluster size given the number of clusters. For a general discussion
of using power oneproportion, see [PSS-2] power oneproportion. The discussion below is specific
to the CRD.
Using power oneproportion, cluster
If you specify the cluster option, include k() to specify the number of clusters or include m()
to specify the cluster size, the power oneproportion command will perform computations for a
one-sample proportion test in a CRD. The computations for a CRD are based on the large-sample Wald
z test.
All computations are performed for a two-sided hypothesis test where, by default, the significance
level is set to 0.05. You may change the significance level by specifying the alpha() option. You
can specify the onesided option to request a one-sided test.
To compute the number of clusters, you must specify the proportions under the null and alternative
hypotheses as command arguments p0 and pa, respectively, and specify the cluster size in the m()
option. Instead of specifying the m() option, you may specify the sample size in the n() option
and specify the cluster option, so that power oneproportion will perform its computation for a cluster
randomized design instead of the default individual-level design. You may also specify the power of
the test in the power() option.
To compute cluster size, you must specify the null proportion p0, the alternative proportion pa,
and the number of clusters in the k() option. You may also specify the power of the test in the
power() option.
To compute power, you must specify the number of clusters in the k() option, the cluster size
in the m() option or the sample size in the n() option, the null proportion p0, and the alternative
proportion pa.
Instead of the alternative proportion pa, you may specify the difference pa − p0 between the
alternative proportion and the null proportion in the diff() option when computing sample size or
power.
The effect size δ is defined as the difference between the alternative and null proportions. In a
CRD, the effect size δ is also adjusted for the cluster design; see Methods and formulas.
To compute effect size and the corresponding target proportion, you must specify the number of
clusters in the k() option, the cluster size in the m() option or the sample size in the n() option,
the power in the power() option, and the null proportion p0. You may also specify the direction of
the effect in the direction() option. The direction is upper by default, direction(upper); see
Using power oneproportion in [PSS-2] power oneproportion for other details.
All computations assume an intraclass correlation of 0.5. You can change this by specifying the
rho() option. Also, all clusters are assumed to be of the same size unless the coefficient of variation
for cluster sizes is specified in the cvcluster() option.
By default, the computed number of clusters, cluster size, and sample size are rounded up. However,
you can specify the nfractional option to see the corresponding fractional values; see Fractional
sample sizes in [PSS-4] Unbalanced designs for an example. If the cvcluster() option is specified
when computing cluster size, then cluster size represents the average cluster size and is thus not
rounded. When sample size is specified in the n() option, fractional cluster size may be reported to
accommodate the specified number of clusters and sample size.
Some of the computations performed by power oneproportion, cluster require iteration, such as computing
the number of clusters for a two-sided test; see Methods and formulas for details and [PSS-2] power
for the descriptions of options that control the iteration procedure.
Computing number of clusters
To compute the number of clusters, you must specify the proportions under the null and alternative
hypotheses as command arguments p0 and pa, respectively, and specify the cluster size in the m()
option. Instead of specifying the m() option, you may specify the sample size in the n() option
and specify the cluster option, so that power oneproportion will perform its computation for a cluster
randomized design instead of the default individual-level design. You may also specify the power of
the test in the power() option.
Example 1: Number of clusters for a one-sample proportion test in a CRD, specifying
cluster size
Ahn, Heo, and Zhang (2015, 33) demonstrate sample-size computations for a clustered binary
outcome by using the data from Hujoel, Moulton, and Loesche (1990) as pilot data. The data recorded
positive test results from an enzymatic diagnostic test (EDT) of a specific (target) infection. There
were 29 subjects in the study, and each subject had multiple infected sites, as determined by a gold
standard test, which were then retested for the presence of the target infection using the EDT. The
number of infected sites varied among subjects with an average of 4.897 sites, and observations within
a subject were correlated with an intraclass correlation of 0.2. Ahn, Heo, and Zhang (2015) used these
estimates to compute the required number of clusters for a new study to test whether the proportion
of infected sites detected by the EDT is 0.6, H0: p = 0.6, against the alternative Ha: p = 0.7.
We demonstrate how to use power oneproportion, cluster to compute the required number of
clusters.
For simplicity, we assume a constant cluster size across subjects and use an integer cluster size of
5. To detect a proportion of 0.7 against the reference value of 0.6 with 80% power using a 5%-level
two-sided test, we type
. power oneproportion 0.6 0.7, m(5) rho(0.2)
Performing iteration ...
Estimated number of clusters for a one-sample proportion test
Cluster randomized design, Wald z test
H0: p = p0 versus Ha: p != p0
Study parameters:
alpha = 0.0500
power = 0.8000
delta = 0.1000
p0 = 0.6000
pa = 0.7000
Cluster design:
M = 5
rho = 0.2000
Estimated number of clusters and sample size:
K = 60
N = 300
We find that given 5 sites per subject, 60 subjects and thus a total of 300 infected sites are required
to detect a proportion of 0.7 for the infection of interest against a reference proportion of 0.6 with
80% power using a 5%-level two-sided test. The effect size (delta) is calculated as the difference
between the alternative and null proportions.
Example 2: Number of clusters for a one-sample proportion test in a CRD, with varying
cluster sizes
Unlike the simplified case in example 1, in a practical study, the number of infected sites per
subject may vary. We use the average number of infected sites of 4.897 and a coefficient of variation
of 0.25. To account for varying cluster sizes, we specify m(4.897) and cvcluster(0.25).
. power oneproportion 0.6 0.7, m(4.897) rho(0.2) cvcluster(0.25)
Performing iteration ...
Estimated number of clusters for a one-sample proportion test
Cluster randomized design, Wald z test
H0: p = p0 versus Ha: p != p0
Study parameters:
alpha = 0.0500
power = 0.8000
delta = 0.1000
p0 = 0.6000
pa = 0.7000
Cluster design:
Average M = 4.8970
rho = 0.2000
CV_cl = 0.2500
Estimated number of clusters and sample size:
K = 61
N = 299
We now need 61 subjects for a total of 299 sites to achieve the same power.
Computing cluster size
To compute cluster size, you must specify the null proportion p0, the alternative proportion pa,
and the number of clusters in the k() option. You may also specify the power of the test in the
power() option.
Example 3: Cluster size for a one-sample proportion test in a CRD
Continuing with example 1, suppose that we are designing a new study and would like to recruit
80 subjects in the study. We would like to get an idea of how many infected sites we need to achieve
80% power. Given the study parameters from example 1, we compute the number of infected sites
by specifying 80 clusters in the k() option.
. power oneproportion 0.6 0.7, k(80) rho(0.2)
Performing iteration ...
Estimated cluster size for a one-sample proportion test
Cluster randomized design, Wald z test
H0: p = p0 versus Ha: p != p0
Study parameters:
alpha = 0.0500
power = 0.8000
delta = 0.1000
p0 = 0.6000
pa = 0.7000
Cluster design:
K = 80
rho = 0.2000
Estimated cluster size and sample size:
M = 3
N = 240
To achieve the desired power with 80 subjects, we will need to observe 3 sites per subject.
Computing power
To compute power, you must specify the number of clusters in the k() option, the cluster size
in the m() option or the sample size in the n() option, the null proportion p0, and the alternative
proportion pa.
Example 4: Power for a one-sample proportion test in a CRD
Continuing with example 1, suppose that we have 80 subjects and each subject has 5 infected
sites. Given the study parameters from example 1, we compute the power by specifying 80 clusters
in the k() option and cluster size of 5 in the m() option:
. power oneproportion 0.6 0.7, k(80) m(5) rho(0.2)
Estimated power for a one-sample proportion test
Cluster randomized design, Wald z test
H0: p = p0 versus Ha: p != p0
Study parameters:
alpha = 0.0500
delta = 0.1000
p0 = 0.6000
pa = 0.7000
Cluster design:
K = 80
M = 5
N = 400
rho = 0.2000
Estimated power:
power = 0.9020
The computed power is about 90%.
Example 5: Multiple values of study parameters
To investigate the effect of the number of clusters on power, we can specify a list of numbers in
the k() option:
. power oneproportion 0.6 0.7, k(20(20)100) m(5) rho(0.2)
Estimated power for a one-sample proportion test
Cluster randomized design, Wald z test
H0: p = p0 versus Ha: p != p0
  alpha   power       K       M       N   delta      p0      pa     rho
    .05   .3696      20       5     100      .1      .6      .7      .2
    .05   .6332      40       5     200      .1      .6      .7      .2
    .05   .8043      60       5     300      .1      .6      .7      .2
    .05    .902      80       5     400      .1      .6      .7      .2
    .05   .9532     100       5     500      .1      .6      .7      .2
As expected, as the number of clusters increases, the power tends to get closer to 1.
For multiple values of parameters, the results are automatically displayed in a table, as we see
above. For more examples of tables, see [PSS-2] power, table. If you wish to produce a power plot,
see [PSS-2] power, graph.
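As a rough cross-check, the power column of the table above can be recomputed from the two-sided power formula given in Methods and formulas. The Python sketch below is an independent illustration that uses only the standard library; it is not Stata code, and the variable names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

norm = NormalDist()  # standard normal distribution

# Study parameters of example 5
p0, pa, m, rho, alpha = 0.6, 0.7, 5, 0.2, 0.05

de = 1 + rho * (m - 1)                         # design effect
pstd = (pa - p0) / sqrt(pa * (1 - pa) * de)    # standardized difference
z = norm.inv_cdf(1 - alpha / 2)                # two-sided critical value

powers = {}
for k in range(20, 101, 20):
    shift = sqrt(k * m) * pstd
    powers[k] = norm.cdf(shift - z) + norm.cdf(-shift - z)
    print(f"K = {k:3d}  N = {k * m:3d}  power = {powers[k]:.4f}")
```

The printed values should agree with the table to the displayed precision.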
Computing effect size and target proportion
The effect size δ is defined as the difference between the alternative and null proportions. In a
CRD, the effect size δ is also adjusted for the cluster design; see Methods and formulas.
To compute effect size and the corresponding target proportion, you must specify the number of
clusters in the k() option, the cluster size in the m() option or the sample size in the n() option,
the power in the power() option, and the null proportion p0. You may also specify the direction of
the effect in the direction() option. The direction is upper by default, direction(upper); see
Using power oneproportion in [PSS-2] power oneproportion for other details.
Example 6: Effect size for a one-sample proportion test in a CRD
Continuing with example 4, we may also be interested in finding the minimum value of the
proportion that can be detected with a sample of 80 subjects, 5 infected sites per subject, and 80%
power. To compute this, we specify the null value of 0.6 as the command argument and the required
options k(80), m(5), and power(0.8) and continue to use rho(0.2).
. power oneproportion 0.6, k(80) m(5) power(0.8) rho(0.2)
Performing iteration ...
Estimated target proportion for a one-sample proportion test
Cluster randomized design, Wald z test
H0: p = p0 versus Ha: p != p0; pa > p0
Study parameters:
alpha = 0.0500
power = 0.8000
p0 = 0.6000
Cluster design:
K = 80
M = 5
N = 400
rho = 0.2000
Estimated effect size and target proportion:
delta = 0.0871
pa = 0.6871
Given the null value of 0.6, the minimum detectable value of the proportion is about 0.69, which is
slightly smaller than the alternative proportion of 0.7 used in previous examples, because here we use
more subjects than, for instance, in example 1, more sites per subject than in example 3, and lower
power than in example 4.
Performing hypothesis tests on proportion in a CRD
power oneproportion, cluster performs PSS computations based on a large-sample test of
proportion that accounts for a CRD or for clustered data. We can perform this test by using prtest,
cluster(); see [R] prtest. In this section, we briefly demonstrate how to test the hypothesis that
the proportion is different from a reference value on the collected clustered data by using prtest.
Example 7: Testing for proportion with clustered data
Ahn, Heo, and Zhang (2015, 33) report the data from Hujoel, Moulton, and Loesche (1990) on
positive test results from the EDT; see example 1 for details about the study. Let’s use prtest to test
the null hypothesis H0: p = 0.6.
For clustered data, prtest requires that we specify the cluster identifier in the cluster() option
and population intraclass correlation in the rho() option. We use the intraclass correlation of 0.2 as
in Ahn, Heo, and Zhang (2015, 33).
. use https://www.stata-press.com/data/r18/infection
(Target infections detected by EDT (Hujoel, Moulton, and Loesche 1990))
. prtest infection == 0.6, cluster(subject) rho(0.2)
One-sample test of proportion            Number of obs      =    142
Cluster variable: subject                Number of clusters =     29
                                         Avg. cluster size  =   4.90
                                         CV cluster size    = 0.2419
                                         Intraclass corr.   = 0.2000

    Variable        Mean   Std. err.     [95% conf. interval]
   infection    .6619718    .0537974     .5565308    .7674129

    p = proportion(infection)                     z =   1.1123
H0: p = 0.6
    Ha: p < 0.6            Ha: p != 0.6             Ha: p > 0.6
 Pr(Z < z) = 0.8670   Pr(|Z| > |z|) = 0.2660   Pr(Z > z) = 0.1330
We do not find any statistical evidence to reject the null hypothesis of H0: p = 0.6.
Suppose that we want to design a new similar study and use the estimates from this study to
compute the required number of clusters. We are interested in detecting the alternative value of, say,
0.66 with 80% power for a 5%-level two-sided test. To compute the required number of clusters, we
use the average cluster size of 4.9 as observed in this study.
. power oneproportion 0.6 0.66, m(4.9) rho(0.2)
Performing iteration ...
Estimated number of clusters for a one-sample proportion test
Cluster randomized design, Wald z test
H0: p = p0 versus Ha: p != p0
Study parameters:
alpha = 0.0500
power = 0.8000
delta = 0.0600
p0 = 0.6000
pa = 0.6600
Cluster design:
Average M = 4.9000
rho = 0.2000
Estimated number of clusters and sample size:
K = 178
N = 873
We need 178 subjects to detect the 0.06 difference between the alternative and null proportions, given
the null proportion of 0.6, with 80% power using a 5%-level two-sided test.
Stored results
power oneproportion, cluster stores the following in r():
Scalars
r(alpha) significance level
r(power) power
r(beta) probability of a type II error
r(delta) effect size
r(K) number of clusters
r(M) cluster size
r(N) number of subjects
r(nfractional) 1 if nfractional is specified, 0 otherwise
r(onesided) 1 for a one-sided test, 0 otherwise
r(p0) proportion under the null hypothesis
r(pa) proportion under the alternative hypothesis
r(diff) difference between the alternative and null proportions
r(rho) intraclass correlation
r(CV_cluster) coefficient of variation for cluster sizes
r(separator) number of lines between separator lines in the table
r(divider) 1 if divider is requested in the table, 0 otherwise
r(init) initial value for estimated parameter
r(maxiter) maximum number of iterations
r(iter) number of iterations performed
r(tolerance) requested parameter tolerance
r(deltax) final parameter tolerance achieved
r(ftolerance) requested distance of the objective function from zero
r(function) final distance of the objective function from zero
r(converged) 1 if iteration algorithm converged, 0 otherwise
Macros
r(type) test
r(method) oneproportion
r(design) CRD
r(test) wald
r(direction) upper or lower
r(columns) displayed table columns
r(labels) table column labels
r(widths) table column widths
r(formats) table column formats
Matrices
r(pss_table) table of results
Methods and formulas
The computation for a CRD is based on the Wald test under the large-sample normal approximation,
adjusted for the cluster design; see Large-sample normal approximation under Methods and formulas
in [PSS-2] power oneproportion for the common notation for a one-sample proportion test.
Methods and formulas are presented under the following headings:
Equal cluster sizes
Unequal cluster sizes
Equal cluster sizes
In a CRD, let $K$ be the number of clusters, $M$ be the number of observations in each cluster, and
$n$ be the total number of subjects, where $n = MK$. Let $x_{ij}$ be the outcome of a Bernoulli trial of
the $j$th ($j = 1, 2, \ldots, M$) observation from the $i$th cluster ($i = 1, 2, \ldots, K$). Let $\rho$ be the intraclass
correlation and DE be the design effect defined as
$$\mathrm{DE} = 1 + \rho(M - 1)$$
Let $P(x_{ij} = 1) = p$ denote the probability of a success in the population. Each individual
observation is a Bernoulli trial with a success probability $p$. Let
$$\widehat{p} = \frac{1}{n}\sum_{i=1}^{K}\sum_{j=1}^{M} x_{ij} \qquad\text{and}\qquad \mathrm{se}(\widehat{p}) = \sqrt{\frac{\widehat{p}(1 - \widehat{p})\,\mathrm{DE}}{n}}$$
denote the sample proportion and its standard error, respectively. Let $p_0$ and $p_a$ denote the respective
null and alternative values of the proportion parameters.
For a large sample, the distribution of the sample proportion $\widehat{p}$ may be approximated by the
normal distribution with mean $p$ and variance $p(1 - p)\mathrm{DE}/n$. The Wald test statistic
$z = (\widehat{p} - p_0)/\sqrt{\widehat{p}(1 - \widehat{p})\mathrm{DE}/n}$ under the null hypothesis follows a standard normal distribution; see, for
example, Ahn, Heo, and Zhang (2015).
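The quantities defined so far can be traced on a small data set. The Python sketch below uses made-up data (4 clusters of 3 binary outcomes), an assumed intraclass correlation of 0.2, and an assumed null proportion of 0.5; it simply evaluates the formulas for DE, the sample proportion, its standard error, and the Wald z statistic as written above, and is not meant to reproduce any Stata command.

```python
from math import sqrt

# Hypothetical data: K = 4 clusters, M = 3 binary outcomes in each
clusters = [[1, 0, 1], [1, 1, 0], [0, 0, 1], [1, 1, 1]]
rho = 0.2   # assumed intraclass correlation
p0 = 0.5    # assumed null proportion

k = len(clusters)
m = len(clusters[0])
n = k * m

de = 1 + rho * (m - 1)                       # design effect DE
p_hat = sum(sum(c) for c in clusters) / n    # sample proportion
se = sqrt(p_hat * (1 - p_hat) * de / n)      # cluster-adjusted standard error
z = (p_hat - p0) / se                        # Wald z statistic

print(f"DE = {de:.2f}, p_hat = {p_hat:.4f}, se = {se:.4f}, z = {z:.4f}")
```

Setting rho to 0 gives DE = 1, and the standard error reduces to the usual one for independent observations.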
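Let $\alpha$ be the significance level, $\beta$ be the probability of a type II error, and $z_{1-\alpha}$ and $z_{\beta}$ be the
$(1 - \alpha)$th and the $\beta$th quantiles of the standard normal distribution. Let
$$p_{\mathrm{std}} = \frac{p_a - p_0}{\sqrt{p_a(1 - p_a)\,\mathrm{DE}}} \tag{1}$$
The power $\pi = 1 - \beta$ is computed using
$$\pi = \begin{cases}
\Phi\left(\sqrt{n}\,p_{\mathrm{std}} - z_{1-\alpha}\right) & \text{for an upper one-sided test} \\
\Phi\left(-\sqrt{n}\,p_{\mathrm{std}} - z_{1-\alpha}\right) & \text{for a lower one-sided test} \\
\Phi\left(\sqrt{n}\,p_{\mathrm{std}} - z_{1-\alpha/2}\right) + \Phi\left(-\sqrt{n}\,p_{\mathrm{std}} - z_{1-\alpha/2}\right) & \text{for a two-sided test}
\end{cases} \tag{2}$$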
where Φ(·) is the c.d.f. of the standard normal distribution.
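Equations (1) and (2) can be checked numerically. The following Python sketch is an illustration under the definitions above, not Stata's implementation; the function names `p_std` and `crd_power` are hypothetical, and only the standard library is used. With the inputs of example 4, it returns a power of about 0.902.

```python
from math import sqrt
from statistics import NormalDist

_norm = NormalDist()  # standard normal distribution


def p_std(p0, pa, m, rho):
    """Standardized difference from equation (1), equal cluster sizes."""
    de = 1 + rho * (m - 1)  # design effect DE
    return (pa - p0) / sqrt(pa * (1 - pa) * de)


def crd_power(p0, pa, k, m, rho, alpha=0.05, sides="two"):
    """Power from equation (2); sides is 'upper', 'lower', or 'two'."""
    n = k * m
    shift = sqrt(n) * p_std(p0, pa, m, rho)
    if sides == "upper":
        return _norm.cdf(shift - _norm.inv_cdf(1 - alpha))
    if sides == "lower":
        return _norm.cdf(-shift - _norm.inv_cdf(1 - alpha))
    z = _norm.inv_cdf(1 - alpha / 2)
    return _norm.cdf(shift - z) + _norm.cdf(-shift - z)


# Inputs of example 4: 80 clusters of size 5, rho = 0.2
power_ex4 = crd_power(0.6, 0.7, k=80, m=5, rho=0.2)
print(f"{power_ex4:.4f}")
```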
Given the cluster size $M$, the number of clusters $K$ for a one-sided test is computed by inverting
a one-sided power equation from (2),
$$K = \left(\frac{z_{1-\alpha} - z_{\beta}}{p_{\mathrm{std}}\sqrt{M}}\right)^{2} \tag{3}$$
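A short numerical sketch of equation (3) follows, in Python with hypothetical function and variable names. The first call evaluates a one-sided test at α = 0.05 for the study parameters of example 1. The second call uses α/2, which is how this entry describes the starting value for the two-sided iteration; rounded up, it happens to coincide with the K = 60 reported in example 1.

```python
from math import ceil, sqrt
from statistics import NormalDist

norm = NormalDist()


def clusters_one_sided(p0, pa, m, rho, alpha=0.05, power=0.8):
    """Fractional number of clusters K from equation (3)."""
    de = 1 + rho * (m - 1)
    pstd = (pa - p0) / sqrt(pa * (1 - pa) * de)
    z_alpha = norm.inv_cdf(1 - alpha)   # z_{1-alpha}
    z_beta = norm.inv_cdf(1 - power)    # z_beta, negative when power > 0.5
    return ((z_alpha - z_beta) / (pstd * sqrt(m))) ** 2


k_frac = clusters_one_sided(0.6, 0.7, m=5, rho=0.2)
k_start = clusters_one_sided(0.6, 0.7, m=5, rho=0.2, alpha=0.05 / 2)
print(round(k_frac, 2), ceil(k_frac))
print(round(k_start, 2), ceil(k_start))
```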
Given the sample size $n$, the number of clusters $K$ for a one-sided test is computed as
$$K = n\left\{\frac{n(p_a - p_0)^{2}}{\rho\,p_a(1 - p_a)(z_{1-\alpha} - z_{\beta})^{2}} - \frac{1}{\rho} + 1\right\}^{-1} \tag{4}$$
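Equation (4) follows from setting $M = n/K$ in DE and solving the one-sided power equation for $K$. The Python sketch below, with a hypothetical function name and a hypothetical total of 300 observations, evaluates that expression and then plugs the result back into equations (1) and (2) as a round-trip check.

```python
from math import sqrt
from statistics import NormalDist

norm = NormalDist()


def clusters_given_n(p0, pa, n, rho, alpha=0.05, power=0.8):
    """Fractional number of clusters K from equation (4), one-sided test."""
    z_diff = norm.inv_cdf(1 - alpha) - norm.inv_cdf(1 - power)
    ratio = n * (pa - p0) ** 2 / (rho * pa * (1 - pa) * z_diff ** 2)
    return n / (ratio - 1 / rho + 1)


# Hypothetical inputs: 300 observations in total, rho = 0.2
n, rho = 300, 0.2
k = clusters_given_n(0.6, 0.7, n=n, rho=rho)

# Round trip: the implied cluster size and design effect should give power 0.8
m = n / k
de = 1 + rho * (m - 1)
pstd = (0.7 - 0.6) / sqrt(0.7 * 0.3 * de)
achieved = norm.cdf(sqrt(n) * pstd - norm.inv_cdf(0.95))
print(round(k, 2), round(achieved, 4))
```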
Given the number of clusters $K$, the cluster size $M$ for a one-sided test is computed by solving
(2), after substituting $p_{\mathrm{std}}$ from (1),
$$M = \frac{1 - \rho}{\dfrac{K(p_a - p_0)^{2}}{p_a(1 - p_a)(z_{1-\alpha} - z_{\beta})^{2}} - \rho} \tag{5}$$
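The sketch below evaluates equation (5) in Python for the inputs of example 3, using α/2 in place of α as a stand-in for the two-sided test. The function name is hypothetical. The guard reflects the fact that the denominator of (5) must be positive: as $M$ grows, the variance of the sample proportion approaches $p_a(1 - p_a)\rho/K$, so too few clusters cannot be compensated by larger clusters.

```python
from math import ceil
from statistics import NormalDist

norm = NormalDist()


def cluster_size_one_sided(p0, pa, k, rho, alpha=0.05, power=0.8):
    """Fractional cluster size M from equation (5)."""
    z_diff = norm.inv_cdf(1 - alpha) - norm.inv_cdf(1 - power)
    ratio = k * (pa - p0) ** 2 / (pa * (1 - pa) * z_diff ** 2)
    if ratio <= rho:
        raise ValueError("no finite cluster size attains the requested power")
    return (1 - rho) / (ratio - rho)


# Inputs of example 3, with alpha/2 standing in for the two-sided test
m_start = cluster_size_one_sided(0.6, 0.7, k=80, rho=0.2, alpha=0.05 / 2)
print(round(m_start, 3), ceil(m_start))
```

Rounded up, the result is 3, in line with the cluster size reported in example 3.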
The number of clusters and cluster size for a two-sided test are computed iteratively using the
two-sided power equation from (2). The initial values are obtained from (3), (4), and (5), with α
replaced by α/2. The minimum detectable value of the proportion is computed iteratively using the
corresponding power equation from (2).
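The iterative step can be sketched with any one-dimensional root finder. The Python example below uses plain bisection for simplicity; this entry does not state which algorithm Stata uses, so the solver and all function names here are assumptions made for illustration. It solves the two-sided power equation for the number of clusters in example 1 and for the target proportion in example 6.

```python
from math import ceil, sqrt
from statistics import NormalDist

norm = NormalDist()


def power_two_sided(p0, pa, k, m, rho, alpha=0.05):
    """Two-sided power from equation (2) with equal cluster sizes."""
    de = 1 + rho * (m - 1)
    shift = sqrt(k * m) * (pa - p0) / sqrt(pa * (1 - pa) * de)
    z = norm.inv_cdf(1 - alpha / 2)
    return norm.cdf(shift - z) + norm.cdf(-shift - z)


def bisect(f, lo, hi, tol=1e-10):
    """Root of an increasing function f on [lo, hi] by bisection."""
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if f(mid) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


# Number of clusters for example 1 (power increases with K)
k_frac = bisect(lambda k: power_two_sided(0.6, 0.7, k, 5, 0.2) - 0.8, 1, 1000)

# Target proportion for example 6, direction upper (power increases with pa > p0)
pa = bisect(lambda p: power_two_sided(0.6, p, 80, 5, 0.2) - 0.8,
            0.6 + 1e-9, 1 - 1e-9)

print(ceil(k_frac), round(pa, 4))
```

The rounded-up number of clusters and the target proportion should agree with the values shown in examples 1 and 6.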
Unequal cluster sizes
For unequal cluster sizes, we assume that the cluster sizes are independent and identically distributed
and are small relative to the number of clusters; see Ahn, Heo, and Zhang (2015) for details. Let
the coefficient of variation of the cluster sizes be $\mathrm{CV}_{cl}$. According to van Breukelen, Candel, and
Berger (2007) and Campbell and Walters (2014), to adjust for varying cluster sizes, define the relative
efficiency (RE) of unequal versus equal cluster sizes as
$$\mathrm{RE} = 1 - \lambda(1 - \lambda)\,\mathrm{CV}_{cl}^{2}$$
where $\lambda = \rho M/(\rho M + 1 - \rho)$. With unequal cluster sizes, $p_{\mathrm{std}}$ becomes
$$p_{\mathrm{std}} = \frac{p_a - p_0}{\sqrt{p_a(1 - p_a)\,\mathrm{DE}/\mathrm{RE}}} \tag{6}$$
With $p_{\mathrm{std}}$ as defined in (6), we can obtain the formula for computing the number of clusters given
cluster size for a one-sided test using (3). In all other cases, parameters are computed iteratively using
the power equations in (2) with $p_{\mathrm{std}}$ from (6).
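The adjustment for varying cluster sizes can be traced numerically with the inputs of example 2. The Python sketch below combines equation (3) with $p_{\mathrm{std}}$ from (6), using α/2 in place of α as a stand-in for the two-sided test. The function name is hypothetical, and taking the total sample size as the rounded-up product of K and the average cluster size is an assumption that is consistent with the outputs shown in this entry.

```python
from math import ceil, sqrt
from statistics import NormalDist

norm = NormalDist()


def clusters_unequal(p0, pa, m_avg, rho, cv, alpha=0.05, power=0.8):
    """Fractional K from equation (3) with p_std taken from equation (6)."""
    de = 1 + rho * (m_avg - 1)                     # design effect DE
    lam = rho * m_avg / (rho * m_avg + 1 - rho)    # lambda
    re = 1 - lam * (1 - lam) * cv ** 2             # relative efficiency RE
    pstd = (pa - p0) / sqrt(pa * (1 - pa) * de / re)
    z_diff = norm.inv_cdf(1 - alpha) - norm.inv_cdf(1 - power)
    return (z_diff / (pstd * sqrt(m_avg))) ** 2


# Inputs of example 2, with alpha/2 standing in for the two-sided test
k_frac = clusters_unequal(0.6, 0.7, m_avg=4.897, rho=0.2, cv=0.25, alpha=0.025)
k = ceil(k_frac)
print(round(k_frac, 2), k, ceil(k * 4.897))
```

Because RE is below 1 whenever the cluster sizes vary, the required number of clusters is slightly larger than with equal cluster sizes of the same average.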
References
Ahn, C., M. Heo, and S. Zhang. 2015. Sample Size Calculations for Clustered and Longitudinal Outcomes in Clinical
Research. Boca Raton, FL: CRC Press.
Campbell, M. J., and S. J. Walters. 2014. How to Design, Analyse and Report Cluster Randomised Trials in Medicine
and Health Related Research. Chichester, UK: Wiley.
Gallis, J. A., F. Li, H. Yu, and E. L. Turner. 2018. cvcrand and cptest: Commands for efficient design and analysis
of cluster randomized trials using constrained randomization and permutation tests. Stata Journal 18: 357–378.
Hujoel, P. P., L. H. Moulton, and W. J. Loesche. 1990. Estimation of sensitivity and specificity of site-specific
diagnostic tests. Journal of Periodontal Research 25: 193–196. https://doi.org/10.1111/j.1600-0765.1990.tb00903.x.
van Breukelen, G. J. P., M. J. J. M. Candel, and M. P. F. Berger. 2007. Relative efficiency of unequal
versus equal cluster sizes in cluster randomized and multicentre trials. Statistics in Medicine 26: 2589–2603.
https://doi.org/10.1002/sim.2740.
Also see
[PSS-2] power oneproportion Power analysis for a one-sample proportion test
[PSS-2] power Power and sample-size analysis for hypothesis tests
[PSS-2] power, graph Graph results from the power command
[PSS-2] power, table Produce table of results from the power command
[PSS-5] Glossary
[R] prtest Tests of proportions
Stata, Stata Press, and Mata are registered trademarks of StataCorp LLC. Stata and
Stata Press are registered trademarks with the World Intellectual Property Organization
of the United Nations. StataNow and NetCourseNow are trademarks of StataCorp
LLC. Other brand and product names are registered trademarks or trademarks of their
respective companies. Copyright © 1985–2023 StataCorp LLC, College Station, TX,
USA. All rights reserved.
For suggested citations, see the FAQ on citing Stata documentation.