Suppose our evaluation of farmers’ adoption of
the new practice affected only 2,000 farmers. The
sample size that would now be necessary is shown in
Equation 4.
As you can see, this adjustment (called the finite
population correction) can substantially reduce the
necessary sample size for small populations.
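The adjustment can be sketched in Python, assuming the commonly used form of the finite population correction, n = n0 / (1 + (n0 - 1) / N). The function name and the unadjusted sample size n0 = 385 (the usual value for 95% confidence, P = .5, and ±5% precision) are illustrative assumptions, not values taken from Equation 4.

```python
import math


def finite_population_correction(n0: float, population: int) -> float:
    """Adjust an initial sample size n0 for a finite population of size N.

    Assumes the commonly cited form n = n0 / (1 + (n0 - 1) / N).
    """
    if n0 <= 0 or population <= 0:
        raise ValueError("n0 and population must be positive")
    return n0 / (1 + (n0 - 1) / population)


# Assumed inputs: n0 = 385 (hypothetical unadjusted size), N = 2,000 farmers.
adjusted = finite_population_correction(385, 2000)
print(math.ceil(adjusted))  # 323
```

Under these assumptions the required sample falls from 385 to about 323, and the reduction shrinks as N grows relative to n0.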
A Simplified Formula For Proportions
Yamane (1967:886) provides a simplified formula
to calculate sample sizes. This formula was used to
calculate the sample sizes in Tables 2 and 3 and is
shown below. A 95% confidence level and P = .5 are
assumed for Equation 5.
Where n is the sample size, N is the population size,
and e is the level of precision. When this formula is
applied to the above sample, we get Equation 6.
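As a minimal sketch, the simplified formula is commonly written n = N / (1 + N e^2), using the symbols defined above. The function name below is hypothetical, and the precision of ±5% is an assumed value for the 2,000-farmer example.

```python
def yamane_sample_size(population: int, precision: float) -> float:
    """Simplified sample size for a proportion: n = N / (1 + N * e**2).

    Assumes a 95% confidence level and P = .5, as stated for Equation 5.
    """
    if population <= 0 or not 0 < precision < 1:
        raise ValueError("population must be positive and 0 < precision < 1")
    return population / (1 + population * precision ** 2)


# Assumed inputs: N = 2,000 farmers, e = 0.05 (+/-5% precision).
n = yamane_sample_size(2000, 0.05)
print(round(n, 1))  # 333.3
```

Under these assumptions the formula calls for roughly 333 farmers.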
Formula For Sample Size For The Mean
The use of tables and formulas to determine
sample size in the above discussion employed
proportions that assume a dichotomous response for
the attributes being measured. There are two
methods to determine sample size for variables that
are polytomous or continuous. One method is to
combine responses into two categories and then use
a sample size based on proportion (Smith, 1983).
The second method is to use the formula for the
sample size for the mean. The formula for the sample
size for the mean is similar to that for the proportion,
except for the measure of variability. The formula for
the mean employs σ² instead of (p × q), as shown in
Equation 7.
Where n₀ is the sample size, z is the abscissa of the
normal curve that cuts off an area α at the tails, e is
the desired level of precision (in the same unit of
measure as the variance), and σ² is the variance of an
attribute in the population.
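Given these definitions, the formula for the mean takes the form n₀ = z²σ² / e². The sketch below computes it in Python. The function name and the input values (a standard deviation of 10 units and a precision of ±2 units) are hypothetical and chosen only to illustrate the arithmetic.

```python
def sample_size_for_mean(z: float, sigma: float, e: float) -> float:
    """Sample size for estimating a mean: n0 = z**2 * sigma**2 / e**2.

    z     -- normal-curve abscissa for the chosen confidence level
    sigma -- estimated population standard deviation (sigma**2 is the variance)
    e     -- desired level of precision
    """
    if sigma <= 0 or e <= 0:
        raise ValueError("sigma and e must be positive")
    return (z ** 2) * (sigma ** 2) / (e ** 2)


# Hypothetical inputs: 95% confidence (z = 1.96), sigma = 10, e = 2.
n0 = sample_size_for_mean(z=1.96, sigma=10.0, e=2.0)
print(round(n0, 2))  # 96.04
```

Because n₀ scales with σ², doubling the assumed standard deviation quadruples the required sample, which is why a poor variance estimate is so costly.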
The disadvantage of the sample size based on the
mean is that a "good" estimate of the population
variance is necessary. Often, an estimate is not
available. Furthermore, the sample size can vary
widely from one attribute to another because each is
likely to have a different variance. Because of these
problems, the sample size for the proportion is
frequently preferred.²
OTHER CONSIDERATIONS
In completing this discussion of determining
sample size, there are three additional issues. First,
the above approaches to determining sample size have
assumed that a simple random sample is the sampling
design. More complex designs, e.g., stratified random
samples, must take into account the variances of
subpopulations, strata, or clusters before an estimate
of the variability in the population as a whole can be
made.
Another consideration with sample size is the
number needed for the data analysis. If descriptive
statistics are to be used, e.g., mean, frequencies, then
nearly any sample size will suffice. On the other
hand, a good-sized sample, e.g., 200-500, is needed for
multiple regression, analysis of covariance, or log-
linear analysis, which might be performed for more
rigorous state impact evaluations. The sample size
should be appropriate for the analysis that is planned.
In addition, an adjustment in the sample size may
be needed to accommodate a comparative analysis of
subgroups (e.g., an evaluation comparing program
participants with nonparticipants). Sudman (1976)
suggests that a minimum of 100 elements is needed
for each major group or subgroup in the sample and
that 20 to 50 elements are needed for each minor
subgroup. Similarly, Kish (1965) says that
30 to 200 elements are sufficient when the attribute is
present 20 to 80 percent of the time (i.e., the
distribution approaches normality). On the other
hand, skewed distributions can result in serious
departures from normality even for moderate-size
samples (Kish, 1965:17). In such cases, a larger sample or a
census is required.
Finally, the sample size formulas provide the
number of responses that need to be obtained. Many
researchers commonly add 10% to the sample size to
compensate for persons that the researcher is unable