The Research Experience - Ann Sloan Devlin
Type II Errors: Sample Size, Power, and Effect Size
In a Type II error, we fail to reject the null hypothesis when we should have done so. Often the problem is having too few participants; therefore, having an adequate sample size is the primary way to address this problem. In general, larger sample sizes produce more power (see next section). Power is the ability to evaluate your hypothesis adequately. Formally, power is the probability of rejecting Ho (the null hypothesis), assuming Ho is false.
Power: Probability of rejecting Ho (the null hypothesis), assuming Ho is false.
When a study has sufficient power, you can adequately test whether the null hypothesis should be rejected. Without sufficient power, it may not be worthwhile to conduct a study. If findings are nonsignificant, you won’t be able to tell whether (a) you missed an effect or (b) no effect exists.
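To make the cost of low power concrete, the following is a minimal simulation sketch, not a procedure from the text. It assumes two groups drawn from normal populations with a known standard deviation of 1, a two-sided z-test, and a true effect of d standard deviations; the function name `type_ii_rate` and all parameter values are hypothetical.

```python
import random
from statistics import NormalDist, mean


def type_ii_rate(d, n_per_group, alpha=0.05, n_studies=2000, seed=1):
    """Estimate how often a real effect of size d goes undetected.

    Assumption: both populations are normal with SD = 1, so a z-test applies.
    """
    rng = random.Random(seed)                     # fixed seed for repeatable results
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value
    se = (2 / n_per_group) ** 0.5                 # SE of the mean difference when SD = 1
    misses = 0
    for _ in range(n_studies):
        control = [rng.gauss(0, 1) for _ in range(n_per_group)]
        treated = [rng.gauss(d, 1) for _ in range(n_per_group)]
        z_stat = (mean(treated) - mean(control)) / se
        if abs(z_stat) < z_crit:                  # nonsignificant although Ho is false
            misses += 1
    return misses / n_studies


if __name__ == "__main__":
    for n in (20, 100):
        print(n, type_ii_rate(0.5, n))
```

Under these assumptions, the small-sample study misses the real effect in most simulated replications, while the larger study rarely does. Power is simply 1 minus the returned rate.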
There are several reasons why you might not be able to reject the null hypothesis, assuming Ho is false. Your experimental design may be flawed or suffer from other threats to internal validity. Internal validity refers to whether the research design enables you to measure your variables of interest rigorously. All aspects of your research may pose threats to internal validity, such as equipment malfunction, participants who talk during the experiment, or measures with low internal consistency (see Chapters 2 and 5). Low power is another threat to internal validity.
Four factors are generally recognized as impacting the power of the study. In discussing these, David Howell (2013, p. 232) listed (1) the designated alpha level, (2) the true alternative hypothesis (essentially how large the difference between Ho and H1 is), (3) the sample size, and (4) the specific statistical test to be used. In his view, sample size rises to the top as the easiest way to control the power of your study.
Alternative hypothesis: Hypothesis you have stated will be true if the null hypothesis is rejected.
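The first three factors Howell lists can be explored with a short calculation. The sketch below uses a normal (z) approximation for a two-sided comparison of two equal-sized groups; it is an illustration under that assumption, not Howell's own procedure, and the function name `approx_power` is hypothetical. The fourth factor, the choice of statistical test, is held fixed here.

```python
from statistics import NormalDist


def approx_power(d, n_per_group, alpha=0.05):
    """Approximate power for a two-sided, two-group comparison of means.

    d is the true difference between Ho and H1 in standard deviation units.
    The z approximation is slightly optimistic for small samples compared
    with a calculation based on the t distribution.
    """
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)      # factor 1: the designated alpha level
    shift = d * (n_per_group / 2) ** 0.5   # factors 2 and 3: effect size and sample size
    # Probability that the test statistic falls in either rejection region
    return (1 - z.cdf(z_crit - shift)) + z.cdf(-z_crit - shift)


if __name__ == "__main__":
    print(approx_power(0.5, 64))              # baseline
    print(approx_power(0.5, 64, alpha=0.01))  # stricter alpha lowers power
    print(approx_power(0.8, 64))              # larger true difference raises power
    print(approx_power(0.5, 128))             # more participants raise power
```

When d is 0 (Ho is true), the function returns alpha, which is the Type I error rate rather than power in the usual sense.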
Power is associated with effect size (as defined in Chapter 2; Cohen, 1988), which is discussed next. Effect size describes what its label suggests: how large an impact or effect an intervention of interest has. Consider two means of interest (one when Ho is true and one when Ho is false) and the distributions of the populations from which they were drawn. How much do these distributions overlap? If the means are far apart and there is little overlap, you have a large effect size; if they are close together and there is a lot of overlap, you have a small effect size. Effect size is indicated by d and represents the difference between means in standard deviation units. Statistical programs generally have an option for providing estimates of both power and effect size, and authors are often asked to include an estimate of effect size in their manuscripts (Howell, 2013).
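As an illustration of "difference between means in standard deviation units," here is a minimal sketch that computes d from two samples. It assumes the pooled standard deviation is used as the standardizer, which is one common choice; the function name `cohens_d` and the scores are made up for the example.

```python
from statistics import mean, variance


def cohens_d(group_a, group_b):
    """Difference between two sample means in pooled standard deviation units."""
    n_a, n_b = len(group_a), len(group_b)
    # Pooled variance: weighted average of the two sample variances
    pooled_var = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / pooled_var ** 0.5


if __name__ == "__main__":
    # Hypothetical scores, invented for illustration only
    treatment = [12, 14, 15, 16, 18]
    control = [10, 12, 13, 14, 16]
    print(round(cohens_d(treatment, control), 2))
```

With these invented scores, the means differ by 2 points and the pooled standard deviation is about 2.24, so d is roughly 0.89.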
In the literature, you will see descriptions of effect sizes as small (.20), medium (.50), and large (.80 and above). Jacob Cohen (1988) is usually the source cited. These three sizes represent different degrees of overlap: 85% (small), 67% (medium), and 53% (large). You can see that these percentages of overlap relate to the idea that there is a lot of overlap when the effect size is small and much less when the effect size is large.
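The overlap percentages can be reproduced with a short calculation. The sketch below assumes they correspond to Cohen's U1 non-overlap measure for two normal populations with equal variances; that reading is an inference from the numbers, and the function name `percent_overlap` is hypothetical.

```python
from statistics import NormalDist


def percent_overlap(d):
    """Percent overlap of two equal-variance normal distributions d SDs apart.

    Assumption: overlap is defined as 100% minus Cohen's U1 (non-overlap).
    """
    u2 = NormalDist().cdf(abs(d) / 2)  # proportion of one group below the midpoint
    u1 = (2 * u2 - 1) / u2             # Cohen's U1: proportion of non-overlap
    return 100 * (1 - u1)


if __name__ == "__main__":
    for label, d in (("small", 0.20), ("medium", 0.50), ("large", 0.80)):
        print(label, round(percent_overlap(d)))
```

Rounded to whole numbers, this gives 85, 67, and 53 for the three benchmark values, matching the figures above.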
Without doing a power calculation, you can still get some sense of the sample size needed in your topic area (with implications for power) by reading the literature. More participants are needed if the effect size reported in your topic area is small. If no effect size is reported for your area of study, you could make a guess about whether you think it is likely to be small, medium, or large. Without information to the contrary, a conservative estimate (i.e., a small effect size) is probably prudent. Cohen (1988) has published tables that indicate how many participants you will need to detect a difference (i.e., to reject Ho, assuming it is false) for a specific effect size. Power is discussed further in Chapter 9, including the use of an online power calculator.
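For a rough sense of the numbers involved, the sketch below estimates how many participants per group are needed to detect a given effect size. It assumes a two-sided comparison of two equal-sized groups and uses a normal (z) approximation; it does not reproduce Cohen's tables, which should give slightly larger values for small samples. The function name `n_per_group` and the default power of .80 are assumptions for illustration.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(d, power=0.80, alpha=0.05):
    """Approximate participants needed per group to detect effect size d."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # two-sided critical value
    z_power = z.inv_cdf(power)          # quantile for the desired power
    return ceil(2 * ((z_alpha + z_power) / d) ** 2)


if __name__ == "__main__":
    for label, d in (("small", 0.20), ("medium", 0.50), ("large", 0.80)):
        print(label, n_per_group(d))
```

Under these assumptions, a small effect requires several hundred participants per group, while a large effect requires only a few dozen, which is why a conservative guess about effect size has such a strong influence on planning.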