
Sources of Type I Error and Remedies


Remember that in inferential statistics, we are estimating the likelihood, or probability, that the data represent the true situation, and we set that likelihood at a given level, called the alpha level. By convention, the alpha level is set at .05. What that means is that there are only 5 chances in 100 (5/100) that we are mistaken in saying that our results are significant when the null hypothesis is in fact true (see also Chapter 2 on this topic).
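
To make the 5-in-100 idea concrete, here is a minimal simulation sketch (not from the book; the sample size, seed, and choice of a one-sample t-test are illustrative): when the null hypothesis is true by construction, about 5% of tests still come out significant at alpha = .05, and those are exactly the Type I errors.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)  # seed chosen arbitrarily for reproducibility
alpha = 0.05
n_experiments = 10_000

false_positives = 0
for _ in range(n_experiments):
    # The population mean really is 0, so the null (mean = 0) is true.
    sample = rng.normal(loc=0.0, scale=1.0, size=30)
    result = stats.ttest_1samp(sample, popmean=0.0)
    if result.pvalue < alpha:
        false_positives += 1

# Prints a proportion close to 0.05: the Type I error rate.
print(f"Significant under a true null: {false_positives / n_experiments:.3f}")
```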

Standard approaches to reduce the likelihood of a Type I error are as follows: adjusting the alpha level; using a two-tailed rather than a one-tailed test; and using a Bonferroni adjustment for multiple analyses (dividing the alpha level by the number of statistical tests you are performing). In theory, you could set your alpha level at a more stringent level (e.g., .01) to avoid a Type I error, but most researchers do not, fearing that the stricter criterion will produce a Type II error (failing to detect an effect that is really there).
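
To see what that trade-off looks like numerically, here is a small sketch (not from the book, and assuming a two-tailed z-test for simplicity): tightening alpha pushes the critical value outward, so fewer true-null results slip through, but real effects of modest size also fall short of the cutoff more often.

```python
from scipy import stats

# Critical |z| for a two-tailed test at each alpha: the value the test
# statistic must exceed before we call a result significant.
for alpha in (0.05, 0.01):
    critical_z = stats.norm.ppf(1 - alpha / 2)  # alpha/2 in each tail
    print(f"alpha = {alpha:.2f} -> |z| must exceed {critical_z:.3f}")

# alpha = 0.05 -> |z| must exceed 1.960
# alpha = 0.01 -> |z| must exceed 2.576
```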

A second approach is using a two-tailed rather than a one-tailed significance test. Please note that the decision to use a one- versus a two-tailed test is made prior to conducting analyses (and is typically indicated in the hypotheses). The difference between the two concerns how your alpha is distributed. In a two-tailed test, the alpha level (.05) is divided in two (.025), meaning that each tail of the test statistic's distribution contains .025 of your alpha. The two-tailed test is therefore more stringent: with only .025 in each tail, a result must be more extreme to reach significance than it would be with the full .05 concentrated in a single tail. A one-tailed test is not adopted in practice unless your hypothesis is stated as unidirectional rather than bidirectional. Again, that decision has to be made prior to conducting analyses.
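
A small sketch of that point (again assuming a z-test; the cutoffs are simply standard normal quantiles): with alpha split across two tails, the bar in any one direction is higher than with a one-tailed test.

```python
from scipy import stats

alpha = 0.05

# One-tailed: the entire .05 sits in a single tail.
one_tailed_cutoff = stats.norm.ppf(1 - alpha)      # ~1.645

# Two-tailed: .05 is split, .025 per tail, so the cutoff is farther out.
two_tailed_cutoff = stats.norm.ppf(1 - alpha / 2)  # ~1.960

print(f"One-tailed critical z: {one_tailed_cutoff:.3f}")
print(f"Two-tailed critical z: {two_tailed_cutoff:.3f}")
```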

Another way in which the likelihood of a Type I error increases is through the use of multiple statistical tests with the same data. This situation may happen in research because there are not enough participants with specific demographic characteristics to run a single analysis. Here’s an example. Suppose you want to examine issues involving number of majors (e.g., for students who identified themselves as having one major, two majors, or three majors), class year (first vs. fourth year), and athlete status (varsity athlete [A] or not [NA]), where the dependent variables of interest are GPA and career indecision (see Figure 3.6).

What this table shows is that we have 12 cells (look at the bottom row) to fill with participants who have the appropriate characteristics. A minimum number of participants per cell might be 15 individuals. But we don’t need just any 15 individuals; each cell must be filled with 15 who have the required characteristics for that cell. For Cell 1, we need 15 students who identify as having one major, are in their first year, and are varsity athletes. Cell 2 requires 15 students who identify as having one major, are in their first year, and are not varsity athletes. You can see the difficulty in filling all of the cells with students who have the sought-after characteristics. A single analysis might not be possible.

In the full analysis, testing all variables at once, we would have 2 (class year) × 2 (athlete status) × 3 (number of majors) = 12 cells. Crossing only two of the variables at a time yields fewer cells. We might ignore athlete status in one analysis and look at class year and number of majors, which is six cells (against the two DVs, GPA and career indecision). In a second analysis, we might look at number of majors (three) and athlete status (two) against the DVs (six cells again). In a third analysis, we would look at athlete status (two) and class year (two), which has four cells. If we did all of these, we would have run three analyses instead of one, and the likelihood that a Type I error will occur has increased because of these multiple tests.

For that reason, many researchers recommend using a Bonferroni adjustment, which resets the alpha level (more stringently). To use a Bonferroni adjustment, you divide the conventional alpha level (.05) by the number of tests you have run (here, 3) to produce a new alpha level (here, .017). Now, to consider a finding significant, the result would have to meet the new (more stringent) alpha level, as the sketch below illustrates. (For those who want to read in more detail about this issue, articles by Banerjee et al. [2009] and by Bender and Lange [2001] may be helpful.)
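
Here is a short sketch of the arithmetic in this example (the variable levels, the 15-per-cell minimum, and the count of three analyses come from the text; the code itself is illustrative): it enumerates the cells of the full design and computes the Bonferroni-adjusted alpha.

```python
from itertools import product

# The three independent variables from the example.
majors = ("one major", "two majors", "three majors")
class_years = ("first year", "fourth year")
athlete_status = ("athlete", "non-athlete")

# Full design: every combination of levels is one cell.
cells = list(product(majors, class_years, athlete_status))
print(f"Cells in the full design: {len(cells)}")                 # 3 x 2 x 2 = 12
print(f"Participants needed at 15 per cell: {len(cells) * 15}")  # 180

# Bonferroni adjustment for the three separate analyses.
alpha = 0.05
n_tests = 3
print(f"Bonferroni-adjusted alpha: {alpha / n_tests:.3f}")       # .017
```

For real analyses, packages such as statsmodels implement the same correction (among others) via statsmodels.stats.multitest.multipletests with method="bonferroni", which takes the list of p-values and returns which ones survive the adjusted criterion.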


Figure 3.6 Example of Using Multiple Statistical Tests With the Same Data

