Applied Univariate, Bivariate, and Multivariate Statistics - Daniel J. Denis
2.2.1 Power for Chi‐Square Test of Independence
We can estimate power and the required sample size for the chi‐square test of independence using the pwr package in R:
> library(pwr)
> pwr.chisq.test(w = , N = , df = , sig.level = , power = )
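where w is the anticipated or required effect size, estimated as:

$$ w = \sqrt{\sum_{i=1}^{m} \frac{(p_{0i} - p_{1i})^2}{p_{0i}}} $$

where m is the number of cells, and p0i and p1i are the probabilities in a given cell i under the null and alternative hypotheses, respectively. We demonstrate by estimating the required sample size for w = 0.2 with power set at 0.90: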
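As a minimal sketch of how w could be computed from cell probabilities, the base R snippet below uses a helper we call `cohen_w` (a hypothetical name, not part of pwr) and made-up probabilities for a 2 × 2 table flattened into four cells. The values are illustrative only and are not taken from the text.

```r
# Effect size w from cell probabilities under the null (p0) and alternative (p1).
# cohen_w is a hypothetical helper name used only for this illustration.
cohen_w <- function(p0, p1) {
  stopifnot(length(p0) == length(p1), all(p0 > 0))
  sqrt(sum((p0 - p1)^2 / p0))
}

# Assumed, illustrative probabilities for the four cells of a 2 x 2 table.
p0 <- c(0.25, 0.25, 0.25, 0.25)  # independence with equal margins
p1 <- c(0.30, 0.20, 0.20, 0.30)  # an assumed alternative

w <- cohen_w(p0, p1)
w
```

With these assumed probabilities, each cell contributes 0.05^2 / 0.25 = 0.01 to the sum, so w works out to about 0.2, the value used in the demonstration.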
> pwr.chisq.test(w = 0.2, N = , df = 5, sig.level = .05, power = 0.90)

     Chi squared power calculation

              w = 0.2
              N = 411.7366
             df = 5
      sig.level = 0.05
          power = 0.9

NOTE: N is the number of observations
Table 2.2 Contingency Table for 2 × 2 × 2 Design
Gender | Exposure | Condition Absent (0) | Condition Present (1) | Total |
---|---|---|---|---|
Males | Yes | 10 | 20 | 30 |
Males | No | 15 | 5 | 20 |
Females | Yes | 13 | 17 | 30 |
Females | No | 12 | 8 | 20 |
Total | | 50 | 50 | 100 |
R estimates that approximately 412 subjects in total (N = 411.74, rounded up) are required to achieve power of 0.90. Such a large sample is required because w = 0.2 constitutes a relatively small effect size: by Cohen's conventions, w = 0.1 is small, 0.3 is medium, and 0.5 is large (see Cohen (1988) for details).
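The result can be checked by hand in base R. The sketch below assumes the usual approach of treating the test statistic under the alternative as noncentral chi‐square with noncentrality parameter N × w²; `chisq_power` is a hypothetical helper name, and the search interval passed to `uniroot` is an arbitrary choice.

```r
# Power of the chi-square test, assuming a noncentral chi-square distribution
# under the alternative with noncentrality parameter N * w^2.
# chisq_power is a hypothetical helper name used only for this illustration.
chisq_power <- function(w, N, df, sig.level = 0.05) {
  crit <- qchisq(sig.level, df = df, lower.tail = FALSE)  # critical value under H0
  pchisq(crit, df = df, ncp = N * w^2, lower.tail = FALSE)
}

# Power at the N reported by pwr.chisq.test
power_at_N <- chisq_power(w = 0.2, N = 411.7366, df = 5)

# Solve for the N at which power reaches 0.90 (interval chosen arbitrarily)
n_required <- uniroot(
  function(N) chisq_power(w = 0.2, N = N, df = 5) - 0.90,
  interval = c(10, 5000)
)$root

power_at_N
ceiling(n_required)
```

Under these assumptions, the power at N = 411.7366 should come out at roughly 0.90, and rounding the solved N up to a whole subject gives 412.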
The reader may ask at this point how one might analyze data in higher‐dimensional frequency tables. The chi‐square example for the data in Table 2.1 applies only to a 2 × 2 layout. Suppose we added a third factor to our analysis, such as gender, so that our contingency table appears as in Table 2.2.
For higher‐dimensional frequency data such as those in Table 2.2, log‐linear models are a possibility (Agresti, 2002). Log‐linear models belong to the wider class of generalized linear models, discussed further in Chapter 10, where we cover in some detail a special case of the generalized linear model, the logistic regression model.
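As a rough sketch of what a log‐linear analysis of Table 2.2 might look like, the counts can be entered one row per cell and modeled with a Poisson generalized linear model in base R. The variable names (`gender`, `exposure`, `condition`, `count`) and the two models compared are our own choices for illustration, not an analysis given in the text.

```r
# Table 2.2 entered as one row per cell (variable names are our own).
tab <- data.frame(
  gender    = rep(c("Male", "Female"), each = 4),
  exposure  = rep(rep(c("Yes", "No"), each = 2), times = 2),
  condition = rep(c("Absent", "Present"), times = 4),
  count     = c(10, 20, 15, 5, 13, 17, 12, 8)
)

# Mutual independence: main effects only.
fit_indep <- glm(count ~ gender + exposure + condition,
                 family = poisson, data = tab)

# Adds an exposure-by-condition association, assumed equal across genders.
fit_assoc <- glm(count ~ gender + exposure * condition,
                 family = poisson, data = tab)

# Likelihood-ratio comparison of the two nested models.
anova(fit_indep, fit_assoc, test = "Chisq")
```

The comparison asks whether allowing exposure and condition to be associated improves fit over mutual independence. Other models, for instance ones with gender interactions, could be compared in the same way.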