Applied Univariate, Bivariate, and Multivariate Statistics, Daniel J. Denis, Page 50
2.12 CENTRAL LIMIT THEOREM
It is not an exaggeration to say that the central limit theorem, in one form or another, is probably the most important and relevant theorem in theoretical statistics, and hence it is quite relevant to applied statistics as well.
We borrow our definition of the central limit theorem from Everitt (2002):
If a random variable y has a population mean μ and population variance σ², then the sample mean, ȳ, based on n observations, has an approximate normal distribution with mean μ and variance σ²/n, for sufficiently large n. (p. 64)
That is, asymptotically, the distribution of the sample mean converges to a normal distribution as n → ∞. A multivariate version of the theorem can also be given (e.g., see Rencher, 1998, p. 53).7
The relevance and importance of the central limit theorem cannot be overstated: it allows one to know, at least on a theoretical level, what the distribution of a statistic (e.g., the sample mean) will look like for increasing sample size. This is especially important if one is drawing samples from a population whose shape is not known, or is known a priori to be nonnormal. Normality of the sampling distribution, for adequate sample size, is still assured even if samples are drawn from nonnormal populations. Why is this relevant? Because if we know what the distribution of means will look like for increasing sample size, then we can compare our obtained statistic to a normal distribution in order to estimate its probability of occurrence. Normality assumptions are also typically required for assuming independence between ȳ and s² in univariate contexts (Lukacs, 1942), and between ȳ (mean vector) and S (covariance matrix) in multivariate ones. When such estimators can be assumed to arise from normal or multivariate normal distributions (i.e., in the case of ȳ and S), we can generally be assured that one is independent of the other.
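The behavior the theorem describes is easy to see by simulation. The following sketch (an illustration of the theorem, not an example from the book) draws repeated samples from a deliberately nonnormal population, an exponential distribution with rate 1, for which μ = 1 and σ² = 1, and checks that the sampling distribution of the mean has mean near μ and variance near σ²/n as n grows. The function name and the choice of population are ours, for illustration only.

```python
import random
import statistics

random.seed(42)

def sampling_distribution_of_mean(n, reps=5000):
    """Return `reps` sample means, each computed from n draws
    taken from an exponential(rate=1) population (mu = 1, sigma^2 = 1)."""
    return [statistics.fmean(random.expovariate(1.0) for _ in range(n))
            for _ in range(reps)]

mu, sigma2 = 1.0, 1.0  # population mean and variance of exponential(1)

for n in (5, 50, 500):
    means = sampling_distribution_of_mean(n)
    # The empirical mean of the sample means should approach mu, and
    # their empirical variance should approach sigma^2 / n.
    print(f"n={n:4d}  mean of means={statistics.fmean(means):.3f}  "
          f"variance of means={statistics.variance(means):.5f}  "
          f"(theory: {sigma2 / n:.5f})")
```

Even though the exponential population is strongly skewed, a histogram of the simulated means looks increasingly bell-shaped as n increases, which is exactly the guarantee the theorem provides.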