Applied Univariate, Bivariate, and Multivariate Statistics - Daniel J. Denis

2.13 CONFIDENCE INTERVALS


Recall that a goal of statistical inference is to estimate functions of parameters, whether a single parameter, a difference of parameters (for instance, in the case of population differences), or some other function of parameters. Though the sample mean x̄ is an unbiased estimator of μ, the probability that x̄ is exactly equal to μ in any given sample, for a continuous measure, converges to zero (Hays, 1994). For this reason, and to build some flexibility into estimation overall, the idea of interval estimation in the form of confidence intervals was developed. Confidence intervals provide a range of values within which we can be relatively confident the true parameter we are seeking to estimate lies. In what follows, we provide a brief review of 95% and 99% confidence intervals.

We can say that over all samples of a given size n, the probability is 0.95 for the following event to occur:

x̄ − 1.96(σ/√n) ≤ μ ≤ x̄ + 1.96(σ/√n)   (2.2)

How was (2.2) obtained? Recall the calculation of a z‐score for a mean:

z = (x̄ − μ)/(σ/√n)
Suppose now that we want an area of 0.025 in each tail of the normal distribution. This corresponds to a z‐score of 1.96, since the probability of a z‐score more extreme than ±1.96 is 2(1 − 0.9750021) = 0.0499958, which is approximately 5% of the total curve. So, from the z‐score, we have
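The tail-area arithmetic above can be checked directly (a minimal sketch using Python's standard library; `statistics.NormalDist` supplies the standard normal CDF):

```python
from statistics import NormalDist

# Area under the standard normal curve to the left of z = 1.96
phi = NormalDist().cdf(1.96)   # ≈ 0.9750021
two_tails = 2 * (1 - phi)      # combined area in the two tails
print(phi, two_tails)          # ≈ 0.9750021, 0.0499958
```

The two-tail area comes out to roughly 0.05, which is why ±1.96 brackets 95% of the curve.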


We can rearrange the inequality slightly to get the following:

p[x̄ − 1.96(σ/√n) ≤ μ ≤ x̄ + 1.96(σ/√n)] = 0.95   (2.3)

We interpret (2.3) as follows:

Over all possible samples, the probability is 0.95 that the range between x̄ − 1.96(σ/√n) and x̄ + 1.96(σ/√n) will include the true mean, μ.

It is very important to note regarding the above statement that μ is not the random variable. What is random is the sample on which x̄, and hence the interval, is computed. That is, the probability statement is not about μ but rather about samples. The population mean μ is assumed to be fixed. The 95% confidence interval tells us that if we continued to sample repeatedly, and on each sample computed a confidence interval, then 95% of these intervals would include the true parameter.
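The repeated-sampling interpretation can be illustrated with a short simulation (a sketch; the population values μ = 100, σ = 15, and n = 25 are hypothetical, and σ is assumed known):

```python
import math
import random

random.seed(1)
mu, sigma, n = 100.0, 15.0, 25      # hypothetical population parameters
reps = 10_000
se = sigma / math.sqrt(n)           # standard error of the mean

covered = 0
for _ in range(reps):
    sample = [random.gauss(mu, sigma) for _ in range(n)]
    xbar = sum(sample) / n
    # 95% interval from (2.3), computed on this sample
    if xbar - 1.96 * se <= mu <= xbar + 1.96 * se:
        covered += 1

print(covered / reps)               # close to 0.95
```

Each iteration draws a fresh sample and asks whether that sample's interval captured the fixed μ; the long-run proportion of captures is what the 95% refers to.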

The 99% confidence interval for the mean is likewise given by:

p[x̄ − 2.58(σ/√n) ≤ μ ≤ x̄ + 2.58(σ/√n)] = 0.99   (2.4)

Notice that the only difference between (2.3) and (2.4) is the choice of different critical values on either side of μ (i.e., 1.96 for the 95% interval and 2.58 for the 99% interval).
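Both intervals can be computed for a single sample (a sketch; the data values are hypothetical and σ is assumed known; `inv_cdf` returns the exact critical values, of which 1.96 and 2.58 are the usual rounded figures):

```python
import math
from statistics import NormalDist, mean

data = [98, 104, 91, 110, 102, 95, 107, 99, 101, 93]  # hypothetical scores
sigma = 15.0                        # population sd, assumed known
se = sigma / math.sqrt(len(data))
xbar = mean(data)

for conf in (0.95, 0.99):
    # two-tailed critical value: 1.9600... for 95%, 2.5758... for 99%
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    print(f"{conf:.0%} CI: ({xbar - z * se:.2f}, {xbar + z * se:.2f})")
```

The only moving part between the two intervals is the critical value z, so the 99% interval is simply a wider band around the same x̄.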

Though of course not very useful, a 100% confidence interval, if constructed, would be defined as:

p[−∞ ≤ μ ≤ +∞] = 1.00
If you think about it carefully, the 100% confidence interval should make perfect sense. If you would like to be 100% "sure" that the interval covers the true population mean, then you have to extend your limits to negative and positive infinity; otherwise, you could not be fully confident. At the other extreme, a 0% interval would simply have the sample mean x̄ as both its lower and upper limit:

x̄ ≤ μ ≤ x̄
That is, if you want to have zero confidence in guessing the location of the population mean, μ, then guess the sample mean x̄. Though the sample mean is an unbiased estimator of the population mean, the probability that the sample mean covers the population mean exactly, as mentioned, essentially converges to 0 for a truly continuous distribution (Hays, 1994). As an analogy, imagine coming home and hugging your spouse. If your arms are open infinitely wide (a full "bear hug"), you are certain to entrap him or her in your hug, because your arms (the limits of the interval) extend to negative and positive infinity. If you bring your arms in a little, it becomes possible to miss him or her with the hug (e.g., a 95% interval), but the hug is more precise, because your arms are closing inward rather than extending infinitely on both sides. If you approach your spouse with your hands together (i.e., a point estimate), you are sure to miss, and would have 0% confidence of your interval (hug) entrapping your spouse. An inexact analogy, to be sure, but useful for visualizing the concept of a confidence interval.

