Applied Univariate, Bivariate, and Multivariate Statistics, by Daniel J. Denis
2.20.1 t‐Tests for One Sample
When we perform hypothesis testing using the z distribution, we assume we have knowledge of the population variance σ². Having direct knowledge of σ² is the ideal circumstance. When we know σ², we can compute the standard error of the mean directly as
$$\sigma_M = \sqrt{\frac{\sigma^2}{n}} = \frac{\sigma}{\sqrt{n}}$$
Figure 2.11 Student's t versus normal densities for 3 (left), 10 (middle), and 50 (right) degrees of freedom. As degrees of freedom increase, the limiting form of the t distribution is the z distribution.
Recall that the form of the one‐sample z test for the mean is given by
$$z_M = \frac{\bar{y} - \mu_0}{\sigma / \sqrt{n}}$$
where the numerator represents the distance between the sample mean and the population mean μ0 under the null hypothesis, and the denominator is the standard error of the mean.
In most research contexts, from simple to complex, we usually do not have direct knowledge of σ². When we do not have knowledge of it, we use the next best thing, an estimate of it. We can obtain an unbiased estimate of σ² by computing s² on our sample. When we do so, however, and use s² in place of σ², we can no longer pretend to “know” the standard error of the mean. Rather, we must concede that all we are able to do is estimate it. Our estimate of the standard error of the mean is thus given by:
$$\hat{\sigma}_M = \sqrt{\frac{s^2}{n}} = \frac{s}{\sqrt{n}}$$
When we use s² (where $s^2 = \sum_{i=1}^{n} (y_i - \bar{y})^2 / (n - 1)$) in place of σ², our resulting statistic is no longer a z statistic. That is, the ensuing statistic is no longer distributed as a standard normal variable (i.e., z). If it is not distributed as z, then what is it distributed as? Thanks to William Sealy Gosset, who worked for Guinness Breweries and in 1908 published under the pseudonym “Student” (Zabell, 2008), the ratio
$$t = \frac{\bar{y} - \mu_0}{s / \sqrt{n}}$$
was found to be distributed as a t statistic on n − 1 degrees of freedom. Again, the t distribution is most useful when sample sizes are rather small. For larger samples, as mentioned, the t distribution converges to the z distribution. If you are using rather large samples, say approximately 100 or more, whether you evaluate your null hypothesis using a z or t distribution will not matter much, because the critical values for z and t for such degrees of freedom (99 for the one‐sample case) will be so alike that, practically at least, the two test statistics can be considered more or less equal. For even larger samples, the agreement is closer still.
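The closeness of the z and t critical values can be checked numerically. The following is a minimal sketch in R using the base functions qt() and qnorm(); the particular degrees of freedom chosen are illustrative assumptions, not values taken from the text (apart from 4 and 99, which correspond to n = 5 and n = 100 in the one‐sample case).

```r
# Two-tailed 5% critical values: t on n - 1 degrees of freedom versus z.
df <- c(4, 9, 29, 99, 999)     # illustrative degrees of freedom (assumed values)
t_crit <- qt(0.975, df)        # upper 2.5% point of each t distribution
z_crit <- qnorm(0.975)         # upper 2.5% point of z (about 1.96)

comparison <- data.frame(
  df     = df,
  t_crit = round(t_crit, 3),
  z_crit = round(z_crit, 3),
  excess = round(t_crit - z_crit, 3)  # how far t still exceeds z
)
print(comparison)
```

The excess column shrinks toward zero as the degrees of freedom grow, which is the practical sense in which the two tests become interchangeable for large samples.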
The convergence between z and t can be illustrated by inspecting the variance of the t distribution. Unlike the z distribution, whose variance is fixed at 1.0, the variance of the t distribution is defined as:
$$\sigma_t^2 = \frac{v}{v - 2}$$
where v is the number of degrees of freedom (the expression holds for v > 2). For small degrees of freedom, such as v = 5, the variance of the t distribution is equal to:
$$\sigma_t^2 = \frac{5}{5 - 2} = \frac{5}{3} \approx 1.67$$
Note what happens as v increases: the ratio gets closer and closer to 1.0, which is the variance of the z distribution. For example, v = 20 yields:
$$\sigma_t^2 = \frac{20}{20 - 2} = \frac{20}{18} \approx 1.11$$
which is already quite close to the variance of a standardized normal variable z (i.e., 1.0). Hence, we can say more formally
$$\lim_{v \to \infty} \frac{v}{v - 2} = 1$$
That is, as v increases without bound, the variance of the t distribution approaches that of the z distribution, which is equal to 1.0.
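The same arithmetic can be tabulated for several degrees of freedom. This is a small sketch in R; the helper name t_variance is hypothetical (introduced here only for illustration), and the values 100 and 1000 are assumed examples beyond the two worked in the text.

```r
# Variance of the t distribution as a function of degrees of freedom v.
# t_variance is a hypothetical helper name used only for this illustration.
t_variance <- function(v) {
  stopifnot(all(v > 2))  # v / (v - 2) is a finite variance only for v > 2
  v / (v - 2)
}

v <- c(5, 20, 100, 1000)       # 5 and 20 are from the text; the rest are assumed
round(t_variance(v), 3)        # 1.667 1.111 1.020 1.002
```

The first two results reproduce the values for v = 5 and v = 20, and the remaining ones show the ratio settling toward 1.0.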
We demonstrate the use of the one‐sample t‐test using SPSS. Consider the following small, hypothetical data set of IQ scores for five individuals:
IQ 105 98 110 105 95
Suppose that the hypothesized mean IQ in the population is equal to 100. The question we want to ask is: Is it reasonable to assume that our sampled data could have arisen from a population with mean IQ equal to 100? We assume we have no knowledge of the population standard deviation, and hence must estimate it from our sample data. To perform the one‐sample t‐test in SPSS, we run:
T-TEST
  /TESTVAL=100
  /MISSING=ANALYSIS
  /VARIABLES=IQ
  /CRITERIA=CI(.95).
The line /TESTVAL=100 inputs the test value for our hypothesis test, which for our null hypothesis is equal to 100. We have also requested a 95% confidence interval for the mean difference.
One‐Sample Statistics
| | N | Mean | SD | SE Mean |
|---|---|---|---|---|
| IQ | 5 | 102.6000 | 6.02495 | 2.69444 |
We confirm from the above that the size of our sample is equal to 5, and the mean IQ for our sample is equal to 102.60 with standard deviation 6.02. The standard error of the mean reported by SPSS, 2.69, is not the true standard error of the mean. It is the estimated standard error of the mean, since we did not have knowledge of the population variance (otherwise we would have been performing a z‐test instead of a t‐test).
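The descriptive values in this table can be reproduced by hand from the five scores. The following is a minimal sketch in R; the variable names (s2, s, se_hat) are chosen here for illustration and are not part of the SPSS output.

```r
# Reproduce the One-Sample Statistics values from the raw IQ scores.
iq <- c(105, 98, 110, 105, 95)
n  <- length(iq)

s2     <- sum((iq - mean(iq))^2) / (n - 1)  # unbiased variance estimate; equals var(iq)
s      <- sqrt(s2)                          # sample standard deviation
se_hat <- s / sqrt(n)                       # estimated standard error of the mean

round(c(mean = mean(iq), sd = s, se = se_hat), 5)
```

The sum of squared deviations is 145.2, so s² = 145.2 / 4 = 36.3, which gives the standard deviation of 6.02495 and the estimated standard error of 2.69444 shown in the table.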
One‐Sample Test (Test Value = 100)
| | t | df | Sig. (2‐tailed) | Mean Difference | 95% Confidence Interval of the Difference, Lower | 95% Confidence Interval of the Difference, Upper |
|---|---|---|---|---|---|---|
| IQ | 0.965 | 4 | 0.389 | 2.60000 | −4.8810 | 10.0810 |
We note from the above output:
Our obtained t‐statistic is equal to 0.965 and is evaluated on four degrees of freedom (i.e., n − 1 = 5 − 1 = 4). We lose a degree of freedom because, in estimating the population variance σ² with s², we had to compute a sample mean, and this value is regarded as “fixed” as we carry on with our t‐test.
The two‐tailed p‐value is equal to 0.389, which, assuming we had set our criterion for rejection at α = 0.05, leads us to the decision to not reject the null hypothesis. The two‐tailed (as opposed to one‐tailed or directional) nature of the statistical test in this example means that we allow for a rejection of the null hypothesis in either direction from the value stated under the null. Since our null hypothesis is μ0 = 100, we were prepared to reject the null hypothesis for observed values of the sample mean that deviate “significantly” from 100 in either direction, greater or less. Since our significance level was set at 0.05, we have 0.05/2 = 0.025 area in each tail of the t distribution to specify as our rejection region for the test. The question we are asking of our sample mean is: What is the probability of observing a sample mean that falls much greater OR much less than 100? Because the observed sample mean can only fall in one tail or the other on any single trial (i.e., we are conducting a single “trial” when we run this experiment a single time), these two events are mutually exclusive, and so, by the addition rule for mutually exclusive events, we can add their probabilities. When we do so, we get 0.025 + 0.025 = 0.05, which, of course, is our significance level for the test.
The actual mean difference observed is equal to 2.60, which was computed by taking the mean of our sample, 102.6, and subtracting the mean hypothesized under the null hypothesis, 100 (i.e., 102.6 – 100 = 2.60).
The 95% confidence interval of the difference is interpreted to mean that with 95% confidence, the interval with lower bound −4.8810 and upper bound 10.0810 will capture the true parameter, which in this case is the population mean difference. We can see that 0 lies within the limits of the confidence interval, which again confirms why we were unable to reject the null hypothesis at the 0.05 level of significance. Had zero lain outside of the confidence interval limits, this would have been grounds to reject the null at a significance level of 0.05 (and consequently, we would have also obtained a p‐value of less than 0.05 for our significance test). Recall that the true mean (i.e., parameter) is not the random component. Rather, the sample is the random component, on which the interval is then computed. It is important to emphasize this distinction when interpreting the confidence interval.
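Each quantity discussed in these notes can be recomputed step by step from the raw scores. The following is a sketch in R under the same setup (null value of 100, α = 0.05, two‐tailed); the variable names are illustrative choices, and the code relies only on the base functions sd(), pt(), and qt().

```r
# Recompute the One-Sample Test quantities by hand.
iq  <- c(105, 98, 110, 105, 95)
mu0 <- 100                       # mean hypothesized under the null
n   <- length(iq)
df  <- n - 1                     # degrees of freedom

se_hat    <- sd(iq) / sqrt(n)    # estimated standard error of the mean
mean_diff <- mean(iq) - mu0      # observed mean difference
t_obs     <- mean_diff / se_hat  # obtained t statistic

# Two-tailed p-value: the area beyond |t_obs| in one tail, doubled.
p_two <- 2 * pt(abs(t_obs), df, lower.tail = FALSE)

# Critical value leaving 0.025 in each tail, and the 95% interval
# for the mean difference.
t_crit  <- qt(0.975, df)
ci_diff <- mean_diff + c(-1, 1) * t_crit * se_hat

round(c(t = t_obs, df = df, p = p_two,
        diff = mean_diff, lower = ci_diff[1], upper = ci_diff[2]), 4)
abs(t_obs) > t_crit              # FALSE: do not reject the null at 0.05
```

Because the obtained t of about 0.965 is well below the critical value of about 2.776 on four degrees of freedom, the interval for the mean difference contains zero, which is the same decision reached from the p‐value.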
We can easily generate the same t‐test in R. We first generate the vector of data then carry on with the one‐sample t‐test, which we notice mirrors the findings obtained in SPSS:
> iq <- c(105, 98, 110, 105, 95)
> t.test(iq, mu = 100)

        One Sample t-test

data:  iq
t = 0.965, df = 4, p-value = 0.3892
alternative hypothesis: true mean is not equal to 100
95 percent confidence interval:
  95.11904 110.08096
sample estimates:
mean of x
    102.6