Biostatistics Decoded - A. Gouveia Oliveira - Page 27

1.19 The Value of the Standard Error


Let us continue to view, as in Section 1.17, the sample mean as a random variable that results from the sum of identically distributed independent variables. The mean and variance of each of these identical variables are, of course, the same as the population mean and variance, respectively μ and σ².

When we compute sample means, we sum all observations and divide the result by the sample size. This is exactly the same as if, before we summed all the observations, we divided each one by the sample size. If we represent the sample mean by m, each observation by x, and the sample size by n, what was just said can be represented by

m = (x₁ + x₂ + ⋯ + xₙ)/n = x₁/n + x₂/n + ⋯ + xₙ/n

This is the same as if every one of the identical variables were divided by a constant amount equal to the sample size. From the properties of means, we know that if we divide a variable by a constant, its mean will be divided by the same constant. Therefore, the mean of each xᵢ/n is equal to the population mean divided by n, that is, μ/n.
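As a quick numerical sketch of this property (not from the book; the values μ = 10, σ = 2, and n = 25 are arbitrary choices for illustration), dividing every observation of a variable by a constant divides its mean by that same constant:

```python
import numpy as np

rng = np.random.default_rng(0)
mu, sigma, n = 10.0, 2.0, 25

# A large set of observations from a population with mean mu.
x = rng.normal(loc=mu, scale=sigma, size=1_000_000)

# Dividing the variable by the constant n divides its mean by n:
mean_x = x.mean()            # approximately mu = 10
mean_x_over_n = (x / n).mean()  # approximately mu / n = 0.4
```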

Now, from the properties of means we know that if we add independent variables, the mean of the resulting variable will be the sum of the means of the independent variables. Sample means result from adding together n variables, each one having a mean equal to μ/n. Therefore, the mean of the resulting variable will be n × μ/n = μ, the population mean. The conclusion, therefore, is that the distribution of sample means m has a mean equal to the population mean μ.
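This conclusion is easy to check by simulation. The following sketch (illustrative only; μ = 10, σ = 2, and n = 25 are assumed values) draws many samples of size n, computes each sample's mean, and verifies that the average of those sample means is close to the population mean μ:

```python
import numpy as np

rng = np.random.default_rng(42)
mu, sigma, n = 10.0, 2.0, 25
n_samples = 100_000

# Each row is one sample of size n; take the mean of each row.
samples = rng.normal(mu, sigma, size=(n_samples, n))
sample_means = samples.mean(axis=1)

# The mean of the distribution of sample means is the population mean.
grand_mean = sample_means.mean()  # approximately mu = 10
```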

A similar reasoning may be used to find the value of the variance of the sample means. We saw above that, to obtain a sample mean, we divide every single identical variable x by a constant, the sample size n. Therefore, according to the properties of variances, the variance of each identical variable xᵢ/n will be equal to the population variance σ² divided by the square of the sample size, that is, σ²/n². Sample means result from adding together all of the xᵢ/n terms. Consequently, the variance of the sample mean is equal to the sum of the variances of all these terms, that is, equal to n times the population variance divided by the square of the sample size:

var(m) = σ²/n² + σ²/n² + ⋯ + σ²/n² = n × σ²/n²

This is equivalent to σ²/n, that is, the variance of the sample means is equal to the population variance divided by the sample size. Therefore, the standard deviation of the sample means (the standard error of the mean) equals the population standard deviation divided by the square root of the sample size, σ/√n.
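A simulation can confirm the σ/√n result as well. In this sketch (illustrative values again: μ = 10, σ = 2, n = 25), the empirical standard deviation of many sample means is compared with σ/√n = 2/√25 = 0.4:

```python
import numpy as np

rng = np.random.default_rng(1)
mu, sigma, n = 10.0, 2.0, 25
n_samples = 200_000

# Means of many independent samples of size n.
sample_means = rng.normal(mu, sigma, size=(n_samples, n)).mean(axis=1)

empirical_se = sample_means.std()       # approximately 0.4
theoretical_se = sigma / np.sqrt(n)     # exactly 0.4
```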

One must not forget that these properties of means and variances only apply in the case of independent variables. Therefore, the results presented above will also only be valid if the sample consists of mutually independent observations. On the other hand, these results have nothing to do with the central limit theorem and, therefore, there are no restrictions related to the normality of the distribution or to the sample size. Actually, whatever the distribution of the attribute and the sample size might be, the mean of the sample means will always be the same as the population mean, and the standard error will always be the same as the population standard deviation divided by the square root of the sample size, provided that the observations are independent. The problem is that, in the case of small samples from an attribute with unknown distribution, we cannot assume that the sample means will have a normal distribution. Therefore, knowledge of the mean and of the standard error will not be sufficient to completely characterize the distribution of sample means.
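The two halves of this point can both be seen in a simulation. The sketch below (not from the book; the exponential distribution with mean 1, so μ = σ = 1, and the small sample size n = 5 are assumed choices) shows that the standard error formula σ/√n still holds for small samples from a strongly skewed distribution, while the distribution of the sample means itself remains visibly skewed, i.e. non-normal:

```python
import numpy as np

rng = np.random.default_rng(7)
n, n_samples = 5, 200_000

# Exponential distribution with mean 1: mu = 1, sigma = 1, strongly skewed.
sample_means = rng.exponential(scale=1.0, size=(n_samples, n)).mean(axis=1)

# The mean and standard error formulas hold regardless of the distribution:
mean_of_means = sample_means.mean()   # approximately mu = 1
empirical_se = sample_means.std()     # approximately 1 / sqrt(5) ~ 0.447

# ...but with small n the sample means are still skewed, not normal:
centered = sample_means - mean_of_means
skewness = (centered**3).mean() / empirical_se**3  # clearly positive
```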

