Applied Univariate, Bivariate, and Multivariate Statistics - Daniel J. Denis

2.10 SKEWNESS AND KURTOSIS


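Skewness, formally the third standardized moment of a distribution, refers to the extent to which the distribution of a random variable lacks symmetry. For a random variable X with mean μ and standard deviation σ, skewness is defined as:

\[ \gamma_1 = E\left[\left(\frac{X - \mu}{\sigma}\right)^{3}\right] = \frac{E\left[(X - \mu)^{3}\right]}{\sigma^{3}} \]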

 Skewness for a normal distribution is equal to 0, and skewness for a rectangular (uniform) distribution is also equal to 0; a bell‐shaped curve is not required for skewness to equal 0

 Skewness for a positively skewed distribution is greater than 0; these distributions have a longer tail stretching toward the larger values on the abscissa

 Skewness for a negatively skewed distribution is less than 0; these distributions have a longer tail stretching toward the smaller values on the abscissa
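The three cases in the list above can be checked by simulation. The following is a minimal sketch in base R, assuming the simple moment estimator m3 / m2^(3/2); the helper name skewness_manual and the simulated samples are illustrative and do not come from the text.

```r
# Hypothetical helper: moment-based sample skewness, m3 / m2^(3/2),
# where m2 and m3 are the second and third central sample moments.
skewness_manual <- function(x) {
  d <- x - mean(x)
  mean(d^3) / mean(d^2)^1.5
}

set.seed(1)            # make the simulation reproducible
n <- 1e5

z     <- rnorm(n)      # normal: symmetric and bell-shaped
u     <- runif(n)      # rectangular (uniform): symmetric, not bell-shaped
right <- rexp(n)       # exponential: long right tail
left  <- -rexp(n)      # negated exponential: long left tail

round(c(normal   = skewness_manual(z),
        uniform  = skewness_manual(u),
        positive = skewness_manual(right),
        negative = skewness_manual(left)), 2)
```

With samples this large, the first two estimates should sit close to 0, while the last two should be near the theoretical exponential values of 2 and -2.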

An example of a positively skewed distribution is that of the typical F density, given in Figure 2.10.

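Kurtosis, formally the fourth standardized moment of a distribution, generally refers to the peakedness of a distribution (Upton and Cook, 2002), but also has much to do with a distribution's tails (DeCarlo, 1997). For a random variable X with mean μ and standard deviation σ:

\[ \beta_2 = E\left[\left(\frac{X - \mu}{\sigma}\right)^{4}\right] = \frac{E\left[(X - \mu)^{4}\right]}{\sigma^{4}} \]

A normal distribution has β2 = 3, so many software packages report excess kurtosis, β2 − 3, which equals 0 for a normal distribution.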


Figure 2.10 F distribution on 2 and 5 degrees of freedom. It is positively skewed since the tail stretches out to numbers of greater value.
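A curve of the kind described in the caption for Figure 2.10 can be drawn with base R's df() density function. The sketch below is illustrative only; the plotting range, labels, and the simulated sample f_sample are assumptions rather than the settings used for the original figure.

```r
# Density of the F distribution on 2 and 5 degrees of freedom
curve(df(x, df1 = 2, df2 = 5), from = 0, to = 6,
      xlab = "F", ylab = "Density",
      main = "F distribution on 2 and 5 df")

# A simulated sample shows the same asymmetry numerically:
# the long right tail pulls the mean above the median.
set.seed(3)
f_sample <- rf(1e5, df1 = 2, df2 = 5)
c(mean = mean(f_sample), median = median(f_sample))
```

A mean that exceeds the median is a quick informal sign of positive skew, consistent with the tail stretching toward larger values.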

With regard to kurtosis, distributions are classified as:

 mesokurtic if the distribution exhibits kurtosis typical of a bell‐shaped normal curve

 platykurtic if the distribution exhibits lighter tails and is flatter toward the center than a normal distribution

 leptokurtic if the distribution exhibits heavier tails and is generally narrower in the center than a normal distribution, revealing that it is somewhat “peaked”
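The three categories above can be illustrated by simulation. This is a minimal sketch in base R, assuming the moment estimator of excess kurtosis, m4 / m2^2 - 3, which is scaled so that a normal distribution scores about 0; the helper name excess_kurtosis_manual and the choice of example distributions are illustrative and not taken from the text.

```r
# Hypothetical helper: moment-based excess kurtosis, m4 / m2^2 - 3,
# where m2 and m4 are the second and fourth central sample moments.
excess_kurtosis_manual <- function(x) {
  d <- x - mean(x)
  mean(d^4) / mean(d^2)^2 - 3
}

set.seed(2)                   # make the simulation reproducible
n <- 1e5

meso  <- rnorm(n)             # normal: the mesokurtic reference
platy <- runif(n)             # uniform: lighter tails, flatter center
lepto <- rexp(n) - rexp(n)    # Laplace (difference of two exponentials):
                              # heavier tails, sharper center

round(c(mesokurtic  = excess_kurtosis_manual(meso),
        platykurtic = excess_kurtosis_manual(platy),
        leptokurtic = excess_kurtosis_manual(lepto)), 2)
```

The estimates should fall near the theoretical excess kurtosis values of 0 (normal), -1.2 (uniform), and 3 (Laplace), so a negative value marks a platykurtic shape and a positive value a leptokurtic one.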

We can easily compute moments of empirical distributions in R or SPSS. Several packages in R are available for this purpose. We could compute skewness for parent on Galton's data by:

> library(psych)
> skew(parent)
[1] -0.03503614

The psych package (Revelle, 2015) also provides a range of descriptive statistics:

> library(psych)
> describe(Galton)
       vars   n  mean   sd median trimmed  mad  min  max range  skew kurtosis   se
parent    1 928 68.31 1.79   68.5   68.32 1.48 64.0 73.0     9 -0.04     0.05 0.06
child     2 928 68.09 2.52   68.2   68.12 2.97 61.7 73.7    12 -0.09    -0.35 0.08

The skew for child is −0.09, indicating a very slight negative skew. A histogram is consistent with this, although the asymmetry is small enough that it takes fairly close inspection to spot:

> hist(child)
