2.26 MODELS IN MATRIX FORM
Throughout the book, our general approach is to first present models in their simplest possible form using only scalars. We then gently introduce the reader to the corresponding matrix counterparts and extensions. Matrices are required in such models to accommodate numerous variables and dimensions. Matrix algebra is the vehicle by which multivariate analysis is communicated, though most of the concepts of statistics can be communicated using simpler scalar algebra. Knowing matrix algebra for its own sake will not necessarily equate to understanding statistical concepts. Indeed, hiding behind the mathematics of statistics are the philosophically “sticky” issues that mathematics or statistics cannot, on their own at least, claim to solve. These are often the problems confronted by researchers and scientists in their empirical pursuits and attempts to draw conclusions from data. For instance, what is the nature of a “correct” model? Do latent variables exist, or are they only a consequence of generating linear combinations? The nature of a latent variable is not necessarily contingent on the linear algebra that seeks to define it. Such questions are largely philosophical, and if they interest you, you are strongly encouraged to familiarize yourself with the philosophy of statistics and mathematics (you may not always find answers to your questions, but you will come to appreciate their complexity, as they are beyond our current study here). For a gentle introduction to the philosophy of statistics, see Lindley (2001).
As an example of how matrices will be used to develop more complete and general models, consider the multivariate general linear model in matrix form:

$$Y = XB + E \qquad (2.7)$$
where Y is an n × m matrix of n observations on m response variables, X is the n × k model or “design” matrix whose k columns contain the regressors, including the intercept term, B is a k × m matrix of regression coefficients, and E is an n × m matrix of errors. Many statistical models can be incorporated into the framework of (2.7). As a relatively easy application of this general model, consider the simple linear regression model (featured in Chapter 7) in matrix form:

$$
\begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{pmatrix}
=
\begin{pmatrix} 1 & x_1 \\ 1 & x_2 \\ \vdots & \vdots \\ 1 & x_n \end{pmatrix}
\begin{pmatrix} \alpha \\ \beta \end{pmatrix}
+
\begin{pmatrix} \varepsilon_1 \\ \varepsilon_2 \\ \vdots \\ \varepsilon_n \end{pmatrix}
$$
where y_1 to y_n are observed measurements on some dependent variable, X is the model matrix containing a constant of 1 in the first column to represent the common intercept term (i.e., “common” implying there is one intercept that represents all observations in our data), x_1 to x_n are observed values on a predictor variable, α is the fixed intercept parameter, β is the slope parameter, which we also assume to be fixed, and ε is a vector of errors ε_1 to ε_n (we use ε here instead of E).
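To make the matrix formulation concrete, the following is a minimal sketch in Python/NumPy (not taken from the text): it assembles the model matrix X from a column of 1s and the predictor values, then estimates the intercept and slope by least squares. The data values are invented purely for illustration.

```python
import numpy as np

# Hypothetical data: n = 5 observations on one predictor and one response
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])

# Model ("design") matrix X: a column of 1s for the common intercept,
# followed by the observed predictor values
X = np.column_stack([np.ones_like(x), x])

# Least-squares estimates of alpha (intercept) and beta (slope)
# in y = X [alpha, beta]' + epsilon
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
alpha_hat, beta_hat = coef
print(alpha_hat, beta_hat)  # estimated intercept and slope
```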
Suppose now we want to add a second response variable. Because of the generality of (2.7), this is easily accommodated:

$$
\begin{pmatrix} y_{1,1} & y_{1,2} \\ y_{2,1} & y_{2,2} \\ \vdots & \vdots \\ y_{n,1} & y_{n,2} \end{pmatrix}
=
\begin{pmatrix} 1 & x_1 \\ 1 & x_2 \\ \vdots & \vdots \\ 1 & x_n \end{pmatrix}
\begin{pmatrix} \alpha_1 & \alpha_2 \\ \beta_1 & \beta_2 \end{pmatrix}
+
\begin{pmatrix} \varepsilon_{1,1} & \varepsilon_{1,2} \\ \varepsilon_{2,1} & \varepsilon_{2,2} \\ \vdots & \vdots \\ \varepsilon_{n,1} & \varepsilon_{n,2} \end{pmatrix}
$$
where now a second response variable is represented in Y by a second column. That is, y_{1,2} corresponds to individual 1 on response variable 2, y_{2,2} to individual 2 on response variable 2, etc. We will at times refer to matrix representations throughout the book.
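Continuing the hedged sketch above (again with made-up data), adding a second response variable amounts to giving Y a second column; the same model matrix X is used, and the coefficient matrix B becomes k × m, with one column of coefficients per response variable.

```python
import numpy as np

# Hypothetical data: the same n = 5 individuals, now with two response variables
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
Y = np.column_stack([
    [2.1, 3.9, 6.2, 8.1, 9.8],   # response variable 1
    [1.0, 1.4, 2.1, 2.4, 3.1],   # response variable 2
])

# Same model matrix as before: intercept column plus the predictor
X = np.column_stack([np.ones_like(x), x])

# B_hat is k x m (here 2 x 2): one column of coefficients per response variable
B_hat, *_ = np.linalg.lstsq(X, Y, rcond=None)
print(B_hat)  # row 0: intercepts; row 1: slopes, one per response
```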