Exploratory Factor Analysis - W. Holmes Finch
The Importance of Theory in Doing Factor Analysis
As we discussed in the previous section, latent variables are not directly observable, and we learn about them only indirectly through their impact on observed indicator variables. This is a very important concept to keep in mind as we move forward in this book, and with factor analysis more generally. How can we know that performance or scores on the observed variables are in fact caused by the latent variable of interest? The short answer is that we cannot know for sure. Indeed, we cannot even know that the latent variable exists. Is depression a concrete, real disease? Is extraversion an actual personality trait? Is there such a thing as reading aptitude? The answer to these questions is that we do not know for sure. How then can we say that an individual suffers from depression, or that Juan is a good reader, or that Yi is an extravert? We can make such statements because we have developed a theoretical model that explains how our observed scores should be linked to these latent variables.
For example, psychologists have drawn on prior empirical research as well as existing theories about mood to construct a theoretical explanation for a set of behaviors that indicate the presence (or absence) of depression. These symptoms might include sleep disturbance (trouble sleeping or sleeping too much), a lack of interest in formerly pleasurable activities, and contemplation of suicide. Taken alone, each of these is simply a behavior that could arise from a variety of sources unique to it. Perhaps an individual has trouble sleeping because he is excited about a coming job change. However, if there is a theoretical basis for linking all of these behaviors together through some common cause (depression), then we can use observed responses on a questionnaire asking about them to make inferences about the latent variable.
Similarly, political scientists have developed conceptual models of political outlook to characterize how people view the world. Some people have views that are characterized as conservative, others have liberal views, and still others fall somewhere in between. This notion of political viewpoint is based on a theoretical model and is believed to drive the attitudes that individuals express regarding particular societal and economic issues, which in turn are manifested in responses to items on surveys. However, as with depression, it is not possible to say with absolute certainty that political viewpoint is a true entity. Rather, we can only develop a model and then assess the extent to which observations taken from nature (i.e., responses to survey questions) match what our theory predicts.
Given this need to provide a rationale for any relationships that we see among observed variables, and that we believe are the result of some unobserved variable, having strong theory is crucial. In short, if we are to make claims about an unobserved variable (or variables) causing observed behaviors, then we need to have some conceptual basis for doing so. Otherwise, the claims about such latent relationships carry no weight. Given that factor analysis is the formalized statistical modeling of these latent variable structures, theory should play an essential role in its use.
This means that prior to conducting factor analysis, we should have a theoretical basis for what we expect to find in terms of the number of latent variables (factors), and for how observed indicator variables will be associated with these factors. This does not mean that we cannot use factor analysis in an exploratory way. Indeed, the entire focus of this text is on exploratory factor analysis. However, it does mean that we should have some sense of what the latent variable structure is likely to be. This translates into having a general sense of the number of factors that we are likely to find (e.g., somewhere between two and four), and of how the observed variables would be expected to group together (e.g., items 1, 3, 5, and 8 should be measuring a common construct and thus should group together on a common factor). Without such a preexisting theory about the likely factor structure, we will not be able to ascertain when we have an acceptable factor solution and when we do not.
Remember, we are using observed data to determine whether predictions from our factor model are accurate. This means that we need a factor model that is sufficiently well developed to make predictions about what the results should look like. For example, what does theory say about the relationship between depression and sleep disturbance? It says that individuals suffering from depression will experience what for them are unusual sleep patterns. Thus, we would expect depressed individuals to report that they are indeed experiencing unusual sleep patterns. In short, having a well-constructed theory about the latent structure that we expect to find is crucial if we are to conduct the factor analysis properly and make good sense of the results that it provides.
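The idea of writing down a predicted grouping before looking at the data can be illustrated with a small simulation. The sketch below is not from the text and is not a factor analysis; it only checks whether items that a hypothetical theory places on a common factor correlate more strongly with one another than with the remaining items. The two-factor structure, the loading of 0.7, the sample size, and the function names `simulate_items` and `mean_correlations` are all assumptions made for illustration.

```python
import numpy as np


def simulate_items(n_people=2000, loading=0.7, seed=0):
    """Simulate 8 item scores driven by two hypothetical, uncorrelated latent factors."""
    rng = np.random.default_rng(seed)
    # Assumed theory: items 1, 3, 5, 8 share one factor; items 2, 4, 6, 7 share another.
    loadings = np.zeros((8, 2))
    loadings[[0, 2, 4, 7], 0] = loading
    loadings[[1, 3, 5, 6], 1] = loading
    factors = rng.standard_normal((n_people, 2))
    # Unique (error) variance is chosen so that every item has unit variance.
    noise = rng.standard_normal((n_people, 8)) * np.sqrt(1 - loading**2)
    return factors @ loadings.T + noise


def mean_correlations(data, group):
    """Return the mean correlation within `group` and between `group` and all other items."""
    corr = np.corrcoef(data, rowvar=False)
    inside = np.array(group)
    outside = np.array([i for i in range(corr.shape[0]) if i not in group])
    within = corr[np.ix_(inside, inside)]
    # Use only the upper triangle so the diagonal of 1s is excluded.
    within_mean = within[np.triu_indices(len(inside), k=1)].mean()
    between_mean = corr[np.ix_(inside, outside)].mean()
    return within_mean, between_mean


data = simulate_items()
predicted_group = [0, 2, 4, 7]  # items 1, 3, 5, 8 as zero-based column indices
within_mean, between_mean = mean_correlations(data, predicted_group)
print(f"within-group mean r = {within_mean:.2f}, between-group mean r = {between_mean:.2f}")
```

Under these assumptions, the model implies a correlation of 0.7 × 0.7 = 0.49 between any two items on the same factor and a correlation of zero between items on different factors, so the within-group mean should land near 0.49 and the between-group mean near zero. With real data, a pattern that departs sharply from the predicted grouping would be a reason to revisit the theory, which is the role the text assigns to it.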