Elementary Regression Modeling - Roger A. Wojtkiewicz

Preface

Purpose of the Book

Social scientists use regression analysis extensively to analyze quantitative data, and as a result, there are many books available about regression analysis. My book makes a unique contribution by using a discrete approach to discuss regression modeling. I wrote my book with two key purposes in mind. One was to present an approach to understanding regression analysis that is more straightforward and understandable than the usual explanation. Over the course of many years teaching regression analysis to students, I have developed this discrete approach for explaining regression analysis. My discrete approach views the coefficients for the independent variables in a regression equation as capturing group differences on the dependent variable. This approach contrasts with the usual approach to explaining regression analysis, which involves continuous independent and dependent variables and estimating a line through a cloud of points. The discrete approach builds on simpler discrete analyses in statistics that use chi-square and t tests. My students have found this approach to be easily understandable.
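The idea that a coefficient captures a group difference can be sketched in a few lines of code. The example below is only an illustration under stated assumptions: the data values and the names `group` and `score` are made up and do not come from the book. It fits a regression with a single dummy variable and compares the result with the two group means.

```python
import numpy as np

# Hypothetical data (made up for illustration): an outcome for two groups.
group = np.array([0, 0, 0, 0, 1, 1, 1, 1])   # dummy variable: 0 = group A, 1 = group B
score = np.array([10.0, 12.0, 11.0, 13.0, 15.0, 17.0, 16.0, 18.0])

# Design matrix: a column of ones (the intercept) plus the dummy variable.
X = np.column_stack([np.ones_like(score), group])

# Ordinary least squares using NumPy's least-squares solver.
coef, *_ = np.linalg.lstsq(X, score, rcond=None)
intercept, dummy_coef = coef

mean_a = score[group == 0].mean()
mean_b = score[group == 1].mean()

print(intercept, mean_a)             # the intercept matches the mean of group A
print(dummy_coef, mean_b - mean_a)   # the dummy coefficient matches the difference in means
```

In this sketch the regression reproduces what a simple comparison of means would show, which is the sense in which the coefficient can be read as a group difference.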

My second purpose was to provide a source on how to do regression modeling in social science research. I define regression modeling in my book as the use of one or more regression equations to examine a particular coefficient in greater depth. To do that, we need to address research hypotheses. The many books on regression analysis cover well the standard continuous, cloud-of-points approach to explaining regression analysis, technical issues in regard to estimating appropriate regression models given the data at hand, and advanced methods for doing regression analysis. However, what has not been addressed thoroughly is how to use a series of multiple regression equations to examine the contributions of control variables, different approaches to estimating interaction effects, and methods for examining linearity in regression effects.

Many social science graduate programs include a two-semester sequence in statistics with the second semester focusing on regression analysis. In my experience, students who have taken such a sequence understand what regression is, how to use a statistical program to estimate regression coefficients, and which technical issues may cause problems in regressions. However, I also find that after this course many students are still not equipped to answer research questions by using regression analysis. My purpose in writing this book is to provide tools involving control modeling, interaction modeling, and using splines to model linearity that students and researchers can apply to address social science research hypotheses.

Intended Audience

There are two primary audiences for this book. First, this book will be a valuable resource for instructors and students in the second-semester regression course that is taught in many social science graduate programs. I envision this book being a second required text for such a course. The primary required book will be one of the standard texts on regression that covers basic regression analysis, technical issues, and more advanced models. My book would then allow instructors to build on the standard text by adding course content on how to apply regression modeling to address research hypotheses. Using the chapters in my book on control modeling, modeling interactions, using splines to model linearity, and research hypotheses will better prepare students for theses, dissertations, and research projects, and for contributing to faculty research projects.

The second intended audience for my book is experienced researchers who use regression analysis on an ongoing basis. Although the first three chapters of the book present the discrete approach to understanding regression analysis and deal with the basics of regression analysis, experienced researchers may find this alternative way of thinking about regression analysis informative to their overall understanding of regression. In addition, those who teach undergraduate statistics will find the discrete approach to be a much more accessible way for students to understand regression analysis than the explanations found in most undergraduate statistics books. The chapters on regression modeling concepts, control modeling, modeling interactions, and modeling linearity with splines present more sophisticated ideas. The chapters deal with these issues in an innovative way and in a more in-depth manner than found in other references on regression analysis. I envision my book as the kind that researchers will keep on their bookshelves and often refer to when working on regression modeling issues.

Unique Contributions

My book makes several contributions to understanding regression and regression modeling:

 Discrete approach: A key contribution of the book is the discrete approach to understanding regression analysis. The discrete approach builds on simple differences between groups to explain regression and regression modeling. Although the discrete approach is fully consistent with the common linear, cloud-of-points approach to explaining regression, the discrete approach is much more immediately understandable.

 Means and log odds: The book starts out by looking at means and log odds and then works up to dummy variable regression. This is a simple-to-understand way to approach regression as opposed to the usual abstract starting point that involves estimating a least-squares line through a cloud of points. The book also demystifies logistic regression by showing how the log odds is related to percentages and proportions and how logistic regression with a dummy independent variable just estimates a difference in log odds between two groups.

 Linearity as jumps between groups: The discrete approach starts with dummy variables and then moves to understanding interval or continuous variables. The book shows that a coefficient for an interval variable can be viewed as an equality constraint on the jumps in the mean between groups as one goes from lower groups on the independent variable to higher groups.

 Unit vector, nestedness, higher order differences, constraints: The book explicitly and in detail addresses issues that receive little or no direct coverage in other books on regression analysis. Most books on regression analysis do not give much attention to the unit vector, but as my book shows, understanding the role of the unit vector is important to understanding what the other coefficients mean in regression analysis. In the background of any discussion of regression analysis are the concepts of nestedness, higher order differences, and constraints. My book brings these concepts to the forefront and shows how they apply in different modeling situations.

 Control modeling: Starting with a model with a smaller number of variables and building to larger models with more variables is one of the most, if not the most, commonly used modeling approaches in social science research. Surprisingly, other books on regression give little attention to this issue. My book brings to the forefront the underlying issue in control modeling, which is the correlation between independent variables. The book provides alternative modeling approaches for dealing with the problem of how order of entry of independent variables affects the nature of the resulting explanation. The book introduces new terminology for control models: one-at-a-time with no controls, one-at-a-time with controls, step model, and hybrid model. The book also shows how the method of demographic standardization can be used to understand what actually happens when a control variable is added to a model.

 Modeling interactions: Creating interaction variables is a simple task: merely multiply two variables. However, interpreting the meaning of an interaction coefficient in a regression model is a much more difficult task. By limiting the discussion to interactions between dummy variables and interactions between dummy variables and interval variables, the book keeps the explanation at a concrete level and avoids being too abstract. The book introduces the concept of the within-group model. The book shows that for every standard interaction model, there is a within-group interaction model that the standard model addresses. The book gives the researcher the tools to understand more fully what happens when interaction variables are added to a regression model.

 Modeling linearity with splines: Among social scientists, spline variables receive the most attention from economists. My book brings spline variables into the broader discourse in the social sciences and uses spline variables as a way to examine linearity in regression analysis more simply. The common way to model linearity in regression is to introduce a squared variable into the model. A key drawback of this approach is that the coefficient for a squared variable is difficult to interpret and often requires graphing the results. My book suggests using spline variables as an alternative way to examine linearity. The book presents a thorough discussion of two related types of spline models, segment splines and difference splines. The book shows how the results from spline model regressions are immediately interpretable. Thus, the book provides researchers with an alternative and more understandable way to examine linearity than the more commonly used method of squared variables.

 Research hypotheses: The last chapter of the book comes at the end but is no less important. The chapter discusses ways to formulate research hypotheses that are amenable to testing by using regression models. Formulating research hypotheses is a topic that receives little attention in most regression books. Research hypotheses connect the literature review to the regression analysis. The material on research hypotheses gives beginning researchers, in particular, a conceptual tool for understanding how a regression model will relate to theoretical issues.
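The point made above under means and log odds, that logistic regression with a dummy independent variable estimates a difference in log odds between two groups, can be checked with a short sketch. Everything in it is an assumption made for illustration: the counts are invented, and the model is fit with a plain Newton-Raphson loop rather than with whatever statistical program the book uses.

```python
import math
import numpy as np

# Hypothetical counts (made up): number answering "yes" out of n in each group.
yes_a, n_a = 30, 100   # group A (dummy = 0)
yes_b, n_b = 60, 100   # group B (dummy = 1)

def log_odds(p):
    """Log odds for a proportion p, with 0 < p < 1."""
    return math.log(p / (1.0 - p))

# Difference in log odds computed directly from the two proportions.
direct_diff = log_odds(yes_b / n_b) - log_odds(yes_a / n_a)

# Expand the counts into one row per case so a logistic regression can be fit.
d = np.concatenate([np.zeros(n_a), np.ones(n_b)])
y = np.concatenate([
    np.ones(yes_a), np.zeros(n_a - yes_a),   # group A outcomes
    np.ones(yes_b), np.zeros(n_b - yes_b),   # group B outcomes
])
X = np.column_stack([np.ones_like(d), d])    # intercept column plus the dummy

# Maximum likelihood by Newton-Raphson, starting from zero coefficients.
beta = np.zeros(2)
for _ in range(25):
    p = 1.0 / (1.0 + np.exp(-X @ beta))              # predicted probabilities
    gradient = X.T @ (y - p)                         # score vector
    hessian = X.T @ (X * (p * (1.0 - p))[:, None])   # information matrix
    beta = beta + np.linalg.solve(hessian, gradient)

print(beta[0], log_odds(yes_a / n_a))   # intercept: log odds for group A
print(beta[1], direct_diff)             # dummy coefficient: difference in log odds
```

Under these assumptions the fitted dummy coefficient agrees with the difference in log odds calculated by hand from the two proportions, so the regression adds no mystery beyond the group comparison.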

Special Features

The book provides a discussion of regression analysis that takes a different approach from most other books in the following ways:

 Analytical tables: Throughout the book, every time an issue of data analysis is discussed, the book presents a table that is directly related. The computer output from a regression analysis is just a first step in a regression analysis. Producing a table that presents the results in an understandable way and then discussing that table is the key step in a regression analysis. The book provides many tables, and most of the tables are like those that would be found in research writing. Thus, the book illustrates throughout what is the end goal of a regression analysis, which is a table and a discussion.

 Representing variables as matrices: In the chapters on interaction modeling and control modeling, the book illustrates variables by using a matrix approach. Being able to see how nestedness works in the discussion of within-group and standard interaction models and in the discussion of segment spline and difference spline models is the key to understanding how the various models work. Presenting the variables by using matrices provides the opportunity to visualize the data on which the regression models are based.

 Data and statistical code available to replicate models: Reading about regression modeling takes a researcher part of the way to understanding regression modeling. Actually running regression models takes the researcher the rest of the way. The book provides a link to the High School Longitudinal Study data used throughout the book and provides the statistical program code for creating the exact data file used in the book. The book also provides the statistical program code to create all the variables used in the book. The data file and statistical program code are also available on the SAGE website at http://study.sagepub.com/wojtkiewicz. Providing these materials gives the researcher the opportunity to read about a statistical modeling approach and then to learn how to run those same models to produce correct results.

 Answers to chapter exercises: The chapter exercises ask users of the book to replicate analyses in the book and to create new analyses. The section of the book on answers to chapter exercises presents the results for the new analyses. The opportunity to replicate results and to create new results is the key to fully understanding how to do regression modeling.

So, it is time to get started. Good luck, and enjoy your search for results!
