Introduction to Linear Regression Analysis

Publisher: John Wiley & Sons Limited. ISBN: 9781119578758. Genre: Mathematics.


Book Description

A comprehensive and current introduction to the fundamentals of regression analysis. Introduction to Linear Regression Analysis, 6th Edition is the most comprehensive and current examination of the foundations of linear regression analysis. In this fully updated sixth edition, the distinguished authors have included new material on generalized regression techniques and new examples to help the reader understand and retain the concepts taught in the book. The new edition focuses on four key areas of improvement over the fifth edition: new exercises and data sets; new material on generalized regression techniques; the inclusion of JMP software in key areas; and careful condensing of the text where possible. Introduction to Linear Regression Analysis skillfully blends theory and application in both the conventional and less common uses of regression analysis in today's cutting-edge scientific research. The text equips readers to understand the basic principles needed to apply regression model-building techniques in various fields of study, including engineering, management, and the health sciences.

Contents

Douglas C. Montgomery. Introduction to Linear Regression Analysis

Table of Contents

List of Illustrations

Guide

Pages

INTRODUCTION TO LINEAR REGRESSION ANALYSIS

PREFACE

CHANGES IN THE SIXTH EDITION

USING THE BOOK AS A TEXT

ACKNOWLEDGMENTS

ABOUT THE COMPANION WEBSITE

CHAPTER 1. INTRODUCTION. 1.1 REGRESSION AND MODEL BUILDING

1.2 DATA COLLECTION

Example 1.1

Retrospective Study

Observational Study

Designed Experiment

1.3 USES OF REGRESSION

1.4 ROLE OF THE COMPUTER

CHAPTER 2. SIMPLE LINEAR REGRESSION. 2.1 SIMPLE LINEAR REGRESSION MODEL

2.2 LEAST-SQUARES ESTIMATION OF THE PARAMETERS

2.2.1 Estimation of β0 and β1

Example 2.1 The Rocket Propellant Data

Computer Output

2.2.2 Properties of the Least-Squares Estimators and the Fitted Regression Model

2.2.3 Estimation of σ2

Example 2.2 The Rocket Propellant Data

2.2.4 Alternate Form of the Model

2.3 HYPOTHESIS TESTING ON THE SLOPE AND INTERCEPT

2.3.1 Use of t Tests

2.3.2 Testing Significance of Regression

Example 2.3 The Rocket Propellant Data

Minitab Output

2.3.3 Analysis of Variance

Example 2.4 The Rocket Propellant Data

More About the t Test

2.4 INTERVAL ESTIMATION IN SIMPLE LINEAR REGRESSION

2.4.1 Confidence Intervals on β0, β1, and σ2

Example 2.5 The Rocket Propellant Data

2.4.2 Interval Estimation of the Mean Response

Example 2.6 The Rocket Propellant Data

2.5 PREDICTION OF NEW OBSERVATIONS

Example 2.7 The Rocket Propellant Data

2.6 COEFFICIENT OF DETERMINATION

2.7 A SERVICE INDUSTRY APPLICATION OF REGRESSION

2.8 DOES PITCHING WIN BASEBALL GAMES?

2.9 USING SAS® AND R FOR SIMPLE LINEAR REGRESSION

2.10 SOME CONSIDERATIONS IN THE USE OF REGRESSION

2.11 REGRESSION THROUGH THE ORIGIN

Example 2.8 The Shelf-Stocking Data

2.12 ESTIMATION BY MAXIMUM LIKELIHOOD

2.13 CASE WHERE THE REGRESSOR x IS RANDOM

2.13.1 x and y Jointly Distributed

2.13.2 x and y Jointly Normally Distributed: Correlation Model

Example 2.9 The Delivery Time Data

PROBLEMS

CHAPTER 3. MULTIPLE LINEAR REGRESSION

3.1 MULTIPLE REGRESSION MODELS

3.2 ESTIMATION OF THE MODEL PARAMETERS. 3.2.1 Least-Squares Estimation of the Regression Coefficients

Example 3.1 The Delivery Time Data

Computer Output

3.2.2 A Geometrical Interpretation of Least Squares

3.2.3 Properties of the Least-Squares Estimators

3.2.4 Estimation of σ2

Example 3.2 The Delivery Time Data

3.2.5 Inadequacy of Scatter Diagrams in Multiple Regression

3.2.6 Maximum-Likelihood Estimation

3.3 HYPOTHESIS TESTING IN MULTIPLE LINEAR REGRESSION

3.3.1 Test for Significance of Regression

Example 3.3 The Delivery Time Data

Minitab Output

R2 and Adjusted R2

3.3.2 Tests on Individual Regression Coefficients and Subsets of Coefficients

Example 3.4 The Delivery Time Data

Minitab Output

Example 3.5 The Delivery Time Data

3.3.3 Special Case of Orthogonal Columns in X

3.3.4 Testing the General Linear Hypothesis

Example 3.6 Testing Equality of Regression Coefficients

Example 3.7

3.4 CONFIDENCE INTERVALS IN MULTIPLE REGRESSION

3.4.1 Confidence Intervals on the Regression Coefficients

Example 3.8 The Delivery Time Data

3.4.2 CI Estimation of the Mean Response

Example 3.9 The Delivery Time Data

3.4.3 Simultaneous Confidence Intervals on Regression Coefficients

Example 3.10 The Rocket Propellant Data

Example 3.11 The Rocket Propellant Data

3.5 PREDICTION OF NEW OBSERVATIONS

Example 3.12 The Delivery Time Data

3.6 A MULTIPLE REGRESSION MODEL FOR THE PATIENT SATISFACTION DATA

3.7 DOES PITCHING AND DEFENSE WIN BASEBALL GAMES?

3.8 USING SAS AND R FOR BASIC MULTIPLE LINEAR REGRESSION

3.9 HIDDEN EXTRAPOLATION IN MULTIPLE REGRESSION

Example 3.13 Hidden Extrapolation—The Delivery Time Data

3.10 STANDARDIZED REGRESSION COEFFICIENTS

Unit Normal Scaling

Unit Length Scaling

Example 3.14 The Delivery Time Data

3.11 MULTICOLLINEARITY

3.12 WHY DO REGRESSION COEFFICIENTS HAVE THE WRONG SIGN?

PROBLEMS

CHAPTER 4. MODEL ADEQUACY CHECKING. 4.1 INTRODUCTION

4.2 RESIDUAL ANALYSIS. 4.2.1 Definition of Residuals

4.2.2 Methods for Scaling Residuals

Standardized Residuals

Studentized Residuals

PRESS Residuals

R-Student

Example 4.1 The Delivery Time Data

4.2.3 Residual Plots

Normal Probability Plot

Example 4.2 The Delivery Time Data

Plot of Residuals against the Fitted Values

Example 4.3 The Delivery Time Data

Plot of Residuals against the Regressor

Example 4.4 The Delivery Time Data

Plot of Residuals in Time Sequence

4.2.4 Partial Regression and Partial Residual Plots

Example 4.5 The Delivery Time Data

Some Comments on Partial Regression Plots

Partial Residual Plots

4.2.5 Using Minitab®, SAS, and R for Residual Analysis

4.2.6 Other Residual Plotting and Analysis Methods

Statistical Tests on Residuals

4.3 PRESS STATISTIC

Example 4.6 The Delivery Time Data

R2 for Prediction Based on PRESS

Using PRESS to Compare Models

4.4 DETECTION AND TREATMENT OF OUTLIERS

Example 4.7 The Rocket Propellant Data

4.5 LACK OF FIT OF THE REGRESSION MODEL

4.5.1 A Formal Test for Lack of Fit

Example 4.8 Testing for Lack of Fit

Example 4.9 Testing for Lack of Fit in JMP

4.5.2 Estimation of Pure Error from Near Neighbors

Example 4.10 The Delivery Time Data

PROBLEMS

CHAPTER 5. TRANSFORMATIONS AND WEIGHTING TO CORRECT MODEL INADEQUACIES. 5.1 INTRODUCTION

5.2 VARIANCE-STABILIZING TRANSFORMATIONS

Example 5.1 The Electric Utility Data

5.3 TRANSFORMATIONS TO LINEARIZE THE MODEL

Example 5.2 The Windmill Data

5.4 ANALYTICAL METHODS FOR SELECTING A TRANSFORMATION

5.4.1 Transformations on y: The Box-Cox Method

Computational Procedure

An Approximate Confidence Interval for λ

Example 5.3 The Electric Utility Data

5.4.2 Transformations on the Regressor Variables

Example 5.4 The Windmill Data

5.5 GENERALIZED AND WEIGHTED LEAST SQUARES

5.5.1 Generalized Least Squares

5.5.2 Weighted Least Squares

5.5.3 Some Practical Issues

Example 5.5 Weighted Least Squares

5.6 REGRESSION MODELS WITH RANDOM EFFECTS. 5.6.1 Subsampling

Example 5.6 The Helicopter Subsampling Study

5.6.2 The General Situation for a Regression Model with a Single Random Effect

Example 5.7 The Delivery Time Data Revisited

5.6.3 The Importance of the Mixed Model in Regression

PROBLEMS

CHAPTER 6. DIAGNOSTICS FOR LEVERAGE AND INFLUENCE. 6.1 IMPORTANCE OF DETECTING INFLUENTIAL OBSERVATIONS

6.2 LEVERAGE

Example 6.1 The Delivery Time Data

6.3 MEASURES OF INFLUENCE: COOK’S D

Example 6.2 The Delivery Time Data

6.4 MEASURES OF INFLUENCE: DFFITS AND DFBETAS

A Remark on Cutoff Values

Example 6.3 The Delivery Time Data

6.5 A MEASURE OF MODEL PERFORMANCE

Example 6.4 The Delivery Time Data

6.6 DETECTING GROUPS OF INFLUENTIAL OBSERVATIONS

6.7 TREATMENT OF INFLUENTIAL OBSERVATIONS

PROBLEMS

CHAPTER 7. POLYNOMIAL REGRESSION MODELS. 7.1 INTRODUCTION

7.2 POLYNOMIAL MODELS IN ONE VARIABLE. 7.2.1 Basic Principles

Example 7.1 The Hardwood Data

7.2.2 Piecewise Polynomial Fitting (Splines)

Example 7.2 Voltage Drop Data

Example 7.3 Piecewise Linear Regression

7.2.3 Polynomial and Trigonometric Terms

7.3 NONPARAMETRIC REGRESSION

7.3.1 Kernel Regression

7.3.2 Locally Weighted Regression (Loess)

Example 7.4 Applying Loess Regression to the Windmill Data

7.3.3 Final Cautions

7.4 POLYNOMIAL MODELS IN TWO OR MORE VARIABLES

7.5 ORTHOGONAL POLYNOMIALS

Example 7.5 Orthogonal Polynomials

PROBLEMS

CHAPTER 8. INDICATOR VARIABLES. 8.1 GENERAL CONCEPT OF INDICATOR VARIABLES

Example 8.1 The Tool Life Data

Example 8.2 The Tool Life Data

Example 8.3 An Indicator Variable with More Than Two Levels

Example 8.4 More Than One Indicator Variable

Example 8.5 Comparing Regression Models

a. Parallel Lines

b. Concurrent Lines

c. Coincident Lines

8.2 COMMENTS ON THE USE OF INDICATOR VARIABLES. 8.2.1 Indicator Variables versus Regression on Allocated Codes

8.2.2 Indicator Variables as a Substitute for a Quantitative Regressor

8.3 REGRESSION APPROACH TO ANALYSIS OF VARIANCE

PROBLEMS

CHAPTER 9. MULTICOLLINEARITY. 9.1 INTRODUCTION

9.2 SOURCES OF MULTICOLLINEARITY

9.3 EFFECTS OF MULTICOLLINEARITY

Example 9.1 The Acetylene Data

9.4 MULTICOLLINEARITY DIAGNOSTICS

9.4.1 Examination of the Correlation Matrix

9.4.2 Variance Inflation Factors

9.4.3 Eigensystem Analysis of X′X

9.4.4 Other Diagnostics

9.4.5 SAS and R Code for Generating Multicollinearity Diagnostics

9.5 METHODS FOR DEALING WITH MULTICOLLINEARITY

9.5.1 Collecting Additional Data

9.5.2 Model Respecification

9.5.3 Ridge Regression

Example 9.2 The Acetylene Data

Some Other Properties of Ridge Regression

Relationship to Other Estimators

Methods for Choosing k

Generalized Regression Techniques

9.5.4 Principal-Component Regression

Example 9.3 Principal-Component Regression for the Acetylene Data

9.5.5 Comparison and Evaluation of Biased Estimators

9.6 USING SAS TO PERFORM RIDGE AND PRINCIPAL-COMPONENT REGRESSION

PROBLEMS

CHAPTER 10. VARIABLE SELECTION AND MODEL BUILDING. 10.1 INTRODUCTION. 10.1.1 Model-Building Problem

10.1.2 Consequences of Model Misspecification

10.1.3 Criteria for Evaluating Subset Regression Models

Coefficient of Multiple Determination

Adjusted R2

Residual Mean Square

Mallows’s Cp Statistic

The Akaike Information Criterion and Bayesian Analogues (BICs)

Uses of Regression and Model Evaluation Criteria

10.2 COMPUTATIONAL TECHNIQUES FOR VARIABLE SELECTION

10.2.1 All Possible Regressions

Example 10.1 The Hald Cement Data

Efficient Generation of All Possible Regressions

10.2.2 Stepwise Regression Methods

Forward Selection

Example 10.2 Forward Selection—Hald Cement Data

Backward Elimination

Example 10.3 Backward Elimination—Hald Cement Data

Stepwise Regression

Example 10.4 Stepwise Regression—Hald Cement Data

General Comments on Stepwise-Type Procedures

Stopping Rules for Stepwise Procedures

10.3 STRATEGY FOR VARIABLE SELECTION AND MODEL BUILDING

10.4 CASE STUDY: GORMAN AND TOMAN ASPHALT DATA USING SAS

PROBLEMS

CHAPTER 11. VALIDATION OF REGRESSION MODELS. 11.1 INTRODUCTION

11.2 VALIDATION TECHNIQUES

11.2.1 Analysis of Model Coefficients and Predicted Values

Example 11.1 The Hald Cement Data

11.2.2 Collecting Fresh Data—Confirmation Runs

Example 11.2 The Delivery Time Data

11.2.3 Data Splitting

Example 11.3 The Delivery Time Data

11.3 DATA FROM PLANNED EXPERIMENTS

PROBLEMS

CHAPTER 12. INTRODUCTION TO NONLINEAR REGRESSION

12.1 LINEAR AND NONLINEAR REGRESSION MODELS. 12.1.1 Linear Regression Models

12.1.2 Nonlinear Regression Models

12.2 ORIGINS OF NONLINEAR MODELS

Example 12.1

Example 12.2

12.3 NONLINEAR LEAST SQUARES

Example 12.3 Normal Equations for a Nonlinear Model

Geometry of Linear and Nonlinear Least Squares

Maximum-Likelihood Estimation

12.4 TRANSFORMATION TO A LINEAR MODEL

Example 12.4 The Puromycin Data

12.5 PARAMETER ESTIMATION IN A NONLINEAR SYSTEM. 12.5.1 Linearization

Example 12.5 The Puromycin Data

Computer Programs

Estimation of σ2

Graphical Perspective on Linearization

12.5.2 Other Parameter Estimation Methods

Method of Steepest Descent

Fractional Increments

Marquardt’s Compromise

12.5.3 Starting Values

12.6 STATISTICAL INFERENCE IN NONLINEAR REGRESSION

Example 12.6 The Puromycin Data

Validity of Approximate Inference

12.7 EXAMPLES OF NONLINEAR REGRESSION MODELS

12.8 USING SAS AND R

PROBLEMS

CHAPTER 13. GENERALIZED LINEAR MODELS. 13.1 INTRODUCTION

13.2 LOGISTIC REGRESSION MODELS. 13.2.1 Models with a Binary Response Variable

13.2.2 Estimating the Parameters in a Logistic Regression Model

Example 13.1 The Pneumoconiosis Data

13.2.3 Interpretation of the Parameters in a Logistic Regression Model

Example 13.2 The Pneumoconiosis Data

13.2.4 Statistical Inference on Model Parameters

Likelihood Ratio Tests

Testing Goodness of Fit

Testing Hypotheses on Subsets of Parameters Using Deviance

Example 13.3 The Pneumoconiosis Data

Tests on Individual Model Coefficients

Example 13.4 The Pneumoconiosis Data

Confidence Intervals

Example 13.5 The Pneumoconiosis Data

Example 13.6 The Pneumoconiosis Data

Example 13.7 The Pneumoconiosis Data

13.2.5 Diagnostic Checking in Logistic Regression

13.2.6 Other Models for Binary Response Data

13.2.7 More Than Two Categorical Outcomes

13.3 POISSON REGRESSION

Example 13.8 The Aircraft Damage Data

13.4 THE GENERALIZED LINEAR MODEL

13.4.1 Link Functions and Linear Predictors

13.4.2 Parameter Estimation and Inference in the GLM

Example 13.9 The Worsted Yarn Experiment

13.4.3 Prediction and Estimation with the GLM

Example 13.10 The Worsted Yarn Experiment

13.4.4 Residual Analysis in the GLM

Example 13.11 The Worsted Yarn Experiment

13.4.5 Using R to Perform GLM Analysis

13.4.6 Overdispersion

PROBLEMS

CHAPTER 14. REGRESSION ANALYSIS OF TIME SERIES DATA. 14.1 INTRODUCTION TO REGRESSION MODELS FOR TIME SERIES DATA

14.2 DETECTING AUTOCORRELATION: THE DURBIN–WATSON TEST

Example 14.1

14.3 ESTIMATING THE PARAMETERS IN TIME SERIES REGRESSION MODELS

Example 14.2

The Cochrane–Orcutt Method

Example 14.3

The Maximum Likelihood Approach

Example 14.4

Prediction of New Observations and Prediction Intervals

The Case Where the Predictor Variable Must Also Be Forecast

Alternate Forms of the Model

Example 14.5

PROBLEMS

CHAPTER 15. OTHER TOPICS IN THE USE OF REGRESSION ANALYSIS

15.1 ROBUST REGRESSION. 15.1.1 Need for Robust Regression

15.1.2 M-Estimators

Example 15.1 The Stack Loss Data

Computing M-Estimates

15.1.3 Properties of Robust Estimators

Breakdown Point

Efficiency

15.2 EFFECT OF MEASUREMENT ERRORS IN THE REGRESSORS

15.2.1 Simple Linear Regression

15.2.2 The Berkson Model

15.3 INVERSE ESTIMATION—THE CALIBRATION PROBLEM

Example 15.2 Thermocouple Calibration

Other Approaches

15.4 BOOTSTRAPPING IN REGRESSION

15.4.1 Bootstrap Sampling in Regression

15.4.2 Bootstrap Confidence Intervals

Example 15.3 The Delivery Time Data

Example 15.4 The Puromycin Data

15.5 CLASSIFICATION AND REGRESSION TREES (CART)

Example 15.5 The Gasoline Mileage Data

15.6 NEURAL NETWORKS

15.7 DESIGNED EXPERIMENTS FOR REGRESSION

PROBLEMS

APPENDIX A. STATISTICAL TABLES

APPENDIX B. DATA SETS FOR EXERCISES

APPENDIX C. SUPPLEMENTAL TECHNICAL MATERIAL

C.1 BACKGROUND ON BASIC TEST STATISTICS

C.1.1 Central Distributions

C.1.2 Noncentral Distributions

C.2 BACKGROUND FROM THE THEORY OF LINEAR MODELS. C.2.1 Basic Definitions

C.2.2 Matrix Derivatives

C.2.3 Expectations

C.2.4 Distribution Theory

C.3 IMPORTANT RESULTS ON SSR AND SSRES. C.3.1 SSR

C.3.2 SSRes

C.3.3 Global or Overall F Test

C.3.4 Extra-Sum-of-Squares Principle

C.3.5 Relationship of the t Test for an Individual Coefficient and the Extra-Sum-of-Squares Principle

C.4 GAUSS–MARKOV THEOREM, VAR(ε) = σ2I

C.5 COMPUTATIONAL ASPECTS OF MULTIPLE REGRESSION

C.6 RESULT ON THE INVERSE OF A MATRIX

C.7 DEVELOPMENT OF THE PRESS STATISTIC

C.8 DEVELOPMENT OF

C.9 OUTLIER TEST BASED ON R-STUDENT

C.10 INDEPENDENCE OF RESIDUALS AND FITTED VALUES

C.11 GAUSS–MARKOV THEOREM, VAR(ε) = V

C.12 BIAS IN MSRES WHEN THE MODEL IS UNDERSPECIFIED

C.13 COMPUTATION OF INFLUENCE DIAGNOSTICS

C.13.1 DFFITSi

C.13.2 Cook’s Di

C.13.3 DFBETASj,i

C.14 GENERALIZED LINEAR MODELS. C.14.1 Parameter Estimation in Logistic Regression

C.14.2 Exponential Family

C.14.3 Parameter Estimation in the Generalized Linear Model

APPENDIX D. INTRODUCTION TO SAS

D.1 BASIC DATA ENTRY

Step 1: Open the SAS Editor Window

Step 2: The Data Command

Step 3: The Input Command

Step 4: Give the Actual Data

Step 5: Using PROC PRINT to Check Data Entry

D.2 CREATING PERMANENT SAS DATA SETS

Step 1: Specify the Directory for the Permanent Data Set

Step 2: Use the Data Statement to Create the Data Set

D.3 IMPORTING DATA FROM AN EXCEL FILE

Step 1: Export the EXCEL Spreadsheet

Step 2: Get the dbf File into UNIX Format

Step 4: When in Doubt, Contact Your System’s Administrator!

D.4 OUTPUT COMMAND

D.5 LOG FILE

D.6 ADDING VARIABLES TO AN EXISTING SAS DATA SET

APPENDIX E. INTRODUCTION TO R TO PERFORM LINEAR REGRESSION ANALYSIS

E.1 BASIC BACKGROUND ON R

E.2 BASIC DATA ENTRY

E.3 BRIEF COMMENTS ON OTHER FUNCTIONALITY IN R

E.4 R COMMANDER

REFERENCES

INDEX

WILEY SERIES IN PROBABILITY AND STATISTICS

WILEY END USER LICENSE AGREEMENT

Excerpt from the Book

WILEY SERIES IN PROBABILITY AND STATISTICS

.....

Equation (2.43) shows that the issue of extrapolation is more subtle than a simple inside-versus-outside distinction: the farther the x value is from the center of the data, the more variable our estimate of E(y|x0) becomes. Please note, however, that nothing "magical" occurs at the boundary of the x space. It is not reasonable to think that the prediction is wonderful at the observed data value most remote from the center of the data and completely awful just beyond it. Rather, Eq. (2.43) indicates that we should be concerned about prediction quality as we approach the boundary and that, as we move beyond this boundary, the prediction may deteriorate rapidly. Furthermore, the farther we move away from the original region of x space, the more likely it is that equation or model error will play a role in the process.

This is not the same thing as saying "never extrapolate." Engineers and economists routinely use prediction equations to forecast a variable of interest one or more time periods in the future. Strictly speaking, this forecast is an extrapolation, and Eq. (2.43) supports such use of the prediction equation. However, Eq. (2.43) does not support using the regression model to forecast many periods into the future. Generally, the greater the extrapolation, the greater the chance that equation error or model error will impact the results.
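The behavior described above is easy to check numerically. What follows is a minimal R sketch (R is introduced in Appendix E); it uses synthetic data rather than any of the book's data sets, and all numbers are purely illustrative. For simple linear regression, the confidence interval on the mean response E(y|x0) has half-width t(α/2, n−2)·sqrt(MS_Res[1/n + (x0 − x̄)²/Sxx]), so its width grows smoothly with (x0 − x̄)² rather than jumping at the boundary of the data:

    # Fit a simple linear regression on synthetic data (x runs from 1 to 10)
    set.seed(1)
    x <- seq(1, 10, length.out = 20)
    y <- 3 + 2 * x + rnorm(20, sd = 1)
    fit <- lm(y ~ x)

    # Width of the 95% CI on E(y|x0) at the center of the data (x-bar),
    # at the boundary (x = 10), and at two extrapolated points
    x0 <- data.frame(x = c(mean(x), 10, 15, 25))
    ci <- predict(fit, newdata = x0, interval = "confidence", level = 0.95)
    cbind(x0, width = ci[, "upr"] - ci[, "lwr"])

The interval is narrowest at mean(x) and widens steadily through and beyond x = 10, with no discontinuity at the edge of the observed data. Prediction intervals for a new observation (interval = "prediction") behave the same way, and neither interval accounts for the equation or model error that becomes more likely as the extrapolation grows.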

.....
