Discovering Partial Least Squares with JMP - Marie A. Gaudard
Contents
Accessing the Supplementary Content
Chapter 1 Introducing Partial Least Squares
Partial Least Squares in Today’s World
Transforming, and Centering and Scaling Data
Chapter 2 A Review of Multiple Linear Regression
Underfitting and Overfitting: A Simulation
The Effect of Correlation among Predictors: A Simulation
Chapter 3 Principal Components Analysis: A Brief Visit
Centering and Scaling: An Example
The Importance of Exploratory Data Analysis in Multivariate Studies
Dimensionality Reduction via PCA
Chapter 4 A Deeper Understanding of PLS
PLS as a Multivariate Technique
An Example Exploring Prediction
One-Factor NIPALS Model
Two-Factor NIPALS Model
Variable Selection
SIMPLS Fits
Choosing the Number of Factors
Cross Validation
Types of Cross Validation
A Simulation of K-Fold Cross Validation
Validation in the PLS Platform
The NIPALS and SIMPLS Algorithms
Useful Things to Remember About PLS
Chapter 5 Predicting Biological Activity
Background
The Data
Data Table Description
Initial Data Visualization
A First PLS Model
Our Plan
Performing the Analysis
The Partial Least Squares Report
The SIMPLS Fit Report
Other Options
A Pruned PLS Model
Model Fit
Diagnostics
Performance on Data from Second Study
Comparing Predicted Values for the Second Study to Actual Values
Comparing Residuals for Both Studies
Obtaining Additional Insight
Conclusion
Chapter 6 Predicting the Octane Rating of Gasoline
Background
The Data
Data Table Description
Creating a Test Set Indicator Column
Viewing the Data
Octane and the Test Set
Creating a Stacked Data Table
Constructing Plots of the Individual Spectra
Individual Spectra
Combined Spectra
A First PLS Model
Excluding the Test Set
Fitting the Model
The Initial Report
A Second PLS Model
Fitting the Model
High-Level Overview
Diagnostics
Score Scatterplot Matrices
Loading Plots
VIPs
Model Assessment Using Test Set
A Pruned Model
Chapter 7 Water Quality in the Savannah River Basin
Background
The Data
Data Table Description
Initial Data Visualization
Missing Response Values
Impute Missing Data
Distributions
Transforming AGPT
Differences by Ecoregion
Conclusions from Visual Analysis and Implications
A First PLS Model for the Savannah River Basin
Our Plan
Performing the Analysis
The Partial Least Squares Report
The NIPALS Fit Report
Defining a Pruned Model
A Pruned PLS Model for the Savannah River Basin
Model Fit
Diagnostics
Saving the Prediction Formulas
Comparing Actual Values to Predicted Values for the Test Set
A First PLS Model for the Blue Ridge Ecoregion
Making the Subset
Reviewing the Data
Performing the Analysis
The NIPALS Fit Report
A Pruned PLS Model for the Blue Ridge Ecoregion
Model Fit
Comparing Actual Values to Predicted Values for the Test Set
Conclusion
Chapter 8 Baking Bread That People Like
Background
The Data
Data Table Description
Missing Data Check
The First Stage Model
Visual Exploration of Overall Liking and Consumer Xs
The Plan for the First Stage Model
Stage One PLS Model
Stage One Pruned PLS Model
Stage One MLR Model
Comparing the Stage One Models
Visual Exploration of Ys and Xs
Stage Two PLS Model
Stage Two MLR Model
The Combined Model for Overall Liking
Constructing the Prediction Formula
Viewing the Profiler
Conclusion
Ground Rules
The Singular Value Decomposition of a Matrix
Definition
Relationship to Spectral Decomposition
Other Useful Facts
Principal Components Regression
The Idea behind PLS Algorithms
NIPALS
The NIPALS Algorithm
Computational Results
Properties of the NIPALS Algorithm
SIMPLS
Optimization Criterion
Implications for the Algorithm
The SIMPLS Algorithm
More on VIPs
The Standardize X Option
Determining the Number of Factors
Cross Validation: How JMP Does It
Appendix 2: Simulation Studies
Introduction
The Bias-Variance Tradeoff in PLS
Introduction
Two Simple Examples
Motivation
The Simulation Study
Results and Discussion
Conclusion
Using PLS for Variable Selection
Introduction
Structure of the Study
The Simulation
Computation of Result Measures
Results
Conclusion