Computational Statistics in Data Science

Table of Contents

Cover

Title Page

Copyright

List of Contributors

Preface
  Reference

Part I: Computational Statistics and Data Science
1 Computational Statistics and Data Science in the Twenty‐First Century
  1 Introduction
  2 Core Challenges 1–3
  3 Model‐Specific Advances
  4 Core Challenges 4 and 5
  5 Rise of Data Science
  Acknowledgments
  Notes
  References
2 Statistical Software
  1 User Development Environments
  2 Popular Statistical Software
  3 Noteworthy Statistical Software and Related Tools
  4 Promising and Emerging Statistical Software
  5 The Future of Statistical Computing
  6 Concluding Remarks
  Acknowledgments
  References
  Further Reading
3 An Introduction to Deep Learning Methods
  1 Introduction
  2 Machine Learning: An Overview
  3 Feedforward Neural Networks
  4 Convolutional Neural Networks
  5 Autoencoders
  6 Recurrent Neural Networks
  7 Conclusion
  References
4 Streaming Data and Data Streams
  1 Introduction
  2 Data Stream Computing
  3 Issues in Data Stream Mining
  4 Streaming Data Tools and Technologies
  5 Streaming Data Pre‐Processing: Concept and Implementation
  6 Streaming Data Algorithms
  7 Strategies for Processing Data Streams
  8 Best Practices for Managing Data Streams
  9 Conclusion and the Way Forward
  References

Part II: Simulation‐Based Methods
5 Monte Carlo Simulation: Are We There Yet?
  1 Introduction
  2 Estimation
  3 Sampling Distribution
  4 Estimating
  5 Stopping Rules
  6 Workflow
  7 Examples
  References
6 Sequential Monte Carlo: Particle Filters and Beyond
  1 Introduction
  2 Sequential Importance Sampling and Resampling
  3 SMC in Statistical Contexts
  4 Selected Recent Developments
  Acknowledgments
  Note
  References
7 Markov Chain Monte Carlo Methods, A Survey with Some Frequent Misunderstandings
  1 Introduction
  2 Monte Carlo Methods
  3 Markov Chain Monte Carlo Methods
  4 Approximate Bayesian Computation
  5 Further Reading
  Abbreviations and Acronyms
  Notes
  References
  Note
8 Bayesian Inference with Adaptive Markov Chain Monte Carlo
  1 Introduction
  2 Random‐Walk Metropolis Algorithm
  3 Adaptation of Random‐Walk Metropolis
  4 Multimodal Targets with Parallel Tempering
  5 Dynamic Models with Particle Filters
  6 Discussion
  Acknowledgments
  Notes
  References
9 Advances in Importance Sampling
  1 Introduction and Problem Statement
  2 Importance Sampling
  3 Multiple Importance Sampling (MIS)
  4 Adaptive Importance Sampling (AIS)
  Acknowledgments
  Notes
  References

Part III: Statistical Learning
10 Supervised Learning
  1 Introduction
  2 Penalized Empirical Risk Minimization
  3 Linear Regression
  4 Classification
  5 Extensions for Complex Data
  6 Discussion
  References
11 Unsupervised and Semisupervised Learning
  1 Introduction
  2 Unsupervised Learning
  3 Semisupervised Learning
  4 Conclusions
  Acknowledgment
  Notes
  References
12 Random Forests
  1 Introduction
  2 Random Forest (RF)
  3 Random Forest Extensions
  4 Random Forests of Interaction Trees (RFIT)
  5 Random Forest of Interaction Trees for Observational Studies
  6 Discussion
  References
13 Network Analysis
  1 Introduction
  2 Gaussian Graphical Models for Mixed Partial Compositional Data
  3 Theoretical Properties
  4 Graphical Model Selection
  5 Analysis of a Microbiome–Metabolomics Data
  6 Discussion
  References
14 Tensors in Modern Statistical Learning
  1 Introduction
  2 Background
  3 Tensor Supervised Learning
  4 Tensor Unsupervised Learning
  5 Tensor Reinforcement Learning
  6 Tensor Deep Learning
  Acknowledgments
  References
15 Computational Approaches to Bayesian Additive Regression Trees
  1 Introduction
  2 Bayesian CART
  3 Tree MCMC
  4 The BART Model
  5 BART Example: Boston Housing Values and Air Pollution
  6 BART MCMC
  7 BART Extensions
  8 Conclusion
  References

Part IV: High‐Dimensional Data Analysis
16 Penalized Regression
  1 Introduction
  2 Penalization for Smoothness
  3 Penalization for Sparsity
  4 Tuning Parameter Selection
  References
17 Model Selection in High‐Dimensional Regression
  1 Model Selection Problem
  2 Model Selection in High‐Dimensional Linear Regression
  3 Interaction‐Effect Selection for High‐Dimensional Data
  4 Model Selection in High‐Dimensional Nonparametric Models
  5 Concluding Remarks
  References
18 Sampling Local Scale Parameters in High-Dimensional Regression Models
  1 Introduction
  2 A Blocked Gibbs Sampler for the Horseshoe
  3 Sampling
  4 Sampling
  5 Appendix: A. Newton–Raphson Steps for the Inverse‐cdf Sampler for
  Acknowledgment
  References
  Note
19 Factor Modeling for High-Dimensional Time Series
  1 Introduction
  2 Identifiability
  3 Estimation of High‐Dimensional Factor Model
  4 Determining the Number of Factors
  Acknowledgment
  References

Part V: Quantitative Visualization
20 Visual Communication of Data: It Is Not a Programming Problem, It Is Viewer Perception
  1 Introduction
  2 Case Studies Part 1
  3 Let StAR Be Your Guide
  4 Case Studies Part 2: Using StAR Principles to Develop Better Graphics
  5 Ask Colleagues Their Opinion
  6 Case Studies: Part 3
  7 Iterate
  8 Final Thoughts
  Notes
  References
21 Uncertainty Visualization
  1 Introduction
  2 Uncertainty Visualization Theories
  3 General Discussion
  References
22 Big Data Visualization
  1 Introduction
  2 Architecture for Big Data Analytics
  3 Filtering
  4 Aggregating
  5 Analyzing
  6 Big Data Graphics
  7 Conclusion
  References
23 Visualization‐Assisted Statistical Learning
  1 Introduction
  2 Better Visualizations with Seriation
  3 Visualizing Machine Learning Fits
  4 Condvis2 Case Studies
  5 Discussion
  References
24 Functional Data Visualization
  1 Introduction
  2 Univariate Functional Data Visualization
  3 Multivariate Functional Data Visualization
  4 Conclusions
  Acknowledgment
  References

Part VI: Numerical Approximation and Optimization
25 Gradient‐Based Optimizers for Statistics and Machine Learning
  1 Introduction
  2 Convex Versus Nonconvex Optimization
  3 Gradient Descent
  4 Proximal Gradient Descent: Handling Nondifferentiable Regularization
  5 Stochastic Gradient Descent
  References
26 Alternating Minimization Algorithms
  1 Introduction
  2 Coordinate Descent
  3 EM as Alternating Minimization
  4 Matrix Approximation Algorithms
  5 Conclusion
  References
27 A Gentle Introduction to Alternating Direction Method of Multipliers (ADMM) for Statistical Problems
  1 Introduction
  2 Two Perfect Examples of ADMM
  3 Variable Splitting and Linearized ADMM
  4 Multiblock ADMM
  5 Nonconvex Problems
  6 Stopping Criteria
  7 Convergence Results of ADMM
  Acknowledgments
  References
28 Nonconvex Optimization via MM Algorithms: Convergence Theory
  1 Background
  2 Convergence Theorems
  3 Paracontraction
  4 Bregman Majorization
  References

Part VII: High‐Performance Computing
29 Massive Parallelization
  1 Introduction
  2 Gaussian Process Regression and Surrogate Modeling
  3 Divide‐and‐Conquer GP Regression
  4 Empirical Results
  5 Conclusion
  Acknowledgments
  References
30 Divide‐and‐Conquer Methods for Big Data Analysis
  1 Introduction
  2 Linear Regression Model
  3 Parametric Models
  4 Nonparametric and Semiparametric Models
  5 Online Sequential Updating
  6 Splitting the Number of Covariates
  7 Bayesian Divide‐and‐Conquer and Median‐Based Combining
  8 Real‐World Applications
  9 Discussion
  Acknowledgment
  References
31 Bayesian Aggregation
  1 From Model Selection to Model Combination
  2 From Bayesian Model Averaging to Bayesian Stacking
  3 Asymptotic Theories of Stacking
  4 Stacking in Practice
  5 Discussion
  References
32 Asynchronous Parallel Computing
  1 Introduction
  2 Asynchronous Parallel Coordinate Update
  3 Asynchronous Parallel Stochastic Approaches
  4 Doubly Stochastic Coordinate Optimization with Variance Reduction
  5 Concluding Remarks
  References

Index

Abbreviations and Acronyms

End User License Agreement
