The Big R-Book - Philippe J. S. De Brouwer
Table of Contents
1 Cover
2 Foreword
5 Preface
7 PART I: Introduction
  1 The Big Picture with Kondratiev and Kardashev
    Notes
  2 The Scientific Method and Data
    Note
  3 Conventions
    Notes
8 PART II: Starting with R and Elements of Statistics
  4 The Basics of R
    4.1 Getting Started with R
    4.2 Variables
    4.3 Data Types
    4.4 Operators
    4.5 Flow Control Statements
    4.6 Functions
    4.7 Packages
    4.8 Selected Data Interfaces
    Notes
  5 Lexical Scoping and Environments
    5.1 Environments in R
    5.2 Lexical Scoping in R
    Note
  6 The Implementation of OO
    6.1 Base Types
    6.2 S3 Objects
    6.3 S4 Objects
    6.4 The Reference Class, refclass, RC or R5 Model
    6.5 Conclusions about the OO Implementation
    Notes
  7 Tidy R with the Tidyverse
    7.1 The Philosophy of the Tidyverse
    7.2 Packages in the Tidyverse
    7.3 Working with the Tidyverse
    Notes
  8 Elements of Descriptive Statistics
    8.1 Measures of Central Tendency
    8.2 Measures of Variation or Spread
    8.3 Measures of Covariation
    8.4 Distributions
    8.5 Creating an Overview of Data Characteristics
    Notes
  9 Visualisation Methods
    9.1 Scatterplots
    9.2 Line Graphs
    9.3 Pie Charts
    9.4 Bar Charts
    9.5 Boxplots
    9.6 Violin Plots
    9.7 Histograms
    9.8 Plotting Functions
    9.9 Maps and Contour Plots
    9.10 Heat-maps
    9.11 Text Mining
    9.12 Colours in R
    Notes
  10 Time Series Analysis
    10.1 Time Series in R
    10.2 Forecasting
    Note
  11 Further Reading
9 PART III: Data Import
  12 A Short History of Modern Database Systems
    Notes
  13 RDBMS
    Notes
  14 SQL
    14.1 Designing the Database
    14.2 Building the Database Structure
    14.3 Adding Data to the Database
    14.4 Querying the Database
    14.5 Modifying the Database Structure
    14.6 Selected Features of SQL
    Notes
  15 Connecting R to an SQL Database
    Note
10 PART IV: Data Wrangling
  16 Anonymous Data
    Notes
  17 Data Wrangling in the tidyverse
    17.1 Importing the Data
    17.2 Tidy Data
    17.3 Tidying Up Data with tidyr
    17.4 SQL-like Functionality via dplyr
    17.5 String Manipulation in the tidyverse
    17.6 Dates with lubridate
    17.7 Factors with Forcats
    Notes
  18 Dealing with Missing Data
    18.1 Reasons for Data to be Missing
    18.2 Methods to Handle Missing Data
    18.3 R Packages to Deal with Missing Data
    Notes
  19 Data Binning
    19.1 What is Binning and Why Use It
    19.2 Tuning the Binning Procedure
    19.3 More Complex Cases: Matrix Binning
    19.4 Weight of Evidence and Information Value
    Notes
  20 Factoring Analysis and Principle Components
    20.1 Principle Components Analysis (PCA)
    20.2 Factor Analysis
    Note
11 PART V: Modelling
  21 Regression Models
    21.1 Linear Regression
    21.2 Multiple Linear Regression
    21.3 Performance of Regression Models
  22 Classification Models
    22.1 Logistic Regression
    22.2 Performance of Binary Classification Models
    Notes
  23 Learning Machines
    23.1 Decision Tree
    23.2 Random Forest
    23.3 Artificial Neural Networks (ANNs)
    23.4 Support Vector Machine
    23.5 Unsupervised Learning and Clustering
    Notes
  24 Towards a Tidy Modelling Cycle with modelr
    24.1 Adding Predictions
    24.2 Adding Residuals
    24.3 Bootstrapping Data
    24.4 Other Functions of modelr
  25 Model Validation
    25.1 Model Quality Measures
    25.2 Predictions and Residuals
    25.3 Bootstrapping
    25.4 Cross-Validation
    25.5 Validation in a Broader Perspective
    Notes
  26 Labs
    26.1 Financial Analysis with quantmod
    Notes
  27 Multi Criteria Decision Analysis (MCDA)
    27.1 What and Why
    27.2 General Work-flow
    27.3 Identify the Issue at Hand: Steps 1 and 2
    27.4 Step 3: the Decision Matrix
    27.5 Step 4: Delete Inefficient and Unacceptable Alternatives
    27.6 Plotting Preference Relationships
    27.7 Step 5: MCDA Methods
    27.8 Summary MCDA
    Notes
12 PART VI: Introduction to Companies
  28 Financial Accounting (FA)
    28.1 The Statements of Accounts
    28.2 The Value Chain
    28.3 Further, Terminology
    28.4 Selected Financial Ratios
    Notes
  29 Management Accounting
    29.1 Introduction
    29.2 Selected Methods in MA
    29.3 Selected Use Cases of MA
    Notes
  30 Asset Valuation Basics
    30.1 Time Value of Money
    30.2 Cash
    30.3 Bonds
    30.4 The Capital Asset Pricing Model (CAPM)
    30.5 Equities
    30.6 Forwards and Futures
    30.7 Options
    Notes
13 PART VII: Reporting
  31 A Grammar of Graphics with ggplot2
    31.1 The Basics of ggplot2
    31.2 Over-plotting
    31.3 Case Study for ggplot2
    Notes
  32 R Markdown
    Note
  33 knitr and LaTeX
    Notes
  34 An Automated Development Cycle
  35 Writing and Communication Skills
    Note
  36 Interactive Apps
    36.1 Shiny
    36.2 Browser Born Data Visualization
    36.3 Dashboards
    Notes
14 PART VIII: Bigger and Faster R
  37 Parallel Computing
    37.1 Combine foreach and doParallel
    37.2 Distribute Calculations over LAN with Snow
    37.3 Using the GPU
    Notes
  38 R and Big Data
    38.1 Use a Powerful Server
    38.2 Using more Memory than we have RAM
    Notes
  39 Parallelism for Big Data
    39.1 Apache Hadoop
    39.2 Apache Spark
    Notes
  40 The Need for Speed
    40.1 Benchmarking
    40.2 Optimize Code
    40.3 Profiling Code
    40.4 Optimize Your Computer
    Notes
15 PART IX: Appendices
  A Create your own R Package
    A.1 Creating the Package in the R Console
    A.2 Update the Package Description
    A.3 Documenting the Functions
    A.4 Loading the Package
    A.5 Further Steps
    Notes
  B Levels of Measurement
    B.1 Nominal Scale
    B.2 Ordinal Scale
    B.3 Interval Scale
    B.4 Ratio Scale
  C Trademark Notices
    C.1 General Trademark Notices
    C.2 R-Related Notices
  D Code Not Shown in the Body of the Book
  E Answers to Selected Questions
    Note
16 Bibliography
17 Nomenclature
18 Index