Probability with R - Jane M. Horgan

2.4 Programming in R

One of the great benefits of R is that it is possible to write your own programs and use them as functions in your analysis. Programming is extremely simple in R because of the way it handles vectors and data frames. To illustrate, let us write a program to calculate the mean of downtime. The formula for the mean of a variable $x$ with $n$ values $x_1, x_2, \ldots, x_n$ is given by

$$\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$$

In standard programming languages, implementing this formula would necessitate initialization and loops, but with R, statistical calculations such as these are much easier to implement. For example,

sum(downtime)

gives

576

which is the sum of the elements in downtime. Similarly,

length(downtime)

gives

23

which is the number of elements in downtime.

To calculate the mean, write

meandown <- sum(downtime)/length(downtime)
meandown
[1] 25.04348
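To make the contrast with loop-based languages concrete, here is a minimal sketch that computes the same mean twice: once with an explicit accumulator loop and once with a small vectorized function. The name my_mean is hypothetical, and the downtime values are an assumption: they are inferred from the deviations printed later in this section (each deviation plus the mean), not quoted from the original data listing.

```r
# downtime values inferred from the deviations printed in this section
# (deviation + mean, where the mean is 576/23); an assumption, not quoted data.
downtime <- c(0, 1, 2, 12, 12, 14, 18, 21, 21, 23, 24, 25,
              29, 28, 30, 30, 30, 33, 36, 44, 45, 47, 51)

# Loop version: initialize an accumulator, then add one element at a time.
total <- 0
for (value in downtime) {
  total <- total + value
}
loop_mean <- total / length(downtime)

# Vectorized version wrapped as a reusable function
# (my_mean is a hypothetical name chosen for this sketch).
my_mean <- function(x) {
  sum(x) / length(x)
}

loop_mean           # agrees with the vectorized result on the next line
my_mean(downtime)
```

Both calls should print the value shown above for meandown. The function form can be reused on any numeric vector, which is the sense in which your own programs become functions in an analysis.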

Let us also look at how to calculate the standard deviation of the data in downtime.

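The formula for the standard deviation of $n$ data points stored in a vector $x$, where $\bar{x}$ denotes the mean of the data, is

$$s = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}}$$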

We illustrate step by step how this is calculated for downtime.

First, subtract the mean from each data point.

downtime - meandown
 [1] -25.04347826 -24.04347826 -23.04347826 -13.04347826 -13.04347826
 [6] -11.04347826  -7.04347826  -4.04347826  -4.04347826  -2.04347826
[11]  -1.04347826  -0.04347826   3.95652174   2.95652174   4.95652174
[16]   4.95652174   4.95652174   7.95652174  10.95652174  18.95652174
[21]  19.95652174  21.95652174  25.95652174
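The subtraction above works without a loop because R recycles the single mean value against every element of the vector. A minimal sketch on a toy vector (hypothetical data, chosen so that the mean is a whole number) shows the same behaviour on a scale that is easy to check by hand:

```r
# Toy data, hypothetical and chosen so that the mean is a whole number.
x <- c(2, 4, 9)
m <- sum(x) / length(x)   # 15 / 3 = 5

# The single value m is recycled against every element of x.
deviations <- x - m
deviations                # [1] -3 -1  4

# The deviations from the mean sum to zero.
sum(deviations)           # [1] 0
```

The zero sum is a quick sanity check that the mean was subtracted correctly; with non-integer data it may differ from zero by a tiny rounding error.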

Then, obtain the squares of these differences.

(downtime - meandown)^2
 [1] 6.271758e+02 5.780888e+02 5.310019e+02 1.701323e+02 1.701323e+02
 [6] 1.219584e+02 4.961059e+01 1.634972e+01 1.634972e+01 4.175803e+00
[11] 1.088847e+00 1.890359e-03 1.565406e+01 8.741021e+00 2.456711e+01
[16] 2.456711e+01 2.456711e+01 6.330624e+01 1.200454e+02 3.593497e+02
[21] 3.982628e+02 4.820888e+02 6.737410e+02

Sum the squared differences.

sum((downtime - meandown)^2)
[1] 4480.957

Finally, divide this sum by length(downtime) - 1 and take the square root.

sqrt(sum((downtime - meandown)^2)/(length(downtime) - 1))
[1] 14.27164
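Written out with the numbers from this section, the final step amounts to the following substitution. The intermediate quotient is rounded for display, so the last digits are approximate:

```latex
\[
s = \sqrt{\frac{4480.957}{23 - 1}} \approx \sqrt{203.68} \approx 14.27
\]
```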

You will recall that R has built-in functions to calculate the most commonly used statistical measures. In particular, the mean and the standard deviation can be obtained directly with

mean(downtime)
[1] 25.04348
sd(downtime)
[1] 14.27164
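The step-by-step calculation can also be collected into a function and checked against the built-in. This is a minimal sketch: my_sd is a hypothetical name, and the downtime values are an assumption, inferred from the deviations printed earlier in this section rather than quoted from the original data listing.

```r
# downtime values inferred from the deviations printed in this section
# (deviation + mean, where the mean is 576/23); an assumption, not quoted data.
downtime <- c(0, 1, 2, 12, 12, 14, 18, 21, 21, 23, 24, 25,
              29, 28, 30, 30, 30, 33, 36, 44, 45, 47, 51)

# my_sd is a hypothetical name; it mirrors the four steps in the text.
my_sd <- function(x) {
  deviations <- x - sum(x) / length(x)         # subtract the mean
  sqrt(sum(deviations^2) / (length(x) - 1))    # square, sum, divide by n - 1, root
}

my_sd(downtime)

# all.equal compares numbers up to a small numerical tolerance.
all.equal(my_sd(downtime), sd(downtime))
```

The comparison should report TRUE, since sd() also uses the n - 1 divisor. Note that this sketch returns NaN for a vector of length one, because the divisor is then zero, whereas the built-in returns NA.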

We took you through the calculations to illustrate how easy it is to program in R.
