Business Experiments with R - B. D. McCullough

Software Details


Reproduce the above graphs using the data file WorldBankData.csv.

Below is code for the first graph. To create the next graph, you will have to create a new variable, the natural logarithm of newspapersper1000.

df <- read.csv("WorldBankData.csv")  # "df" is the data frame.

# First graph: life expectancy against newspapers per 1000,
# with a least-squares line (dashed) and a lowess smooth (solid).
plot(df$newspapersper1000, df$lifeexp, xlab="Newspapers per1000",
     ylab="Life Expectancy", pch=19, cex.axis=1.5, cex.lab=1.15)
abline(lm(lifeexp ~ newspapersper1000, data=df), lty=2)
lines(lowess(df$newspapersper1000, df$lifeexp))

# Second graph: the same plot with newspapers per 1000 on the natural log scale.
plot(log(df$newspapersper1000), df$lifeexp, xlab="log(Newspapers per1000)",
     ylab="Life Expectancy", pch=19, cex.axis=1.5, cex.lab=1.15)
abline(lm(lifeexp ~ log(newspapersper1000), data=df), lty=2)
lines(lowess(log(df$newspapersper1000), df$lifeexp))
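The instructions mention creating a new variable for the logarithm, while the listing calls log() inline. The following is a minimal sketch of the explicit version, assuming the column names newspapersper1000 and lifeexp used in the listing. The helper add_log_newspapers and the column name lognews are hypothetical names chosen for illustration, and treating non-positive counts as missing is an assumption, not something the book specifies.

```r
# Hypothetical helper: add a natural-log column to the data frame.
add_log_newspapers <- function(df) {
  x <- df$newspapersper1000
  # log() in R is the natural logarithm. It is undefined for zero or negative
  # values, so those rows (and missing values) are left as NA (an assumption).
  ok <- !is.na(x) & x > 0
  lognews <- rep(NA_real_, length(x))
  lognews[ok] <- log(x[ok])
  df$lognews <- lognews
  df
}

# Use the real file only if it is present in the working directory.
if (file.exists("WorldBankData.csv")) {
  df <- add_log_newspapers(read.csv("WorldBankData.csv"))
  plot(df$lognews, df$lifeexp, xlab = "log(Newspapers per 1000)",
       ylab = "Life Expectancy", pch = 19)
}
```

Storing the logarithm as a column keeps later calls short, for example lm(lifeexp ~ lognews, data = df), at the cost of one extra variable in the data frame.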

To analyze these data, we can run a regression of life expectancy (LE) in years against the natural logarithm of the number of newspapers per 1000 persons (LN) for a large number of countries in a given year. The results are

(1.1)

where standard errors are in parentheses. Both coefficients have very high t-statistics and are statistically significant, which means that there is a statistical relationship between life expectancy and the number of newspapers per 1000 people. But does this show that a country having more newspapers leads to longer lives for its citizens? Common sense says probably not. The natural logarithm of the number of newspapers is probably a proxy for other variables that drive life expectancy; countries that can afford newspapers can probably also afford better food, housing, and medical services. What we are observing is most likely a mere correlation, and this sort of observational analysis should not be interpreted as causal.
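The proxy argument can be illustrated with a small simulation. The sketch below uses synthetic data, not the World Bank file: log_income is a hypothetical confounder that drives both newspapers and life expectancy, and newspapers are given no direct effect at all. All variable names and numeric settings are assumptions chosen for illustration.

```r
set.seed(1)
n <- 200

# Synthetic data (NOT the World Bank file). Income is a hypothetical confounder.
log_income <- rnorm(n, mean = 8, sd = 1)
log_news   <- -6 + 1.0 * log_income + rnorm(n, sd = 0.5)  # richer -> more newspapers
lifeexp    <- 20 + 6.0 * log_income + rnorm(n, sd = 3)    # newspapers play no role
sim <- data.frame(lifeexp, log_news, log_income)

# Naive regression: life expectancy on log newspapers only.
naive <- lm(lifeexp ~ log_news, data = sim)
# Adjusted regression: also control for the confounder.
adjusted <- lm(lifeexp ~ log_news + log_income, data = sim)

# Each row of the coefficient table holds the estimate, its standard error,
# and the t-statistic (estimate divided by standard error).
naive_row    <- coef(summary(naive))["log_news", ]
adjusted_row <- coef(summary(adjusted))["log_news", ]

print(round(rbind(naive = naive_row, adjusted = adjusted_row), 3))
```

In the naive fit the coefficient on log_news is large with a very high t-statistic, even though newspapers have no effect in the simulated world. Once log_income enters the regression, the coefficient on log_news should shrink toward zero. With real observational data the confounders are rarely all measured, which is why a regression like this one cannot settle the causal question.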
