Review Exercises

2.1. Distinguish between a density and an empirical distribution. How are they different? How are they similar?

2.2. Consider the univariate normal density: f(x) = [1/(σ√(2π))] exp[−(x − μ)²/(2σ²)]. Show that for a standard normal distribution, the above becomes f(x) = [1/√(2π)] exp(−x²/2).

2.3. Explain the nature of a z-score. Why is it also called a standardized score?

2.4. Using R, compute the probability of observing a standardized score of 1.0 or greater. What is then the probability of observing a score less than 1.0 from such a distribution?
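One way to obtain both probabilities in R is with pnorm(), which by default returns the cumulative probability of the standard normal distribution; a minimal sketch:

# P(z >= 1.0): upper-tail probability under the standard normal
1 - pnorm(1.0)                        # approximately 0.159
pnorm(1.0, lower.tail = FALSE)        # equivalent upper-tail call

# P(z < 1.0): lower-tail probability
pnorm(1.0)                            # approximately 0.841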

2.5. Think up a research example in which the binomial distribution would be useful in evaluating a null hypothesis.

2.6. Rafael Nadal, a professional tennis player, as of 2020 had won the French Open tennis championship a total of 13 times in the past 16 tournaments. If we set the probability of him winning each time at 0.5, determine the probability of winning 13 times out of 16. Make a statistical argument that Nadal is an exceptional tennis player at the French Open. What if we set the probability of a win at 0.1? Does this make Nadal's achievements less or more impressive? Why? Explain.
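As a computational hint, the binomial probabilities in this exercise can be obtained in R with dbinom() and pbinom(), under the assumption of independent tournaments with a constant win probability; a brief sketch:

# exactly 13 wins in 16 tournaments when p = 0.5
dbinom(13, size = 16, prob = 0.5)                        # about 0.0085

# 13 or more wins, useful for the statistical argument
pbinom(12, size = 16, prob = 0.5, lower.tail = FALSE)    # about 0.011

# the same quantities when the win probability is set to 0.1
dbinom(13, size = 16, prob = 0.1)
pbinom(12, size = 16, prob = 0.1, lower.tail = FALSE)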

2.7. Give an example using the binomial distribution in which the null hypothesis would not be rejected even if observing 9 out of 10 heads on flips of a coin.

2.8. On five flips of a fair coin, what is the probability of observing 0 heads or 5 heads? How did you arrive at this probability, and which rules of probability did you use in your computation?

2.9. Discuss what a limiting form of a distribution means, and how the limiting form of the binomial distribution is that of the normal distribution.

2.10. Consider the multivariate density: f(x) = [1/((2π)^(p/2) |Σ|^(1/2))] exp[−(1/2)(x − μ)′Σ⁻¹(x − μ)]. All else constant, what effect does an increasing value of the determinant |Σ| have on the density, and how does this translate when using real variables?

2.11. What is meant by the expectation of a random variable?

2.12. Compare these two products, and explain how and why they are different from one another when taking expectations: yi p(yi) versus yi p(yi) dy.

2.13. Why is it reasonable that the arithmetic mean is the center of gravity of a distribution?

2.14. What is an unbiased estimator of a population mean vector?

2.15. Discuss what it means to say that E(S²) ≠ σ², and the implications of this. What is E(S²) equal to?

2.16. Even though E(S²) ≠ σ², how can it be true nonetheless that E(S²) → σ² as n → ∞? Explain.

2.17. Explain why the following form of the sample variance is considered to be an unbiased estimator of the population variance: s² = Σ(yi − ȳ)²/(n − 1).

2.18. Draw a distribution that is positively skewed. Now draw one that is negatively skewed.

2.19. Compare and contrast the covariance of a random variable: cov(xi, yi) = σxy = E[(xi − μx)(yi − μy)] with that of the sample covariance: cov(xi, yi) = sxy = Σ(xi − x̄)(yi − ȳ)/(n − 1). How are they similar? How are they different? What in their definitions makes them different from one another?

2.20. What effect (if any) does increasing sample size n have on the magnitude of the covariance? If it does not have any effect, explain why it does not.

2.21. Explain or show how the variance of a variable can be conceptualized as the covariance of a variable with itself.
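A quick numerical check in R (the variable x below is arbitrary simulated data, used only for illustration):

set.seed(123)                      # arbitrary seed for reproducibility
x <- rnorm(50, mean = 10, sd = 2)  # simulate 50 observations
var(x)                             # sample variance of x
cov(x, x)                          # covariance of x with itself; same value as var(x)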

2.22. Cite three reasons why the covariance is not a pure or dimensionless measure of relationship between two variables.

2.23. Why is Pearson r not suitable for measuring relationships that are nonlinear? What is an alternative coefficient (one of many) that may be computed that is more appropriate for relationships that are nonlinear?

2.24. What does it mean to say the relationship between two variables is monotonically increasing?

2.25. What does a correlation matrix have along its main diagonal that a covariance matrix does not? What is along the main diagonal of a covariance matrix?

2.26. Define, in general, what it means to measure something.

2.27. Explain why it is that something measurable at the ratio level of measurement is also measurable at the interval, ordinal, and nominal levels as well.

2.28. Is something such as intelligence measurable on a ratio scale? Why or why not?

2.29. Distinguish between a mathematical variable and a random variable.

2.30. Distinguish between an estimator and an estimate.

2.31. Define what is meant by an interval estimator.

2.32. Define what is meant by the consistency of an estimator and what n → ∞ means in this context.

2.33. Compare the concepts of efficiency versus sufficiency with regard to estimators. How are they different?

2.34. The sampling distribution of the mean is an idealized distribution. However, discuss how one would generate the sampling distribution of the mean empirically.
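One way to do this empirically is by repeated sampling from a parent population; a small R simulation sketch, where the population, sample size, and number of replications are arbitrary choices for illustration:

set.seed(1)
population <- rnorm(100000, mean = 50, sd = 10)      # a hypothetical population
sample.means <- replicate(10000,
                          mean(sample(population, size = 25)))
hist(sample.means)      # empirical sampling distribution of the mean
mean(sample.means)      # close to the population mean of 50
sd(sample.means)        # close to sigma / sqrt(n) = 10 / sqrt(25) = 2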

2.35. Discuss why for a higher level of confidence, all else equal, a confidence interval widens rather than narrows.

2.36. Define what is meant by a maximum-likelihood estimator.

2.37. Discuss the behavior of the t distribution for increasing degrees of freedom. What is the limiting form of the t distribution?

2.38. In a research setting, under what condition(s) is a t-test usually preferred over a z-test?

2.39. Verbally interpret the nature of pooling in the independent-samples t-test. Under what condition(s) do we pool variances? Under what condition(s) should we not pool?

2.40. Discuss why an estimate of effect size is required for estimating power.

2.41. Using R, estimate required sample size for detecting a population correlation coefficient of 0.30 at a significance level of 0.01, with power equal to 0.80.
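A sketch of one way to do this in R, assuming the pwr package is installed; pwr.r.test() solves for whichever argument is left unspecified, here n:

library(pwr)
pwr.r.test(r = 0.30, sig.level = 0.01, power = 0.80)
# the returned n (roughly 125) is the required sample size; round up to a whole number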

2.42. Repeat exercise 2.41, this time using G*Power.

2.43. Using R, estimate power for an independent-samples t-test for a sample size of 100 per group and Cohen's d equal to 0.20.
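A brief R sketch, again assuming the pwr package and a two-tailed test at a significance level of 0.05; leaving power unspecified tells pwr.t.test() to solve for it:

library(pwr)
pwr.t.test(n = 100, d = 0.20, sig.level = 0.05, type = "two.sample")
# the estimated power is roughly 0.29 for these inputs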

2.44. For a value of r² = 0.70, compute the corresponding value for d.
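As a reminder of the conversion, d can be recovered from r through d = 2r / sqrt(1 − r²); a short arithmetic sketch in R, taking the positive square root of r²:

r2 <- 0.70
r <- sqrt(r2)                 # about 0.837
d <- 2 * r / sqrt(1 - r^2)    # Cohen's conversion from r to d
d                             # approximately 3.06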

2.45. Discuss how the paired-samples t-test can be considered a special case of the wider and more general blocking design.

2.46. Define what is meant by a linear combination.

2.47. Define and describe each term in the multivariate general linear model Y = XB + E.

2.48. Discuss the key determinants of the p-value in a significance test.

2.49. A researcher collects a sample of n = 10,000 observations and tells you that with such a large sample size, he is guaranteed to reject the null hypothesis. Explain why the researcher's claim is false.

2.50. A researcher collects a sample of size n = 5, computes zM, and rejects the null hypothesis. Argue on the one hand for why this might be impressive scientifically, then argue why it may not be.

2.51. Consider once more Galton's data on heights (only the first 10 observations are shown):

> library(HistData)
> attach(Galton)
> Galton
   parent child
1    70.5  61.7
2    68.5  61.7
3    65.5  61.7
4    64.5  61.7
5    64.0  61.7
6    67.5  62.2
7    67.5  62.2
8    67.5  62.2
9    66.5  62.2
10   66.5  62.2

(a) Compute a histogram of parent height, as well as an index of skewness and kurtosis. What do your measures of skewness and kurtosis suggest about the distribution?
(b) Transform the distribution of child heights to z-scores. What effect did such a transformation have on the mean and variance of the original distribution? Second, did it change its shape at all? Why or why not?
(c) Compute the covariance between parent height and child height. Does the sign of the covariance suggest a positive or negative relationship?
(d) Standardize the covariance by computing Pearson r. Interpret the obtained correlation coefficient, and test it for statistical significance using either SPSS or R.
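One possible R starting point for parts (a) through (d), assuming the psych package for the skewness and kurtosis indices (any comparable functions would do):

library(HistData)
library(psych)
attach(Galton)

# (a) histogram of parent height plus skewness and kurtosis indices
hist(parent)
skew(parent)
kurtosi(parent)

# (b) transform child heights to z-scores, then inspect their mean and variance
child.z <- scale(child)
mean(child.z)
var(child.z)

# (c) covariance between parent height and child height
cov(parent, child)

# (d) Pearson r with an accompanying significance test
cor.test(parent, child)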

2.52. Consider the following data on whether a student passed or failed a mathematics course (grade = 0 is “failed” and grade = 1 is “passed”), along with that student's study time for the course, in average minutes per day for the duration of the course:

grade  studytime
0      30
0      25
0      59
0      42
0      31
1      140
1      90
1      95
1      170
1      120

Conduct an independent-samples t-test on this data using SPSS and R. Verify that the assumption of homogeneity of variances is met in SPSS.
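A minimal R sketch for the t-test portion (the SPSS steps, including its homogeneity-of-variances check, are not shown); the grouping variable is entered as a factor, and var.equal = TRUE requests the pooled-variance test:

grade <- factor(c(0, 0, 0, 0, 0, 1, 1, 1, 1, 1))
studytime <- c(30, 25, 59, 42, 31, 140, 90, 95, 170, 120)

# an F test of equal variances as a rough check of the homogeneity assumption
var.test(studytime ~ grade)

# independent-samples t-test with pooled variances
t.test(studytime ~ grade, var.equal = TRUE)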

2.53. A researcher is interested in conducting a two-sample t-test between a treatment group and a control group. The researcher anticipates an effect size of approximately d = 1.5 and wishes to test the null hypothesis μ1 = μ2 at a significance level of 0.05. Estimate required sample size assuming the researcher wishes to attain power of at least 0.90 for her test of the null hypothesis.
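A sketch of one way to estimate the required sample size in R, assuming the pwr package and a two-tailed test:

library(pwr)
pwr.t.test(d = 1.5, sig.level = 0.05, power = 0.90, type = "two.sample")
# the returned n is the required sample size per group; round up to a whole number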
