Statistical Knowledge vs. Software Knowledge

Having now taught at both the undergraduate and graduate levels for the better part of fifteen years to applied students in the social and sometimes natural sciences, to the delight of my students (sarcasm), I have opened each course with a lecture of sorts on the difference between statistical knowledge and software knowledge. Very little of the warning is grasped, I imagine, though the real-life experience of the warning usually surfaces later in their graduate careers (such as at thesis or dissertation defenses, where they may fail to understand their own software output). I will repeat some of that sermon here. While this distinction has always been important historically, it has perhaps never been more important than in the present day, given the influx of computing power available to virtually every student in the sciences and related areas, and the relative ease with which such computing power can be implemented. Allowing a new teen driver to drive a Dodge Hellcat with upward of 700 horsepower would be unwise, yet newcomers to statistics and science have, from their first day, access to the equivalent in computing power. The statistician is shaking his or her head in disapproval, for good reason. We live in an age where data analysis is available to virtually anybody with a laptop and a few lines of code. The code can often be dug up online in a matter of seconds, even with very little software knowledge. And of course, with many software programs coding is not even a requirement, as windows and GUIs (graphical user interfaces) have become so easy to use that one can obtain an analysis in seconds or even milliseconds. Though this has its advantages, it is not necessarily a good thing.

On the one hand, it does allow the student of applied science to “attempt” to conduct his or her data analyses. Yet on the other, as the adage goes, a little knowledge can be a dangerous thing. Being a student of the history of statistics, I can tell you that before computers were widely available, conducting statistical analyses was possible only for those who could drudge through the computations by hand to generate their “output” (which of course took the form of paper-and-pencil summaries, not the software output we have today). These computations took hours upon hours to perform, and hence, if one were going to do a statistical analysis, one did not embark on such an endeavor lightly. That does not mean the final solution would necessarily be valid, but rather that folks may have been more likely to give serious thought to their analyses before conducting them. Today, a student can run a MANOVA in literally 5 minutes using software, but, unfortunately, this does not imply the student will understand what they have done or why they have done it. Random assignment to conditions may never have been performed, yet in the haste to implement the software routine, the student failed to understand or appreciate how limiting their output would be. Concepts of experimental design get lost in the haste to produce computer output; the student of the “modern age” of computing somehow “missed” this step in his or her quickness to, as it were, perform “advanced statistics.” Further, the result is “statistically significant,” yet the student has no idea what Wilks’s lambda is or how it is computed, nor is the difference between statistical significance and effect size understood. The limitations of what the student has produced are not appreciated, and faulty substantive (and often philosophically illogical) conclusions follow. I kid you not, I have been told by a new student that the only problem with the world is a lack of computing power; once computing power increases, experimental design will be a thing of the past, or so the student believed. Some incoming students enter my class with such perceptions, failing to realize that discovering a cure for COVID-19, for instance, is not a computer issue. It is a scientific one. Computers help, but they do not on their own resolve scientific issues. Instructors faced with these initial misconceptions have a tough row to hoe, especially when pressing on their students fundamental linear algebra in the first two weeks of the course rather than computer code and statistical recipes.
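
To make the ease concrete: a complete MANOVA, Wilks’s lambda included, takes only a few lines of Python. The sketch below is one way to do it, using statsmodels with entirely simulated data and hypothetical variable names; that the code runs in seconds says nothing about whether the design warrants the analysis or what the output means.

```python
# A minimal sketch of how quickly a MANOVA can be produced in Python.
# Data are simulated and variable names (group, y1, y2) are hypothetical.
import numpy as np
import pandas as pd
from statsmodels.multivariate.manova import MANOVA

rng = np.random.default_rng(0)
df = pd.DataFrame({
    "group": np.repeat(["a", "b", "c"], 20),  # grouping factor, 3 levels
    "y1": rng.normal(size=60),                # first response variable
    "y2": rng.normal(size=60),                # second response variable
})

# Two responses modeled on one factor; the printed table reports
# Wilks' lambda alongside the other multivariate test statistics.
fit = MANOVA.from_formula("y1 + y2 ~ group", data=df)
print(fit.mv_test())
```

The table appears instantly; understanding what Wilks’s lambda is, whether the design justified the model, and what may be concluded does not.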

The problem, succinctly put, is that in many sciences, and contrary to the opinion you might expect from someone writing a data analysis text, students learn too much about how to obtain output at the expense of understanding what the output means or the process that matters in drawing proper scientific conclusions from that output. Sadly, in many disciplines, a course in “Statistics” would more appropriately, and unfortunately, be called “How to Obtain Software Output,” because that is pretty much all the course teaches students to do. How did statistics education in applied fields become so watered down? Since when did cultivating the art of analytical or quantitative thinking not matter? Faculty who teach such courses in such a superficial style should know better and instead teach courses with a lot more “statistical thinking” rather than simply generating software output. Among students (who should not necessarily know better – that is what makes them students), there often exists the illusion that simply because one can obtain output for a multiple regression, this somehow implies a multiple regression was performed correctly, in line with the researcher’s scientific aims. Do you know how to conduct a multiple regression? “Yes, I know how to do it in software.” That is not a correct answer to knowing how to conduct a multiple regression! One need not even understand what multiple regression is to “compute one” in software. As a consultant, I have had a client or two from very prestigious universities email me a pile of software output and ask, “Did I do this right?” assuming I could evaluate their code and output without first knowing their scientific goals and aims. “Were the statistics done correctly?” Without an understanding of what they intended to do or the goals of their research, such a question is not only figuratively but literally impossible to answer, aside from assuring them that the software has a strong reputation for accuracy in number-crunching.
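
The same holds for multiple regression: the full output table is a few lines away. A minimal sketch, again with simulated data and hypothetical variable names (statsmodels assumed), to underline that obtaining the table is not at all the same thing as conducting a regression correctly:

```python
# Obtaining multiple regression output is trivially easy; producing
# this table is not evidence the regression was appropriate.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(1)
df = pd.DataFrame({
    "x1": rng.normal(size=100),  # hypothetical predictor
    "x2": rng.normal(size=100),  # hypothetical predictor
})
df["y"] = 2.0 + 0.5 * df["x1"] - 0.3 * df["x2"] + rng.normal(size=100)

model = smf.ols("y ~ x1 + x2", data=df).fit()  # ordinary least squares
print(model.summary())                         # full output in seconds
```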

This overemphasis on computation, software or otherwise, is not right; it is a real problem, and it is responsible for many misuses and abuses of applied statistics in virtually every field of endeavor. It is especially poignant in the social sciences, however, because the objects on which the statistics are computed are often statistical or psychometric entities themselves, which makes understanding how statistical modeling works even more vital to understanding what can vs. cannot be concluded from a given statistical analysis. Though these problems are also present in fields such as biology, they are less poignant there, since the reality of the objects in those fields is usually more agreed upon. To be blunt, a t-test on whether a COVID-19 vaccine works or not is not too philosophically challenging. Finding the vaccine is difficult science to be sure, but analyzing the results statistically usually does not require advanced statistics. However, a regression analysis on whether social distancing is a contributing factor to depression rates during the COVID-19 pandemic is not nearly as easy on a methodological level. One is so-called “hard science” on real objects; the other might just end up being a statistical artifact. This is why social science students, especially those conducting non-experimental research, need rather deep philosophical and methodological training, so they do not read “too much” into a statistical result – things the physical scientist may never have had to confront given the nature of his or her objects of study. Establishing scientific evidence and supporting a scientific claim in many social (and even natural) sciences is exceedingly difficult, despite the myriad journals accepting for publication a wide variety of incorrect scientific claims presumably supported by bloated statistical analyses. Just look at the methodological debates that surrounded COVID-19, which concerns an object that is relatively “easy” philosophically! Step away from concrete science, throw in advanced statistical technology and complexity, and you enter a world where establishing evidence is philosophical quicksand. Many students who use statistical methods fall into these pits without even knowing it, and it is the instructor’s responsibility to keep them grounded in what a statistical method can vs. cannot do. I have told students countless times, “No, the statistical method cannot tell you that; it can only tell you this.”
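
Indeed, the two-group comparison alluded to above is computationally trivial. A minimal sketch using scipy, with simulated (not real trial) data, just to show how little statistical machinery the “hard science” case demands:

```python
# A two-group t-test of the kind alluded to above; the philosophical
# difficulty lies elsewhere. Data here are simulated, not trial results.
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
treatment = rng.normal(loc=1.0, scale=1.0, size=50)  # hypothetical outcomes
placebo = rng.normal(loc=0.0, scale=1.0, size=50)    # hypothetical outcomes

t_stat, p_value = stats.ttest_ind(treatment, placebo)
print(f"t = {t_stat:.2f}, p = {p_value:.4f}")
```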

Hence, the student of the empirical sciences needs to be acutely aware of, and appreciate, the deeper issues involved in conducting his or her own science. This implies a heavier emphasis not on how to conduct a billion different statistical analyses, but on understanding the issues with conducting the “basic” analyses they are performing. It is a matter of fact that many students who fill their theses or dissertations with applied statistics may nonetheless fail to appreciate that very little of scientific usefulness has been achieved. What has too often been achieved is a blatant abuse of statistics masquerading as scientific advancement. The student “bootstrapped standard errors” (Wow! Impressive!), but in the midst of a dissertation that is scientifically unsound, or at a minimum very weak on a methodological level.
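
For the record, even the “impressive” bootstrapped standard error is mechanically trivial. A minimal sketch with simulated data, purely to show how little the computation itself demands of the analyst:

```python
# A nonparametric bootstrap for the standard error of a sample mean.
# The sample is simulated; the point is how mechanical the procedure is.
import numpy as np

rng = np.random.default_rng(3)
sample = rng.normal(loc=0.0, scale=1.0, size=40)  # hypothetical sample

# Resample with replacement many times and compute the statistic each time.
boot_means = np.array([
    rng.choice(sample, size=sample.size, replace=True).mean()
    for _ in range(5000)
])

# The standard deviation of the bootstrap distribution is the bootstrapped SE.
print(f"bootstrapped SE of the mean: {boot_means.std(ddof=1):.4f}")
```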

A perfect example of how statistical analyses can be abused is the so-called “mediation” analysis (you might infer from the quotation marks that I am generally not a fan, and for a very good reason, I may add). In lightning speed, a student or researcher can regress Y on X, introduce Z as a mediator, and, if the result is statistically significant, draw the conclusion that “Z mediates the relationship between Y and X.” That is fine, so long as it is clearly understood that what has been established is statistical mediation (Baron and Kenny, 1986), and not necessarily anything more. To say that Z mediates Y and X in a real substantive sense requires, of course, much more knowledge of the variables and/or of the research context or design. It first and foremost requires defining what one means by “mediation” in the first place. Simply because one computes statistical mediation does not, in any way whatsoever, justify drawing the conclusion that “X goes through Z on its way to Y,” or anything even remotely similar. Crazy talk! Of course, understanding this limitation should be obvious, right? Not so for many who conduct such analyses. What would such a conclusion even mean? In most cases, with most variables, it simply does not make sense, regardless of how much statistical mediation is established. Again, this should be blatantly obvious; however, many students (and researchers) are unaware of it, failing to realize or appreciate that a statistical model cannot, by itself, impart a “process” onto variables. All a statistical model can typically do, by itself, is partition variability and estimate parameters. Fiedler et al. (2011) summarized the rather obvious fact that without the validity of prior assumptions, statistical mediation is simply, and merely, variance partitioning. Fisher, inventor of ANOVA (analysis of variance), already warned us of this when he said of his own novel (at the time) method that ANOVA was merely a way of “arranging the arithmetic.” Whether or not that arrangement is meaningful has to come from the scientist and from a deep consideration of the objects on which the arrangement is being performed. This idea, that the science matters more than the statistics applied to it, is at risk of being lost, especially in the social sciences, where statistical models regularly “run the show” (at least in some fields) due to the difficulty, in many cases, of operationalizing or controlling the objects of study.
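
To see just how quickly “statistical mediation” can be manufactured, here is a minimal sketch of the Baron and Kenny (1986) regression steps in Python (statsmodels assumed; the data are simulated, and X, Z, and Y are placeholders carrying no substantive meaning):

```python
# The Baron and Kenny (1986) regression steps, which establish
# *statistical* mediation only. Data simulated; names are placeholders.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(4)
n = 200
df = pd.DataFrame({"X": rng.normal(size=n)})
df["Z"] = 0.6 * df["X"] + rng.normal(size=n)            # candidate mediator
df["Y"] = 0.4 * df["Z"] + 0.1 * df["X"] + rng.normal(size=n)

total = smf.ols("Y ~ X", data=df).fit()       # step 1: total effect of X on Y
med = smf.ols("Z ~ X", data=df).fit()         # step 2: X predicts Z
direct = smf.ols("Y ~ X + Z", data=df).fit()  # step 3: X with Z partialed out

print(f"total effect of X:  {total.params['X']:.3f}")
print(f"direct effect of X: {direct.params['X']:.3f}")
print(f"path X -> Z:        {med.params['X']:.3f}")
# A drop from the total to the direct effect is "statistical mediation."
# Nothing in these three fits imparts a process onto X, Z, and Y;
# the code merely partitions variance.
```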

Returning to our mediation example: if the context of the research problem lends itself to a physical or substantive definition of mediation or some other physical process, such that there is good reason to believe Z is truly, substantively, “mediating,” then the statistical model can be used to establish support for this already-presumed relation, in the same way a statistical model can be used in regression to quantify the generational transmission of physical qualities from parent to child. The process itself, however, is not due to the fitting of a statistical model. Never in the history of science or statistics has a statistical model generated a process; at most, it has described one. Many students, however, excited to have bootstrapped those standard errors in their model and all the rest of it, are apt to draw substantive conclusions from a statistical model that simply do not hold water. In such cases, one is better off not running a statistical model at all than using it to draw inane, philosophically egregious conclusions, ones that could usually be easily corrected in any introductory philosophy of science or research methodology course. Abusing and overusing statistics does little to advance science. It simply provides a cloak of complexity.

So, what is the conclusion and recommendation from what might appear to be a very cynical discussion introducing this book? Understanding the science and the statistics must come first. Understanding what can vs. cannot be concluded from a statistical result is the “hard part,” not computing something in Python, at least not at our level of computation (at more advanced levels, of course, computing can be exceptionally difficult, as evidenced by the necessity of advanced computer science degrees). Python code can always be looked up for applied science purposes, but “statistical understanding” cannot, at least not so easily. Before embarking on either a statistics course or a computation course, students are strongly encouraged to take a rigorous research design course, as well as a philosophy of science course, so they might better appreciate the limitations of their “claims to evidence” in their projects. Otherwise, statistics, and the computers that compute them, can be just as easily misused and abused as used correctly, and sadly, often are. Instructors and supervisors also need to better educate students against the reckless fitting of statistical models and the computing of inordinate amounts of statistics without careful guidance on what can vs. cannot be interpreted from such numerical measures. Design first, statistics second.

Statistical knowledge is not equivalent to software knowledge. One can become a proficient expert in Python, for instance, yet still not possess the scientific expertise or experience to successfully interpret output from data analyses. The difficult part is not generating the analyses (that can always be looked up); the most important thing is to interpret analyses correctly in relation to the empirical objects under investigation, and in most cases this involves recognizing the limitations of what can vs. cannot be concluded from the data analysis.
