Interpreting and Using Statistics in Psychological Research - Andrew N. Christopher

Representativeness heuristic


The availability heuristic is one barrier to using statistical information. There is another heuristic we use that also makes it difficult to use statistical information. To start this discussion, let me tell you about my cousin Adam. When he was a toddler, he lived in a house with two dogs. So when Adam saw any four-legged, furry creature, he called it “doggie.” One day when we were at a petting zoo, he saw what I, as a teenager, knew was a horse. But to him, it was a “big doggie.” And indeed, dogs and horses do share some outward similarities (e.g., four legs, fur, and a tail). One difference between dogs and horses is that whereas dogs bark, horses neigh.

So when this “big doggie” neighed, Adam looked most perplexed. That sound did not fit in his mental notion of “dog.” He was forced to change his mental picture of what a dog was, and in addition, he needed to create a new, distinct mental category for this creature he had encountered called a “horse.” In this example, Adam was using the representativeness heuristic (Gilovich & Savitsky, 2002). That is, he had created a mental category of “dog” that included all animals with four legs, fur, and a tail.

Representativeness heuristic: judging how similar something or someone is to the typical instance of a mental category that we hold; can lead us to ignore other relevant information.

So what problem does representativeness create in our thinking? Those mental categories had to come from somewhere, and indeed, they are often correct or else we would stop using them. In Adam’s situation, the sound that the horse made forced him to redefine his mental category of “dog.” This may not be too difficult to do, at least in theory. But remember, I want you to be aware of when our thinking goes awry and how such missteps are rooted in statistical thinking.


Photo 1.7 They look similar, don’t they?

Source: ©iStockphoto.com/GlobalP; ©iStockphoto.com/fotojagodka

There are two potentially problematic results of using representativeness that we will discuss. First, the base-rate fallacy is the tendency to ignore information that describes (i.e., represents) most people or situations. Rather, we rely on information that fits a mental category we have formed (Bar-Hillel, 1980). To take a simple example, approximately 90% of the students at my college are from Michigan, Indiana, and Ohio. At first-year orientation this fall, I talked with a tall, athletic-looking, suntanned student who had long blond hair and was wearing a Ron Jon Surf Shop® (Cocoa Beach, FL) T-shirt. Where was he from? California? Florida? Perhaps. But without knowing any additional information, you have a 90% chance of being correct (assuming you say “Michigan,” “Indiana,” or “Ohio”). Even though the description seems to fit someone from California or Florida, those states are sparsely represented in our student body. Thus, there was minimal chance he was from one of those places despite fitting our mental category of “Californian” or “Floridian.” Let’s explore the base-rate fallacy in a little more detail.

Base-rate fallacy: tendency to prefer information derived from one’s experience and ignore information that is representative of most people or situations.

When we started this chapter, I lamented that we as humans often have difficulty thinking statistically. Again, 90% of the student population at my college is from three states. Therefore, the probability of a student being from any of the other 47 states or another country is low. That probability is even lower for any one specific state of those 47. However, in this instance, the only thing I “saw” was that one student I talked with at orientation. He was a sample of the entire student body at my college. The entire student body at my college consists of people primarily from three states. Therefore, even though he fit my mental representation of “Californian” or “Floridian,” the odds are that he was from Michigan, Indiana, or Ohio.
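To see how much weight the base rate deserves, here is a minimal sketch using Bayes' rule. Only the 90% figure comes from my college. The two “surfer look” rates are numbers I made up purely for illustration, and the function name is hypothetical as well.

```python
def posterior_home_region(base_rate, p_look_given_home, p_look_given_elsewhere):
    """Bayes' rule: probability a student is from the three home states,
    given that the student has the "surfer look"."""
    home = base_rate * p_look_given_home
    elsewhere = (1 - base_rate) * p_look_given_elsewhere
    return home / (home + elsewhere)

# 0.90 is the base rate from the text.
# 0.05 and 0.30 are invented illustration values: they assume the surfer look
# is six times more common among students from outside the three states.
posterior = posterior_home_region(0.90, 0.05, 0.30)
print(f"P(Michigan, Indiana, or Ohio | surfer look) = {posterior:.2f}")
```

Under these assumed numbers, the student is still more likely than not to be from Michigan, Indiana, or Ohio, even though his appearance fits the other category far better. Different made-up rates would give a different answer, but the base rate always has to enter the calculation.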

One danger, in terms of statistical thinking, of our everyday experiences is that rarely, if ever, do we have all of the information about a given situation (we are egocentric, remember). Much as we can rarely, if ever, be familiar with everyone in a large group of people, we rely on our personal experiences to draw conclusions about the world. An extension of the base-rate fallacy is the law of small numbers, which is the second potential problem with using representativeness. The law of small numbers holds that results based on a small number of observations are less likely to be accurate than are results based on a larger number of observations (Asparouhova, Hertzel, & Lemmon, 2009; Taleb, 2004). We assume that our experiences are representative of the larger world around us when, in fact, that is not always the case. For instance, when you toss a coin, there is a 50% chance the coin will land on heads and a 50% chance it will land on tails. If you flip that coin four times, you would expect it to land on heads twice and on tails twice. That would be 50/50, just the way coin tosses should turn out. However, with only four flips of the coin, weird things might happen. You might flip three tails but only one heads. Or maybe all four flips will be heads. Does this mean the coin is “fixed”? No. Rather, with such a small number of flips (i.e., a small sample), you might get outcomes that are markedly different from what you would expect to find (i.e., all coin flips in history). Flip a coin 20 times. I bet you will not get exactly 10 heads and 10 tails, but overall, it should be closer to 50/50 than 75/25. Now flip the coin 40 times, and again, you are likely to be closer to 50/50 than you were with 20 flips.

Law of small numbers: results based on small amounts of data are likely to be a fluke and not representative of the true state of affairs in the world.
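If you would rather not flip a coin thousands of times, a short simulation can do it for you. The sketch below is only an illustration: the function name, the number of repeats, and the fixed seed are my own choices, and it assumes a perfectly fair coin. For 4, 20, and 40 flips, it repeats the experiment many times and reports how far the share of heads lands from 50% on average.

```python
import random

def mean_gap_from_half(n_flips, n_repeats=10_000, seed=0):
    """Average distance between the observed share of heads and 0.5
    across many repeated experiments of n_flips fair coin flips."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    total_gap = 0.0
    for _ in range(n_repeats):
        heads = sum(rng.random() < 0.5 for _ in range(n_flips))
        total_gap += abs(heads / n_flips - 0.5)
    return total_gap / n_repeats

gaps = {n: mean_gap_from_half(n) for n in (4, 20, 40)}
for n, gap in gaps.items():
    print(f"{n:>2} flips: share of heads is off from 50% by about {gap:.3f} on average")
```

The average gap shrinks as the number of flips grows, which is the law of small numbers seen from the other side: small samples wander further from the true state of affairs.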

Let’s take another, more mundane example of the law of small numbers. You are thinking about where to go for dinner tonight. Your roommate said a friend of hers really liked the local pizza place. Based on this information, you decide to have dinner at the local pizza place. How is this instance an example of the law of small numbers? Let me ask you, how much information did you gather to make your decision? You have a suggestion from one friend of your roommate; that is all. So, with one piece of data, you drew your conclusion of where to eat dinner. Let’s hope your food preferences are similar to those of your roommate’s friend. Had you read reviews of this restaurant, you would have had more data on which to base your decision of where to eat.

As one real-world example of the law of small numbers, many people are afraid to invest their money in stocks because they think bonds are a safer investment. However, a great deal of research (e.g., Index Fund Advisors, 2014) has demonstrated that over the long term, investing in stocks is the best way to grow one’s money. Since 1928, the U.S. stock market’s average annual return has been about 9.6%. During that same time span, U.S. government long-term bonds have grown on average by only about 5.4% each year. So, when deciding where to invest our money, clearly it should go into the stock market, right? Maybe. Keep in mind that 1928 was a long, long time ago. Over a long period of time, then yes, the stock market has indeed been the best investment available. However, you know enough about history to know what happened in October 1929, again in October 1987, and in the fall of 2008. There are comparatively small pockets of time during which stocks do poorly, sometimes disastrously so. These time periods are the exceptions, but if it is your money being lost when stocks decline in value, you probably will not take comfort in this knowledge. Therefore, money you need in the near future probably should not be invested too heavily in stocks because during brief (i.e., small) periods of time, the value of stocks can decline, sometimes precipitously. However, money you do not need for a longer period of time probably is better invested in stocks than in bonds.
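A little arithmetic shows why the length of time matters so much here. The sketch below compounds the two average annual figures quoted above. It assumes, unrealistically, that each investment earns exactly its average every single year; the $1,000 starting amount and the function name are hypothetical.

```python
def grow(start, annual_rate, years):
    """Value of `start` after compounding at a constant annual rate."""
    return start * (1 + annual_rate) ** years

STOCKS, BONDS = 0.096, 0.054  # average annual figures quoted in the text

for years in (1, 10, 30):
    stocks = grow(1_000, STOCKS, years)
    bonds = grow(1_000, BONDS, years)
    print(f"{years:>2} years: stocks ${stocks:,.0f}, bonds ${bonds:,.0f}")
```

After one year the two amounts are close; over decades the gap becomes large. Keep in mind that a constant rate hides exactly the bad years discussed above, so a single real year (a very small sample) can look nothing like this smooth average.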

Now that we have learned some reasons why we tend not to incorporate statistical information into our thinking, let’s distinguish among three concepts that we’ve already implicitly touched on and that are foundational for statistical thinking: a population, a variable, and a sample. These are not complicated distinctions, but they are critical in this course and in being better consumers of statistical information. A population is the entire group of people you want to draw conclusions about. In our example of stock market and bond market gains, the population would be every year since 1928. In our example of where to eat dinner tonight, the population would be every person who’s eaten at that restaurant. All members of a population must have some characteristic in common. In research, such a characteristic is called a variable. A variable is a quality that has different values or changes in the population. For instance, qualities such as height, age, personality, happiness, and intelligence each differ among people; hence, each is a variable. Variables can also be environmental features, such as classroom wall color or investment returns.


Photo 1.8 Stocks generally rise in value. The highlighted pockets of time note the exceptions to this historical trend.

Source: ©Macrotrends LLC/“Dow Jones – 100 Year Historical Chart”

For most research studies, and for most situations in life more generally, it is impossible to examine or be familiar with each and every member of the population. Therefore, we make particular observations from the population, and based on that sample, we draw conclusions about the population (see Figure 1.1). One year could be a sample of stock market and bond market gains (or depending on the one year in question, losses). In the restaurant example, your roommate’s friend is the sample. Had you read reviews of the restaurant, doing so would have provided a larger sample of data on which to base your decision on where to eat dinner.


Figure 1.1 Relationship Between and Purposes of a Population and a Sample

Population: entire group of people you want to draw conclusions about.

Variable: characteristic that has different values or changes among individuals.

Sample: subset of people from the population that is intended to represent the characteristics of the larger population.

Let’s return to two other previous examples. Regarding casinos advertising the people who won money gambling at their facilities, the population would be all people who gamble in casinos. The sample would be the winners who appear in the advertisements. Based only on this sample, it appears that winning money in casinos is normal (which of course it is not; it is just the opposite, actually). In our example of the base-rate fallacy and the “surfer-looking” student at my college, the population would be all students at my college, 90% of whom are from Michigan, Ohio, or Indiana. The sample is the one student who looks like he is a surfer, which is something we associate with people from California or Florida more than with people from these other three states. He was one of about 1,500 students in the population. Given his appearance, it would be easy to assume he was from somewhere other than what is indicative of the population he was a member of.
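The relationship between a population and a sample can also be sketched in a few lines of code. Here I build a stand-in population of 1,500 students, 90% of them labeled as coming from the three home states, and then draw random samples of different sizes. The labels, the sample sizes, and the seed are assumptions made for illustration; this is not real enrollment data.

```python
import random

rng = random.Random(1)  # fixed seed so the sketch is repeatable

# Stand-in population: 90% of about 1,500 students from the three home states.
population = ["MI/IN/OH"] * 1350 + ["elsewhere"] * 150

def share_from_home_states(group):
    """Proportion of a group (population or sample) from the home states."""
    return group.count("MI/IN/OH") / len(group)

print(f"population: {share_from_home_states(population):.2f}")
for size in (1, 30, 300):
    sample = rng.sample(population, size)  # random draw without replacement
    print(f"sample of {size:>3}: {share_from_home_states(sample):.2f}")
```

A sample of one can only ever show 0% or 100%, so a single student tells us almost nothing about the student body. Larger samples tend to land closer to the population value of 90%.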
