Business Experiments with R - B. D. McCullough
1.4.3 Most Experiments Fail
It is important to remember that the purpose of an experiment is to test an idea, not to prove something, and that most experiments fail. This may sound depressing, but experimentation is hugely effective if you can create a process that allows bad ideas to fail quickly and with minimal investment:
“[Our company has] tested over 150 000 ideas in over 13 000 MVT [multivariate testing] projects during the past 22 years. Of all the business improvement ideas that were tested, only about 25 percent (one in four) actually produced improved results; 53 percent (about half) made no difference (and were a waste of everybody's time); and 22 percent (that would have been implemented otherwise) actually hurt the results they were intended to help” (Holland and Cochran, 2005, p. 21).
“Netflix considers 90% of what they try to be wrong” (Moran, 2007, p. 240).
“I have observed the results of thousands of business experiments. The two most important substantive observations across them are stark. First, innovative ideas rarely work. When a new business program is proposed, it typically fails to increase shareholder value versus the previous best alternative” (Manzi, 2012, p. 13).
Writing of the credit card company Capital One (Goncalves, 2008, p. 27): “We run thirty thousand tests a year, and they all compete against each other on the basis of economic results. The vast majority of these experiments fail, but the ones that succeed can hit very big[.]”
“Given a ten percent chance of a 100 times payoff, you should take that bet every time. But you're still going to be wrong nine times out of ten.” Amazon CEO Jeff Bezos wrote this in his 2016 letter to shareholders.
“Economic development builds on business experiments. Many, perhaps most experiments fail” (Eliasson, 2010, p. 117).
You are not going to get useful results from most of the experiments that you conduct. But, as Thomas Edison said of inventing the lightbulb, “I have not failed. I've just found 10 000 ways that didn't work.” Failed experiments are not worthless; they can contain much useful information: Why didn't the experiment work? Did we maintain false assumptions? Was the design faulty? Everything learned from a failed experiment can help make the next experiment better.
When dealing with human subjects, where response sizes are small and there is a lot of noise, there can be a tendency toward false positives, especially when sample sizes are small. Follow‐up experiments to small‐sample experiments are therefore important for confirming that the discovered effect really exists.
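The point about small samples and false positives can be illustrated with a short simulation, sketched here in Python. Every number in it is an assumption chosen for illustration and does not come from the text: 10% of tested ideas have a real effect, that effect is 0.1 standard deviations, and each experiment is judged with a two-sided z-test at roughly the 5% level. Rather than simulating individual subjects, the sketch draws each experiment's estimated difference directly from its approximate normal sampling distribution.

```python
import math
import random


def false_discovery_share(n_per_arm, n_experiments=20000, share_real=0.10,
                          true_effect=0.1, sigma=1.0, z_crit=1.96, seed=1):
    """Share of 'significant' experiments whose true effect is actually zero.

    All defaults are illustrative assumptions, not values from the text.
    """
    rng = random.Random(seed)
    # Standard error of a difference in means between two equal-sized arms.
    se = sigma * math.sqrt(2.0 / n_per_arm)
    false_pos = 0
    true_pos = 0
    for _ in range(n_experiments):
        real = rng.random() < share_real        # does this idea truly work?
        effect = true_effect if real else 0.0
        estimate = rng.gauss(effect, se)        # observed treatment-minus-control difference
        if abs(estimate / se) > z_crit:         # declared "significant"
            if real:
                true_pos += 1
            else:
                false_pos += 1
    significant = false_pos + true_pos
    return false_pos / significant if significant else float("nan")


small = false_discovery_share(n_per_arm=50)
large = false_discovery_share(n_per_arm=5000)
print(f"n = 50 per arm:   {small:.0%} of significant results are false positives")
print(f"n = 5000 per arm: {large:.0%} of significant results are false positives")
```

Under these assumptions the small-sample experiments have very little power, so most of their "significant" findings come from ideas with no real effect. That is the case for replicating a small-sample result before acting on it.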
Even with large samples, it is best to make sure that a discovered effect really exists. In webpage development, an experiment to optimize a webpage might prove fruitful, yet the improvement will not immediately be rolled out to all users. Instead, it might be rolled out to 5% of users to guard against the possibility that some unforeseen development might render the improvement futile or, worse, harmful. Only after it has been deemed successful with the 5% sample will it be rolled out to all users.
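One common way to implement such a partial rollout is to assign users to buckets with a hash of their identifier, so that the same user always gets the same experience. The following Python sketch assumes string user IDs. The function name `in_rollout`, the salt `"new-page-v1"`, and the 10 000-bucket scheme are hypothetical choices made for illustration, not a method described in the text.

```python
import hashlib


def in_rollout(user_id: str, percent: float, salt: str = "new-page-v1") -> bool:
    """Return True if the user falls inside the first `percent` of hash buckets.

    The salt and the bucket count are illustrative assumptions.
    """
    digest = hashlib.sha256(f"{salt}:{user_id}".encode("utf-8")).hexdigest()
    bucket = int(digest[:8], 16) % 10000    # stable bucket in 0..9999
    return bucket < percent * 100           # e.g. 5% -> buckets 0..499


# Hypothetical user base, used only to check the realized exposure rate.
users = [f"user{i}" for i in range(100000)]
exposed = sum(in_rollout(u, 5.0) for u in users)
share = exposed / len(users)
print(f"{share:.2%} of users see the new page")
```

Because a user's bucket never changes, raising `percent` from 5 to 100 only adds users. Everyone who saw the new page during the 5% stage keeps seeing it after the full rollout.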