The Black Swan Problem - Håkan Jankensgård

THE NATURE OF RANDOMNESS

Randomness refers to unpredictability. It applies whenever the outcome for some variable, such as the number of visitors to the Louvre on a given weekday, cannot be known with certainty beforehand. It is a function of our inability to know and predict the future. Try as we might, we never seem able to build those perfect forecasting algorithms that get it right all the time. In fact, as Taleb is at pains to point out, our overall track record in forecasting is awful (more on this later).

Why is there a general failure to predict what the future will bring? To answer this question, first consider that one very basic source of randomness is the physical world itself, which is constantly changing through processes that we do not fully comprehend. Science marches on, chipping away at the ignorance that produces apparent randomness. But despite the many laws of nature that have been uncovered, we never know where the next lightning will strike or how ocean currents will respond to changes in melting ice sheets. In the end, there are too many variables and too many complicated feedback loops in these highly dynamic systems. On top of that there is human civilization itself. While once rudimentary and mostly local, over time society has become complex beyond imagination. Technical innovations have made possible advanced systems that increasingly connect people across different parts of the globe. It is fundamentally unknowable what outcomes these vast and interconnected systems of interacting people and technologies will produce. Human agency by itself ensures that the future keeps bringing surprises, as the 9/11 attack illustrates. It should be clear that we are up against a complexity that is beyond our ability to predict successfully.

The difficulties we face in predicting the future are related to the problem of induction, a classic problem in philosophy. While data can certainly teach us a great deal about the workings of the world, the philosopher and sceptic David Hume made us realize that we cannot arrive at secure knowledge on the basis of empirical observations. The problem of induction says that no matter how many observations you obtain, you cannot know for sure that the observed pattern is going to hold in the future. This inherent limitation is at the heart of the Black Swan concept. Any knowledge obtained through observation, Taleb says, is fragile. It is what the Black Swan metaphor itself is meant to convey. Recall that millions of observations on white swans had seemingly verified the notion that all swans are white, and it only took one observation of a black one to falsify it. Along the same lines, Peter Bernstein (1996) observed in his epic story about risk that: ‘… history repeats itself, but only for the most part’² (emphasis added). This sentence really sums it all up and explains why induction is treacherous ground for making assumptions about the future.

Once we capitulate to the fact that we cannot predict the future, the next best thing would be to be able to characterize randomness itself, i.e. describe it. In that way, we would have some idea about the scope for deviations from what we expect. A description of randomness would involve some degree of quantification of things like the range within which the values of a variable can be assumed to fall and how the outcomes are distributed within that range (frequencies). We might occasionally find such descriptions of random processes to be practically relevant insofar as they help us make informed decisions and our future wellbeing depends on the outcome of the variable in question. They are potentially helpful, for example, in coming up with a reasonable analysis of the trade‐off between risk and return in different kinds of investment situations.
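As a rough illustration of what such a description might look like, the sketch below summarizes a year of daily visitor counts by their range and by the share of days falling into each band. The visitor numbers are simulated and entirely hypothetical (they are not Louvre data), and the band width `BIN_WIDTH` is an arbitrary choice made for the example.

```python
import random
from collections import Counter

# Hypothetical data: 365 simulated daily visitor counts (not real figures).
rng = random.Random(42)
visitors = [round(rng.gauss(25_000, 4_000)) for _ in range(365)]

# Range: the interval within which the observed values fell.
low, high = min(visitors), max(visitors)

# Frequencies: share of days falling into each band of BIN_WIDTH visitors.
BIN_WIDTH = 5_000  # arbitrary choice; a different width gives a different picture


def bin_start(value):
    """Return the lower edge of the band that contains value."""
    return (value // BIN_WIDTH) * BIN_WIDTH


counts = Counter(bin_start(v) for v in visitors)
frequencies = {start: counts[start] / len(visitors) for start in sorted(counts)}

print(f"observed range: {low} to {high}")
for start, share in frequencies.items():
    print(f"{start:>6} to {start + BIN_WIDTH - 1:>6}: {share:.1%}")
```

Note that the table only describes what has been observed so far; it says nothing about values outside the observed range.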

When characterizing randomness, a useful first distinction is between uncertainty and known odds.³ Uncertainty simply means that the odds are not known, indeed cannot be known. When randomness is of this sort, there is no way of knowing with certainty the range of outcomes and their respective probabilities. Known odds, in contrast, means that we have fixed the range of outcomes and the associated probabilities. The go‐to example is the roll of a die, in which the six possible outcomes have equal probabilities. Drawing balls with different colours out of an urn is another favourite textbook example of controlled randomness.
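To make the idea of known odds concrete, here is a minimal sketch of the die example. Because the range of outcomes and their probabilities are fixed in advance, any question about the die can be answered exactly, without collecting data. The helper `probability` and the variable names are illustrative only.

```python
from fractions import Fraction

# Known odds: six possible outcomes, each with probability 1/6, fixed by construction.
die = {face: Fraction(1, 6) for face in range(1, 7)}


def probability(event, odds):
    """Add up the probabilities of every outcome for which event(outcome) is true."""
    return sum((p for outcome, p in odds.items() if event(outcome)), Fraction(0))


total = probability(lambda face: True, die)             # all outcomes together
p_even = probability(lambda face: face % 2 == 0, die)   # 2, 4 or 6
p_at_least_five = probability(lambda face: face >= 5, die)
expected_value = sum(face * p for face, p in die.items())

print(total)            # 1
print(p_even)           # 1/2
print(p_at_least_five)  # 1/3
print(expected_value)   # 7/2
```

Nothing here is estimated. Under uncertainty, by contrast, neither the dictionary of outcomes nor the probabilities attached to them would be available to begin with.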

Uncertainty, it turns out, is what the world has to offer. In fact, known odds hardly exist outside man‐made games. This is the case for exactly the same reasons that forecasting is generally unsuccessful: there are some hard limits to our theoretical knowledge of the world.⁴ There is ample data, for sure, which partly makes up for it. But the world generates only one observable outcome at a time, out of an infinite number of possibilities, through mechanisms and interactions that are beyond our grasp. There is nothing to say that we should be able to objectively pinpoint the odds of real‐world phenomena. Whenever a bookie, for example, offers you odds on the outcome of the next presidential election, it is a highly subjective estimate (tweaked in favour of the bookie).

Whenever data exists, it is of course possible to try to use it to come up with descriptions of the randomness in a stochastic process. Chances are that we can ‘fit’ the data to one of the many options available in our library of theoretical probability distributions. Once we have, we have seemingly succeeded in our quest to describe randomness, or to turn it into something resembling known odds. This is the frequentist approach to statistical inference, in which observed frequencies in the data provide the basis for probability approximations. Failure rates for a certain kind of manufacturing process, for example, can serve as a reasonably reliable indication of the probability of failure in the future.
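The manufacturing example can be sketched as follows. The inspection history is simulated, and the 'true' failure rate `HIDDEN_RATE` is a made-up number that exists only so that the simulation has something to draw from; in practice it would be unknown. The observed relative frequency then serves as the probability approximation, and the standard error gives a rough sense of how much that approximation could move with a different sample.

```python
import math
import random


def simulate_inspections(n_units, true_failure_rate, seed):
    """Return a list of booleans, True meaning the unit failed (simulated data)."""
    rng = random.Random(seed)
    return [rng.random() < true_failure_rate for _ in range(n_units)]


def relative_frequency(outcomes):
    """Share of observations in which the event occurred."""
    return sum(outcomes) / len(outcomes)


HIDDEN_RATE = 0.02  # hypothetical; unknown to the analyst in a real setting
history = simulate_inspections(10_000, HIDDEN_RATE, seed=1)

# Frequentist approximation: observed frequency stands in for the probability.
estimate = relative_frequency(history)

# Sampling error of that estimate, assuming independent and identical units.
std_error = math.sqrt(estimate * (1 - estimate) / len(history))

print(f"estimated failure probability: {estimate:.4f} (+/- {std_error:.4f})")
```

The estimate is only as good as the assumption that future units are produced under the same conditions as the ones already inspected.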

It is important to see, however, that even when we are able to work with large quantities of data, we are still in the realm of uncertainty. The data frequencies typically only approximate one of the theoretical distributions. What is more, the way we collect, structure, and analyse these data points determines how we end up characterizing the random process and therefore the probabilities we assign to different outcomes. To the untrained eye, they might seem like objective and neutral probabilities because they are data‐driven and obtained by ‘scientists’. However, there is always some degree of subjectivity involved in the parameterization. The model used to describe the process could end up looking different depending on who designs it. Hand a large dataset over to ten scientists and ask them what the probability of a certain outcome is, and you may well get ten different answers. Because of the problem of induction, as discussed, there is always the possibility that the dataset, i.e. history, is a completely misleading guide to the future. Whenever we approximate probabilities using data, we assume that the data points we use are representative of what the future will bring.
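The point about subjectivity can be illustrated with a small sketch in which three hypothetical analysts receive the same dataset and are asked for the probability of an outcome above a given threshold. The data is simulated (mostly calm observations with occasional turbulent ones), and the three modelling choices, a normal fit, a Laplace fit, and the raw observed frequency, are picked purely for illustration; none of them is put forward as the correct one.

```python
import math
import random
from statistics import NormalDist, fmean, median

# Simulated dataset (hypothetical): 95% calm draws, 5% turbulent draws.
rng = random.Random(7)


def draw():
    scale = 5.0 if rng.random() < 0.05 else 1.0
    return rng.gauss(0.0, scale)


data = [draw() for _ in range(20_000)]
THRESHOLD = 6.0  # the outcome everyone is asked about: a value above 6

# Analyst A assumes a normal distribution, fitted by sample mean and std deviation.
normal_fit = NormalDist.from_samples(data)
p_normal = 1.0 - normal_fit.cdf(THRESHOLD)

# Analyst B assumes a Laplace distribution: location = median,
# scale = mean absolute deviation from the median.
loc = median(data)
scale = fmean(abs(x - loc) for x in data)
p_laplace = 0.5 * math.exp(-(THRESHOLD - loc) / scale)  # valid for THRESHOLD >= loc

# Analyst C assumes nothing and reports the observed frequency.
p_empirical = sum(x > THRESHOLD for x in data) / len(data)

print(f"normal fit:  {p_normal:.6f}")
print(f"Laplace fit: {p_laplace:.6f}")
print(f"empirical:   {p_empirical:.6f}")
```

In this setup the three answers differ by orders of magnitude, even though all three analysts worked from identical data, and all of them still rest on the assumption that the sample resembles the future.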
