Medical Statistics - David Machin

4.4 Probability for Continuous Outcomes


So far, we have looked at the probability of a particular value occurring, for example, a success or failure on treatment. The Binomial and Poisson distributions are discrete distributions: they describe discrete variables that can only take a limited set of values. As the number of possible values increases, the probability of any particular value decreases. Continuous probability distributions describe variables that can take any value between given limits. For continuous variables, such as birth weight and blood pressure, the set of possible values is infinite (limited only by the precision with which we take the measurements). So, we are more interested in the probability of having values between certain limits than in the probability of one particular value. For example, what is the probability of having a systolic blood pressure of 140 mmHg or higher?
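The claim that the probability of any single value shrinks as the number of possible values grows can be checked with a small sketch. It uses a Binomial distribution with success probability 0.5 purely as an illustration; the helper names (`binomial_pmf`, `largest_point_probability`, `prob_between`) and the chosen values of n are assumptions made here, not taken from the text.

```python
from math import comb

def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def largest_point_probability(n, p=0.5):
    """Largest probability attached to any single value 0..n."""
    return max(binomial_pmf(k, n, p) for k in range(n + 1))

def prob_between(k_low, k_high, n, p=0.5):
    """Probability that the number of successes lies in k_low..k_high inclusive."""
    return sum(binomial_pmf(k, n, p) for k in range(k_low, k_high + 1))

# Illustrative values of n (assumed): more possible outcomes as n grows.
for n in (10, 100, 1000):
    single = largest_point_probability(n)
    # Probability that between 40% and 60% of the trials are successes.
    interval = prob_between(4 * n // 10, 6 * n // 10, n)
    print(f"n={n}: most likely single value {single:.4f}, 40-60% range {interval:.4f}")
```

Even the most likely single value becomes improbable as n grows, whereas the probability of falling within a range of values stays substantial, which is why intervals are the natural question for continuous outcomes.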

The vertical scale of the histograms shown so far, such as Figure 2.6, has been the frequency, which depends on the total number of observations. As an alternative, we can use the relative frequency (or %) on the vertical scale. The advantage of using the relative frequency is that histograms of the same outcome from samples of different sizes will share the same scale. Such a histogram, as in Figure 4.7, can be given the rather formal name of an empirical relative frequency distribution, but it is simply the observed distribution of the data in a sample.

Figure 4.7 Empirical relative frequency distribution of birth weight of 98 babies admitted to a special care baby unit and the associated probability distribution.

(Source: data from Simpson 2004). Reproduced by permission of AG Simpson.
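The advantage of the relative frequency scale can be illustrated with a minimal sketch. The data below are simulated, not the special care baby unit data: the mean of 3.0 kg, standard deviation of 0.6 kg, sample sizes, and 0.5 kg bins are all hypothetical choices made for this example.

```python
import random

def histogram_counts(values, edges):
    """Count observations in each half-open interval [edges[i], edges[i+1])."""
    counts = [0] * (len(edges) - 1)
    for v in values:
        for i in range(len(edges) - 1):
            if edges[i] <= v < edges[i + 1]:
                counts[i] += 1
                break
    return counts

def relative_frequencies(counts):
    """Convert bin counts to proportions of all binned observations."""
    total = sum(counts)
    return [c / total for c in counts]

rng = random.Random(1)                    # fixed seed so the run is repeatable
edges = [0.5 * i for i in range(13)]      # bin edges 0.0, 0.5, ..., 6.0 kg (assumed)

# Two simulated samples of the same outcome with different sizes (hypothetical).
small = [rng.gauss(3.0, 0.6) for _ in range(100)]
large = [rng.gauss(3.0, 0.6) for _ in range(10_000)]

counts_small = histogram_counts(small, edges)
counts_large = histogram_counts(large, edges)
rel_small = relative_frequencies(counts_small)
rel_large = relative_frequencies(counts_large)

print("tallest bar, frequency scale:", max(counts_small), "vs", max(counts_large))
print("tallest bar, relative scale: ", round(max(rel_small), 3), "vs", round(max(rel_large), 3))
```

On the frequency scale the tallest bars differ by roughly the ratio of the sample sizes, while on the relative frequency scale the two histograms can be drawn on the same axis.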

If we imagine that, for the birthweight data in Figure 4.7, we have a very large sample (many more than 98 babies) and that we take smaller and smaller intervals to classify the birth weights (much smaller than 0.25 kg), then the histogram will start to look like a smooth curve (see Figure 4.8). In these circumstances the distribution of observations may be approximated by a smooth underlying curve, which is also shown in Figure 4.7. This curve is called a probability distribution and is the theoretical equivalent of an empirical relative frequency distribution. Probability distributions are used to calculate the probability that different values will occur, for example: what is the probability of having a birthweight of 2.0 kg or less? It is often the case with medical data that the histogram of a continuous variable obtained from a single measurement on different subjects will have a symmetric ‘bell‐shaped’ distribution.
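The idea of a histogram approaching a smooth curve, and of reading probabilities from that curve, can be sketched as follows. The bell-shaped curve here is a hypothetical Normal distribution with mean 3.0 kg and standard deviation 0.6 kg, chosen only for illustration and not estimated from the data behind Figure 4.7; the sample is simulated, and `bar_height` is a helper introduced for this example.

```python
import random
from statistics import NormalDist

# Hypothetical smooth curve (assumed parameters, NOT fitted to the book's data).
curve = NormalDist(mu=3.0, sigma=0.6)

# A very large simulated sample drawn from that curve.
rng = random.Random(42)
sample = [rng.gauss(curve.mean, curve.stdev) for _ in range(100_000)]

def bar_height(values, lower, width):
    """Relative frequency per kg for the interval [lower, lower + width)."""
    in_bin = sum(1 for x in values if lower <= x < lower + width)
    return in_bin / len(values) / width

# Bar centred on 3.0 kg for the bin widths used in Figure 4.8.
# Dividing by the width puts every bar on the same scale as the curve.
for width in (0.5, 0.25, 0.2, 0.1):
    height = bar_height(sample, 3.0 - width / 2, width)
    print(f"width {width} kg: bar height {height:.3f}, curve height {curve.pdf(3.0):.3f}")

# Probability of a birthweight of 2.0 kg or less:
# observed proportion in the sample versus the area under the curve.
empirical = sum(1 for x in sample if x <= 2.0) / len(sample)
print(f"sample proportion {empirical:.4f}, area under curve {curve.cdf(2.0):.4f}")
```

With a large sample and narrow intervals, the bar heights track the curve closely, and the proportion of observations at or below 2.0 kg is close to the corresponding area under the curve.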


Figure 4.8 Empirical relative frequency distributions of birthweight with interval (bin) widths of 0.5, 0.25, 0.2, and 0.1 kg

