Practical Field Ecology - C. Philip Wheater
Box 1.8 Species accumulation curves for two sites
By plotting the cumulative number of species found against the number of quadrats examined, it can be seen that as the number of quadrats used increases, the number of species also increases. At the point at which the curve levels off towards the horizontal (the asymptote), we may assume that we have obtained the maximum number of species and can stop sampling. For site A (dashed line, diamonds), we may not yet have reached the total number of species, even after 30 quadrats, and should consider increasing the sampling effort. For site B (dotted line, squares), it appears that we have reached about the maximum number of species that we can expect to get. In fact, we probably reached this number at around 16 quadrats. This difference between sites A and B might reflect not only a difference in the number of species found there, but also a difference in heterogeneity of the site, with site A being less homogeneous than site B. Note that had we looked at the data for site A after 12 quadrats (solid line, diamonds), we might have assumed that we had reached the maximum number of species as the curve levels off. This highlights the importance of collecting past the initial point of curve levelling to check that it truly does reflect the asymptote.
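The tabulation behind such a curve can be sketched as follows. This is a minimal illustration, assuming that the records for each quadrat are held as a set of species names; the function names and the data are hypothetical and are not taken from the book.

```python
def species_accumulation(quadrats):
    """Return the cumulative number of distinct species after each quadrat."""
    seen = set()
    curve = []
    for species_in_quadrat in quadrats:
        seen.update(species_in_quadrat)
        curve.append(len(seen))
    return curve


def quadrats_since_last_new_species(curve):
    """Count how many of the most recent quadrats added no new species."""
    run = 0
    for i in range(len(curve) - 1, 0, -1):
        if curve[i] == curve[i - 1]:
            run += 1
        else:
            break
    return run


# Hypothetical data: each set holds the species recorded in one quadrat.
quadrats = [
    {"daisy", "clover"},
    {"clover", "plantain"},
    {"daisy"},
    {"clover", "yarrow"},
    {"plantain"},
    {"daisy", "yarrow"},
]
curve = species_accumulation(quadrats)
print(curve)  # [2, 3, 3, 4, 4, 4]
print(quadrats_since_last_new_species(curve))  # 2
```

A short flat run at the end of the curve is only a prompt to keep checking: as the site A example shows, a curve can level off temporarily and then rise again.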
Since we generally take a sample in order to make a valid estimate of a parameter of the population (e.g. the number of species, the mean temperature, the proportion of predators), a central requirement is that the individuals sampled are independent of each other. It is important to recognise, and to avoid or, failing that, account for, situations where the individuals sampled are linked in some way as a result of the sampling design. For example, we might compare the number of spangle galls found on leaves chosen at random on oak trees growing in clumps, with those on isolated oak trees. If we found over 20 trees in separate clumps, but only 10 isolated trees, we might be tempted to take double the measurements from each of the individual isolated trees. However, this would mean that individual data points from isolated trees were linked by virtue of the tree on which they were growing and shared many different attributes with each other. Such data points would not be independent of each other (they are known as pseudoreplicates) and hence may cause problems in interpretation, since we would be unsure whether any differences between clumped and isolated trees were due to the multiple measurements from some trees. It would be better to use unbalanced sample sizes (i.e. 20 clumped and 10 isolated trees) than to use non‐independent data. Similarly, we should not take data from more than one tree in any clump, since these are likely to be more similar to each other than to those in other clumps. From a statistical analysis point of view, few tests require equal sample sizes and, even where this is a problem, it would be preferable to reduce the number of trees from clumps that were measured. Note that we may wish to take account of some of the variation between leaves on each tree by taking several (perhaps 10) leaves per tree and using a mean value to represent each tree.
There are also statistical tests that allow for multiple measurements per tree, but these usually require the same number of samples per sampling unit – see repeated measures analysis in Chapter 5.
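The idea of letting a mean value represent each tree can be sketched as below. This is an illustrative example only, assuming that leaf-level counts are stored as (tree identifier, gall count) pairs; the function name, identifiers and counts are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def tree_means(leaf_counts):
    """Collapse leaf-level counts to one mean value per tree.

    leaf_counts: iterable of (tree_id, galls_on_leaf) pairs.
    Returns a dict mapping each tree_id to the mean gall count for that tree.
    """
    by_tree = defaultdict(list)
    for tree_id, galls in leaf_counts:
        by_tree[tree_id].append(galls)
    return {tree_id: mean(values) for tree_id, values in by_tree.items()}


# Hypothetical counts of spangle galls on individual leaves.
leaf_counts = [
    ("isolated_1", 4), ("isolated_1", 6), ("isolated_1", 5),
    ("clump_1", 2), ("clump_1", 3), ("clump_1", 1),
]
print(tree_means(leaf_counts))
```

Each tree then contributes a single data point to the comparison, however many leaves were measured on it.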
If we survey a pond in order to look at the animals and their relationships with several physical, chemical, and/or biological factors, then no matter how many replicates we take, we are merely describing what happens in a single entity (i.e. this one pond). Such a study does not tell us anything about pond ecology in general, and the use of such replicates is termed pseudoreplication and should be avoided (Hurlbert 1984; van Belle 2002). In order to broaden our approach and gain more of an understanding of ponds in general, we would need to study a large number of separate ponds. Thus, studies of single sites or small parts of sites may not reveal information applicable to the wider ecological context.
In some situations, the data collected are linked to each other by design. For example, we might be interested in comparisons of matched data (e.g. examining the animals found on cabbages before and after the application of fertiliser or pesticide, or the numbers of mayfly larvae found above and below storm drain outflows into a series of streams). These designs can be perfectly sound, but because the data are matched (by cabbage or by stream) we require a slightly different approach to the resulting analysis (see Chapter 5).
When designing your sampling strategy, it is important to consider the variability and whether the timing or order of sampling might bias the result by measuring only part of the potential variation. For example, sampling the insects present on thistle flower heads will be biased if all the data are collected in the early morning, since this will miss any animals that are active later in the day. If two areas are being compared, sampling one site early and one site later will introduce another variable into the comparison: we would not just be looking at the two sites, but also at two times of day. Since it would be impossible to separate the two variables, it would be difficult to draw conclusions from such a survey design. In this example we would say that the findings were ‘biased by time of day’. It is in managing some of this variability that experiments come into their own, because they standardise as far as possible the conditions under which the subjects are examined, thus removing bias. It is much easier to design an experiment where only one factor (also known as the treatment) is manipulated, whilst all others remain constant. However, if we wished to survey a real‐life situation (as opposed to examining a rather more artificial experimental design) then we would take into account the time of day. We could do this by designing our survey so that we alternated the measurements or observations that we took from our two sites, sampling first one then the other, then back to the first, and so on, to get a spread of measurements for each site over the day. Alternatively, we could sample on successive days, reversing the order in which we sampled the sites on each day, or justify the need to obtain additional fieldwork assistance to make a balanced study easier to implement.
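One way of drawing up the alternating visiting order described above is sketched here. It is a minimal illustration under assumed conditions (a fixed number of visits per day, and a starting site that is reversed on each successive day); the function name and parameters are hypothetical.

```python
def alternating_schedule(sites, visits_per_day, n_days):
    """Build a visiting order that alternates between sites within a day
    and reverses the starting order on each successive day."""
    schedule = []
    for day in range(n_days):
        # Even-numbered days use the given order, odd-numbered days reverse it.
        order = list(sites) if day % 2 == 0 else list(reversed(sites))
        schedule.append([order[i % len(order)] for i in range(visits_per_day)])
    return schedule


print(alternating_schedule(["A", "B"], visits_per_day=4, n_days=2))
# [['A', 'B', 'A', 'B'], ['B', 'A', 'B', 'A']]
```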
Table 1.2 Random numbers. Coordinates can be extracted simply by taking pairs of random numbers in sequence from the table (e.g. 23, 85, the first two values in the first row, which are shaded in the original, provide the position within a sampling area where we would take the first measurement of a series).
23 | 85 | 56 | 84 | 92 | 4 |
62 | 51 | 27 | 74 | 83 | 84 |
56 | 32 | 87 | 75 | 95 | 5 |
87 | 7 | 20 | 30 | 25 | 12 |
99 | 86 | 29 | 41 | 29 | 39 |
31 | 73 | 30 | 73 | 27 | 97 |
24 | 38 | 91 | 16 | 17 | 66 |
94 | 59 | 12 | 17 | 37 | 39 |
41 | 67 | 25 | 42 | 2 | 84 |
32 | 67 | 48 | 99 | 74 | 3 |
68 | 1 | 59 | 20 | 25 | 7 |
There are several sampling layouts that help us to avoid bias. One commonly used approach is random sampling. Here, a random sequence is used to determine the order in which to sample plants, or the coordinates at which to sample experimental plots or survey sites. Hence, if we wanted to randomly sample 1 m × 1 m quadrats in a field, we could position the sampling sites using random coordinates (Figure 1.4a) taken from pairs of random numbers generated using a calculator or computer, or obtained from a table (see Table 1.2). Each pair of numbers serves as a set of sampling coordinates, so if we have coordinates of 23 and 85 in a sampling grid that is 10 m by 10 m, we would place our quadrat 2.3 m along the base and 8.5 m up the vertical axis. In our example above, of insects on thistle flowers, random sampling may also be used to determine which site is visited first: here sites would be allocated number codes that are then selected randomly from the table.
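Where a computer is used instead of a table, the same procedure might look like the sketch below. It assumes two-digit random numbers (0 to 99) that are scaled to a square grid, as in the 23 and 85 example; the function names, the default grid size and the seed are illustrative assumptions.

```python
import random


def pair_to_position(a, b, grid_size_m=10.0):
    """Scale a pair of two-digit random numbers (0-99) to metres on the grid."""
    return (a * grid_size_m / 100, b * grid_size_m / 100)


def random_quadrat_positions(n, grid_size_m=10.0, seed=None):
    """Draw n random (x, y) positions in metres within a square grid."""
    rng = random.Random(seed)  # a seed makes the layout reproducible
    return [
        pair_to_position(rng.randrange(100), rng.randrange(100), grid_size_m)
        for _ in range(n)
    ]


print(pair_to_position(23, 85))  # (2.3, 8.5)
print(random_quadrat_positions(5, seed=1))
```

Recording the seed alongside the field notes would allow the same set of positions to be regenerated later.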
Figure 1.4 Examples of sampling designs. (a) Random sampling; (b) systematic sampling; (c) stratified random sampling.
Although random sampling is often appropriate for selecting sampling points, where there is a great deal of variation across a sampling unit such as a site, by chance the coverage may not include all of the heterogeneity present. For example, in Figure 1.4a, the two squares in the lower right of the sampling site have no sampling points. If the site was reasonably homogeneous, then this would not be a problem. However, if these small squares represented the only damp area within the site (covering around 8% of the total area), then this particular habitat variation would have been missed altogether. An alternative strategy would be to use systematic sampling (Figure 1.4b). This is an objective method of spreading the sampling points across the entire area, thus dealing with any spatial heterogeneity. So, to systematically sample the insects on trees, we might collect from every tenth tree in a plantation.
Usually, systematic sampling would provide us with random individuals – unless for some reason every tenth individual is more likely to share certain characteristics. Suppose we used systematic sampling to examine the distribution of ants' nests in a grassland. We could place 2 m × 2 m quadrats evenly 10 m apart across the site and then count the number of nests within each quadrat. However, if ants' nests are in competition with each other, they are likely to be spaced out. If this spacing happens to be at about 10 m distances, we would either overestimate the number of nests if our sequence of samples included the nests, or underestimate if we just missed including nests in each quadrat. It would be better in this situation to use a mixture of random and systematic sampling (called stratified random sampling – Figure 1.4c) where the area was divided into blocks (say of 10 m × 10 m) and then the 2 m × 2 m quadrats were placed randomly within each of these. This type of sampling design can also be applied to temporal situations by, for example, dividing the day into blocks of 4 hours and allocating the order of the sites to be sampled within each block using different random numbers.
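The stratified random layout for the ants' nest example can be sketched as follows. This is a minimal illustration, assuming a rectangular site whose sides are whole multiples of the block size and one quadrat per block; the function name and parameters are hypothetical.

```python
import random


def stratified_random_quadrats(width_m, height_m, block_m=10.0,
                               quadrat_m=2.0, seed=None):
    """Place one quadrat at random inside each block of a rectangular site.

    Returns a list of (x, y) coordinates in metres for the lower-left corner
    of each quadrat. Each quadrat lies wholly within its own block.
    """
    rng = random.Random(seed)
    n_cols = int(width_m // block_m)
    n_rows = int(height_m // block_m)
    corners = []
    for row in range(n_rows):
        for col in range(n_cols):
            # Offsets stop short of the block edge so the quadrat fits inside.
            x = col * block_m + rng.uniform(0, block_m - quadrat_m)
            y = row * block_m + rng.uniform(0, block_m - quadrat_m)
            corners.append((x, y))
    return corners


# A hypothetical 30 m x 20 m grassland gives six 10 m x 10 m blocks.
for corner in stratified_random_quadrats(30, 20, seed=42):
    print(corner)
```

Because the offset within each block is random, the quadrats no longer fall at a fixed 10 m spacing, which avoids the problem of the sampling interval matching the spacing of the nests.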
More sophisticated methods of laying out sampling plots (or allocating sampling periods) may be useful for planning experiments. We could lay out a series of treatments in rows so that we have replicates of each treatment (Figure 1.5a). Although this would be relatively easy to manage (adding fertiliser, dealing with particular cutting regimes, etc.) since each treatment is clustered together, there may be variability within the plot that masks the impacts of the treatments themselves. An alternative is to ensure that each row and each column of the plot has one of each treatment (see Figure 1.5b). It is even better if these treatments can be distributed randomly, whilst still maintaining an even spread across the rows and columns using a Latin square design (see Figure 1.5c). Variations on this theme have been proposed, including ones based on the patterns used in the Sudoku game (Sarkar and Sinhar 2015).
Figure 1.5 Experimental layouts for five different treatments. (a) Clustered design; (b) stratified design; (c) Latin square design. Each treatment is represented by a different symbol.
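A randomised Latin square layout of the kind described above could be generated as in the sketch below. It starts from a simple cyclic square and shuffles the rows, the columns and the treatment labels; this is an assumed, commonly used shortcut that does not draw with equal probability from all possible Latin squares. The function names and treatment labels are hypothetical.

```python
import random


def random_latin_square(treatments, seed=None):
    """Return a randomised Latin square as a list of rows of treatments."""
    rng = random.Random(seed)
    n = len(treatments)
    # Cyclic square: row i holds the indices 0..n-1 shifted by i places.
    square = [[(i + j) % n for j in range(n)] for i in range(n)]
    rng.shuffle(square)  # permute the rows
    col_order = list(range(n))
    rng.shuffle(col_order)  # permute the columns
    square = [[row[c] for c in col_order] for row in square]
    labels = list(treatments)
    rng.shuffle(labels)  # assign treatments to indices at random
    return [[labels[k] for k in row] for row in square]


def is_latin_square(square):
    """Check that every row and every column holds each symbol exactly once."""
    n = len(square)
    symbols = set(square[0])
    return (
        len(symbols) == n
        and all(len(row) == n and set(row) == symbols for row in square)
        and all({row[c] for row in square} == symbols for c in range(n))
    )


layout = random_latin_square(["A", "B", "C", "D", "E"], seed=3)
for row in layout:
    print(" ".join(row))
print(is_latin_square(layout))  # True
```

Shuffling rows, columns or labels never breaks the property that each treatment appears once in every row and column, so the check at the end should always pass for squares built this way.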