Practical Field Ecology - C. Philip Wheater
Statistical considerations in project design
Since research is about asking questions, you need to design your project so that it will answer them effectively, without the design itself producing ambiguous results or results that are open to other interpretations. This is where the planning phase starts to define what you are going to measure and how. If, for example, we investigate the types of birds found in a woodland patch, then we have a choice of ways in which to record the data. We might note how many individual birds there are, or the numbers of each feeding type (insect feeders, seed feeders, etc.), or how many individuals there are of each species. These measurements enable us to build a picture of the birds found in a woodland patch. If we monitor birds in only a single woodland patch, we could worry that our chosen woodland is unusual in some way and therefore not representative of woodland patches in general. We could therefore examine a series of patches and obtain data for 10 or more of these. Now, if we wish to describe how many birds were found in all of these woodlands, we require some sort of descriptive statistic to summarise the information across the 10 or more patches. Descriptive techniques include estimates of the average values per sampling unit (e.g. per site), population estimates and densities, methods of describing distributions (i.e. whether organisms are distributed randomly, evenly or in aggregations), and measures of community richness including diversity and evenness indices. These techniques are discussed in more detail in Chapter 5.
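As a rough illustration of the descriptive statistics mentioned above, the sketch below summarises bird counts across several patches. The patch names and counts are invented, and the choice of the Shannon index and Pielou's evenness is an assumption made for the example; the text does not say which diversity and evenness indices are meant.

```python
import math
from statistics import mean

# Hypothetical counts of individuals per species in each woodland patch.
patches = {
    "patch_A": {"robin": 4, "blue tit": 6, "wren": 2},
    "patch_B": {"robin": 1, "blue tit": 9, "chaffinch": 4},
    "patch_C": {"robin": 4, "blue tit": 4, "wren": 4, "chaffinch": 4},
}

def shannon(counts):
    """Shannon diversity H' = -sum(p * ln p) over species with non-zero counts."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)

def evenness(counts):
    """Pielou's evenness J = H' / ln(S); taken as 0 when only one species is present."""
    richness = sum(1 for c in counts if c > 0)
    return shannon(counts) / math.log(richness) if richness > 1 else 0.0

# Average number of individual birds per sampling unit (here, per patch).
totals = [sum(species.values()) for species in patches.values()]
mean_birds = mean(totals)

for name, species in patches.items():
    counts = list(species.values())
    print(name, len(counts), round(shannon(counts), 3), round(evenness(counts), 3))
print("mean birds per patch:", mean_birds)
```

A patch in which every species is equally abundant (patch_C here) has an evenness of 1, while a patch dominated by one species scores lower.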
Most projects go beyond a simple description of particular species and sites in an attempt to make comparisons or generalisations that can hopefully have wider applicability. For example, if we decide to investigate whether the number of animals found under decaying logs on a woodland floor is influenced by the size of the log, we might approach this in one of three basic ways:
1. by looking at possible differences between samples; for example, if the logs were easily divided into two classes (small and large, i.e. <20 cm and ≥20 cm), we could compare the numbers of animals found under each size class;
2. by looking at possible relationships between variables; for example, we might have a wide range of sizes of logs and decide to examine whether the number of animals varies in some systematic way (either increasing or decreasing) as log size increases;
3. by looking at possible associations between frequency distributions; for example, we could compare the frequency of predators, herbivores, decomposers, etc. from under each of two size classes of logs (i.e. <20 cm and ≥20 cm).
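The three approaches in the list above can be sketched with a small, invented data set. The statistics used here (a Welch t statistic, Pearson's correlation coefficient and a chi-square statistic) are common choices assumed for illustration only; the text defers the actual analytical techniques to Chapter 5. The log sizes, animal counts and guild frequencies are all hypothetical.

```python
import math
from statistics import mean, variance

# Hypothetical data: log size (cm) and number of animals found under each log.
size_cm = [8, 12, 15, 18, 22, 26, 30, 35]
animals = [3, 5, 4, 7, 9, 8, 12, 14]

# 1. Difference between samples: split logs into two size classes.
small = [n for s, n in zip(size_cm, animals) if s < 20]
large = [n for s, n in zip(size_cm, animals) if s >= 20]

def welch_t(a, b):
    """Welch's t statistic: difference in means over its standard error."""
    se = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / se

# 2. Relationship between variables: correlation of size with count.
def pearson_r(x, y):
    """Pearson's product-moment correlation coefficient."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# 3. Association between frequency distributions: guild by size class.
# Rows: small logs, large logs. Columns: predators, herbivores, decomposers.
guild_table = [[5, 10, 15], [15, 10, 5]]

def chi_square(table):
    """Chi-square statistic for a contingency table of observed frequencies."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand
            stat += (observed - expected) ** 2 / expected
    return stat

print("t (small vs large):", round(welch_t(small, large), 2))
print("r (size vs animals):", round(pearson_r(size_cm, animals), 2))
print("chi-square (guild x size class):", round(chi_square(guild_table), 2))
```

Note that the first two approaches use the same measurements: what changes is whether log size is treated as two classes or as a continuous variable.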
From this simple example it can be seen that how we ask the question has an impact on how we design our study. The three different ways of looking at this study (listed above) also illustrate three broad (basic) types of statistical questions: differences, relationships, and associations between frequency distributions. We will examine each type of question in a little more detail later in this chapter (p. 36), whilst the analytical techniques needed to answer these questions are described in Chapter 5. Other questions that might be asked include: looking at the similarity of sites based on their species composition (i.e. are the animals found under logs from different tree species similar in species composition to each other?); or predicting the presence or numbers of a species from a knowledge of the environmental conditions (e.g. is the presence of wood ants' nests predictable if we know the woodland type, topography, microclimate, etc.?).
By careful design, we strive to ensure that our study does not produce ambiguous results. For example, in a comparison of invertebrate diversity between urban ponds and rural ponds, we could aim to build the size of each pond studied into the survey design. If we did not do this, and found that the rural ponds surveyed happened both to be larger and to contain more invertebrates, it would not be clear whether the result arose because rural ponds are more diverse or simply because of pond size. A sound design would either standardise on a given pond size for both environments, or make sure that the full range of pond sizes was included in both environments. Pond size would then be measured, recorded, and built into the subsequent analysis: pond size is then an example of a ‘covariate’. Other factors that would have to be standardised, or at least recognised as covariates, in this particular study include the quality of the water, the pH, the age of the pond, and so on.
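A minimal sketch of why the covariate matters, using invented pond data in which species richness depends only on pond area and the rural ponds happen to be larger. The adjustment shown uses the pooled within-group slope, as in an analysis of covariance; this is one common approach, assumed here for illustration, and it takes the effect of pond size to be linear and the same in both habitats.

```python
from statistics import mean

# Hypothetical ponds: (area in square metres, number of invertebrate species).
# Richness is constructed as 2 + 0.1 * area in both habitats.
urban = [(10, 3), (20, 4), (30, 5), (40, 6)]
rural = [(30, 5), (40, 6), (50, 7), (60, 8)]

def group_sums(group):
    """Return mean area, mean richness, and within-group sums of squares and products."""
    xs = [x for x, _ in group]
    ys = [y for _, y in group]
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in group)
    return mx, my, sxx, sxy

def raw_difference(a, b):
    """Difference in mean richness (b minus a), ignoring the covariate."""
    return group_sums(b)[1] - group_sums(a)[1]

def adjusted_difference(a, b):
    """Difference in mean richness (b minus a) after removing a common linear effect of area."""
    ax, ay, axx, axy = group_sums(a)
    bx, by, bxx, bxy = group_sums(b)
    pooled_slope = (axy + bxy) / (axx + bxx)
    return (by - ay) - pooled_slope * (bx - ax)

print("raw rural minus urban:", raw_difference(urban, rural))
print("size-adjusted rural minus urban:", adjusted_difference(urban, rural))
```

With these made-up numbers the raw comparison suggests that rural ponds hold two more species on average, but the difference disappears once pond area is taken into account.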
The goal of the study may be to gain a deeper understanding of the system by measuring a wide range of variables. In the pond survey example, this might mean that in addition to pond size, we would take various measures of water quality and chemistry (nutrient status, oxygen content, pH, etc.) and record the numbers of each species of plant and animal. For a given number of ponds, there may be a large number of variables, giving rise to a complex datasheet. In this example, each pond would have its own row in a spreadsheet, and each variable (e.g. size, pH, number of species) would be a column. In order to examine and make sense of such a complex data set, we would need to move into the realm of multivariate analysis (see Chapter 5).
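The row-per-pond, column-per-variable layout described above might look like the following sketch. The pond identifiers, column names and values are hypothetical; the data are read from an in-memory string purely so that the example is self-contained.

```python
import csv
import io

# Hypothetical datasheet: one row per pond, one column per variable.
raw = """pond,habitat,area_m2,pH,n_species
P1,urban,12.5,7.1,9
P2,rural,40.0,6.8,14
P3,rural,22.0,7.4,11
"""

rows = list(csv.DictReader(io.StringIO(raw)))

# Numeric variables that would feed into a later analysis.
numeric = ["area_m2", "pH", "n_species"]

# Site-by-variable matrix: each inner list is one pond.
matrix = [[float(row[name]) for name in numeric] for row in rows]

# The same data viewed by column: each list is one variable across all ponds.
columns = {name: [float(row[name]) for row in rows] for name in numeric}

for row, values in zip(rows, matrix):
    print(row["pond"], row["habitat"], values)
```

Keeping one sampling unit per row means that further variables can be added as extra columns without restructuring the data.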