Analysing Quantitative Data - Raymond A Kent
Properties
Each case will be a configuration of a potentially infinite set of characteristics. In practice, researchers undertaking quantitative research will focus on a limited set of properties – variously described as ‘variables’, ‘set memberships’, ‘causes’, ‘effects’, ‘conditions’ or ‘outcomes’ – that are taken as a basis for recording characteristics. Properties are the characteristics of cases that are included in the research sample or population and that the researcher has chosen to observe or measure and then record. Some of these properties will be common to all the cases, for example all the cases in a sample of individuals are female, aged between 18 and 40 and resident in the UK. These properties are sometimes called ‘constants’ and they define which cases are considered to be members or potential members of the research population. Alternatively, some properties will relate to characteristics that vary between cases, for example some nation-states in a research population of states have parliamentary democracies and others do not.
The type of characteristic to which properties may relate can be broadly classified into demographic, behavioural or cognitive. Demographic properties relate to features that researchers have chosen to characterize the nature or condition of a case like a person’s age and sex, a household’s size, an organization’s legal status or a country’s rate of net immigration. These qualities may be fixed or relatively fixed (like gender or organizational legal status), or slow to change (like age). Some may be subject to sudden changes interspersed with periods of stability, for example an individual’s social, economic or marital status, or an organization’s location of company head offices.
Behavioural properties relate to what cases did in the recent past, to what they usually or currently do, or to what they might do in the future. Typical measures for individual consumers in marketing, for example, relate to the purchase and use of products and brands like purchase/non-purchase of a product or brand over a specific time period, brand variant purchased, quantity/size of pack, price paid, source of purchase, other brands bought, nature of purchase, and use/consumption of the product. These measures may, in turn, be used to generate calculations of brand loyalty, brand switching behaviour and frequency of purchase. If the research is a product test or product concept test, consumers may be asked about future behaviour, for example the likelihood of trial of a new product and likely frequency of purchase.
Cognitive properties relate to mental processes that go on within individuals and include their attitudes, opinions, beliefs and images. These are notoriously difficult to assess. Attitude scaling, which is explained in the next section, focuses on how researchers have attempted to address this problem. There is an issue of whether or not aggregations of individuals or macro units can ‘do’, ‘think’ or ‘believe’ things; that will be something on which researchers will need to take a view and decide.
In terms of the various roles that demographic, behavioural or cognitive properties may play in research, we can distinguish properties being used as descriptors and properties being used either as potentially causal factors or as outcomes. Descriptors are properties that are studied one at a time in order to illustrate or summarize the key features of a set of cases. They are not being investigated for their potential relationship to other characteristics. Demographic properties in particular are often used to provide a framework for defining and describing the key characteristics of the cases that are providing the data in a piece of research; for example, a sample of online shoppers may be described in terms of the numbers of males and females, the age distribution and whether or not they have access to broadband. However, behavioural and cognitive properties may also be used for descriptive purposes. Where, in a piece of research, properties are being used solely in this fashion, then it may be called a ‘descriptive’ study.
Alternatively, properties may be used precisely for the purpose of investigating the nature of their relationships to other properties. In some research the purpose of the study may be to explore whether or not, or the extent to which, patterns exist; for example, that males are more likely than females to instigate divorce proceedings. Most researchers, however, are interested in examining whether some properties, variously called ‘conditions’, ‘independent variables’ or ‘causes’, have some influence or impact on other properties – ‘outcomes’, ‘dependent variables’ or ‘effects’. The notion of causality is extremely complex and is considered in detail in Chapter 10.
Behavioural, cognitive and some demographic properties may be used in any of the three roles in research, as illustrated in Figure 1.1. Some demographic properties, however, are difficult to conceive as being ‘effects’, for example trying to ‘explain’ a person’s gender or age! Some properties may be used in more than one role in a piece of research. Thus some demographics may be used for both structural and analytic purposes, for example using age to describe the sample of respondents and also using it to see how far it ‘explains’ variation in one or more of the other properties. Some characteristics may be used by researchers as both cause and effect in the same piece of research. Thus customer satisfaction may be seen both as a result of a customer’s prior expectations about the product or service (it is an effect) and in turn as causing or influencing repeat purchase behaviour (it is also a cause).
Figure 1.1 Attributes and roles of properties in research
Properties are, in effect, researcher constructs: they are what the researcher has defined them to be. Either they are the deliberate creation of researchers who have decided how, where and when the assessment of properties is to take place, or researchers accept the constructions of other individuals, taking them as appropriate for their own research. In some instances the degree of ‘construction’ is limited, as in recording the gender of a respondent in a survey, although even here there may be the odd discrepancy between observed gender and self-reported gender. In other situations, recording a person’s social class may be the result of a highly complex process. Such processes may be referred to as ‘measurement’, ‘scaling’ or creating ‘operational definitions’. They are the means by which researchers create their ‘yardsticks’ for categorizing, counting or calibrating the values to be recorded. This may be achieved in one of four main ways:
directly;
indirectly;
deriving from two or more separate measures;
creating a multidimensional profile.
The values of some properties may be directly observable by the researcher, for example the number of individuals in a group. Sometimes a construct is redefined so that there is a one-to-one correspondence between the construct and what can be observed or recorded. Social class might, for example, be defined as perceived social class so that respondents in a survey can be asked what social class they think they are and answers are taken at face value as a ‘true’ record of respondents’ perceptions.
This is fine if, as researchers, what we wish to measure is perceived health status, perceived likelihood of drinking alcohol in the next year, or self-defined social class; however, different individuals will define these in different ways. Furthermore, for various reasons, the respondent may give a ‘wrong’ answer, for example because he or she has incorrect recall, has misinformation, is exaggerating or fabricating. The respondent’s answers are also likely to be affected by mood, situational factors, willingness or reluctance to impart feelings or information, the wording of the question, the way it was addressed, or the understanding of the question.
In these circumstances, researchers may seek more ‘objective’ measures – ones that are more readily observable or recordable and are more likely to be comparable across respondents. Researchers may, for example, take an indicator of the concept rather than the concept itself. Gross national product is commonly taken as an indicator of a country’s wealth; repeat purchase may be taken as an indicator of brand loyalty. Indirect measurement assumes that there is a degree of correspondence between the concept and the indicator deployed, but recognizes that the indicator is not the concept itself, only a reflection of it. Such measurement depends on the presumed relationships between observations and the concept of interest.
With concepts as complex as health status, social class or academic ability, asking just one question of respondents or taking just one measure of a nation’s wealth may be insufficient. Such concepts will tend to have several dimensions, aspects or facets. Each is then used to derive an overall measure. This may involve adding up recorded values and then taking an average, it might entail subtracting one value from another to derive differences, or it may mean using more complex statistical techniques. One of the most commonly used methods of derived measurement in the social sciences that is used to measure attitudes is the summated rating scale. A rating is an ordered classification of a grade given by a respondent in a survey, such as ‘Excellent’, ‘Good’, ‘Fair’, ‘Poor’, ‘Very poor’. In order to be able to add up ratings for several aspects, a numerical value is assigned to each category, for example 5, 4, 3, 2 and 1. These can now be totalled to give an overall score.
Suppose 150 respondents in a survey are asked to rate their level of satisfaction with five aspects of a service from ‘Very satisfied’ to ‘Very dissatisfied’ and values are allocated as illustrated in Figure 1.2. Total scores can now be added up. The maximum score a customer can give is 5 on each aspect, totalling 25. The minimum total is 5. These totals can then be divided by five to give an average value for each case.
Figure 1.2 A summated rating scale
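The arithmetic of a summated rating scale can be sketched in a few lines of Python. This is only an illustration: the three intermediate category labels and the example respondent are assumptions made here, since the text names only the two end points of the scale, and `summated_score` is a hypothetical helper name.

```python
# Codes 5 down to 1, as in the text. The three middle labels are
# assumed for illustration; only the end points are given in the text.
CODES = {
    "Very satisfied": 5,
    "Satisfied": 4,
    "Neither satisfied nor dissatisfied": 3,
    "Dissatisfied": 2,
    "Very dissatisfied": 1,
}


def summated_score(ratings):
    """Return (total, average) for one respondent's list of rating labels."""
    values = [CODES[label] for label in ratings]
    total = sum(values)
    return total, total / len(values)


# One invented respondent rating five aspects of a service.
respondent = [
    "Very satisfied",
    "Satisfied",
    "Satisfied",
    "Neither satisfied nor dissatisfied",
    "Dissatisfied",
]
total, average = summated_score(respondent)
print(total, average)
```

With five aspects the total always falls between 5 and 25, and dividing by five brings it back onto the original 1 to 5 range.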
A particular version of a summated rating scale to measure attitudes was developed by Likert in 1932. Likert scales are based on getting respondents to indicate their degree of agreement or disagreement with a series of statements about the object or focus of the attitude. Usually, these are on five-point ratings from ‘Strongly agree’, through ‘Agree’, ‘Neither agree nor disagree’, ‘Disagree’ to ‘Strongly disagree’. Likert’s main concern was with single dimensionality, that is, making sure that all the items would measure the ‘same’ thing. Accordingly, he recommended a series of steps:
1 A large list of attitude statements, both positive and negative, concerning the object of the attitude is generated, usually based on the results of qualitative research.
2 The response categories are given codes, typically 5 for ‘Strongly agree’ down to 1 for ‘Strongly disagree’ (these may need to be reversed for negative statements). The assigned codes are then treated as numerical values.
3 The list is tested on a screening sample of 100–200 respondents representative of the larger group to be studied and a total is derived for each respondent by adding up the values.
4 Statements that do not discriminate (i.e. everybody gives the same or similar answers), or that do not correlate with the total, are discarded. This is a procedure Likert called ‘item analysis’ and it avoids cluttering up the final scale with items that are either irrelevant or inconsistent with the other items.
5 The remaining statements, such as the ones in Figure 1.3, are then administered to the main sample of respondents, usually as part of a wider questionnaire survey. The items in Figure 1.3 were generated by ‘converting’ the items in Figure 1.2 into a set of Likert items.
6 Totals are derived for each respondent. These totals can be used in a variety of ways that are explained in Chapter 6.
Figure 1.3 A Likert scale
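Steps 2 to 4 of this procedure can be sketched as follows. The screening data are invented purely for illustration, the cut-off of 0.3 for the item-total correlation is an assumption rather than a value from the text, and the function names are hypothetical. Each item is correlated with the total that includes it, which is the simplest reading of "correlate with the total"; other variants exist.

```python
from statistics import mean, pstdev


def reverse_code(value, points=5):
    """Reverse a code on a 1..points rating (5 becomes 1, 4 becomes 2, ...)."""
    return points + 1 - value


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 if either variable is constant."""
    mx, my = mean(xs), mean(ys)
    sx, sy = pstdev(xs), pstdev(ys)
    if sx == 0 or sy == 0:
        return 0.0
    cov = mean([(x - mx) * (y - my) for x, y in zip(xs, ys)])
    return cov / (sx * sy)


def item_analysis(responses, negative_items=(), min_r=0.3):
    """Reverse negative items, total each respondent, keep correlating items.

    responses: one list of codes (1-5) per respondent.
    min_r: assumed cut-off for the item-total correlation.
    """
    coded = [
        [reverse_code(v) if i in negative_items else v for i, v in enumerate(row)]
        for row in responses
    ]
    totals = [sum(row) for row in coded]
    kept = []
    for i in range(len(coded[0])):
        item = [row[i] for row in coded]
        if pearson(item, totals) >= min_r:
            kept.append(i)
    return coded, totals, kept


# Invented screening sample: 6 respondents, 4 statements.
# Statement 2 is negatively worded; statement 3 gets the same answer from all.
screening = [
    [5, 4, 1, 3],
    [4, 5, 2, 3],
    [2, 1, 5, 3],
    [1, 2, 4, 3],
    [3, 3, 3, 3],
    [4, 4, 2, 3],
]
coded, totals, kept = item_analysis(screening, negative_items={2})
print(totals, kept)
```

In this toy sample the statement that everybody answers identically does not discriminate, so it is the one dropped; a real screening sample would be the 100–200 respondents mentioned in step 3.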
There are a number of fairly fundamental problems with the Likert scale, and indeed all summated rating scales:
The totals for each respondent may be derived from very different combinations of response. Thus a score of 15 may be derived either by neither agreeing nor disagreeing with all the items or by strongly agreeing with some and strongly disagreeing with others. Consequently, it is often a good idea also to analyse the patterns of each response on an item-by-item basis.
The derived totals are in no sense absolute; they only show relative positions. There are no ‘units’ of agreement or disagreement, while often, as in this example, the minimum score is not 0 so that a respondent scoring 20 is not ‘twice’ as favourable as another scoring 10. All we can really say is that a score of 20 is ‘higher’ than a score of 10 or 15 or whatever.
The screening sample and subsequent item analysis are often omitted by researchers who simply generate the statements, probably derived from or based on previous tests, and go straight to the main sample. This is in many ways a pity, since leaving out scale refinement and purification will result in more ambiguous, less valid and less reliable instruments.
The process of summating the ratings is potentially imposing a number system that forces metric characteristics (see ‘values’ in the next section) onto concepts that may not inherently possess these characteristics.
Such scales assume that individuals lie along a single dimension from positive to negative.
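The first two problems can be made concrete with a small sketch. Both response patterns below are invented, and `describe` is a hypothetical helper; the point is only that the same total can hide very different answers, and that ratios of totals depend on the arbitrary choice of codes.

```python
from collections import Counter


def describe(codes):
    """Return the total and the count of each code used by one respondent."""
    return {"total": sum(codes), "pattern": dict(sorted(Counter(codes).items()))}


# Two invented respondents answering five items coded 1-5.
neutral = [3, 3, 3, 3, 3]      # neither agrees nor disagrees throughout
polarised = [5, 5, 1, 1, 3]    # strong agreement and strong disagreement

print(describe(neutral))
print(describe(polarised))

# Ratios of totals are not meaningful: recoding each item 0-4 instead of
# 1-5 lowers every five-item total by 5 and changes the ratio.
ratio_coded_1_to_5 = 20 / 10
ratio_coded_0_to_4 = (20 - 5) / (10 - 5)
print(ratio_coded_1_to_5, ratio_coded_0_to_4)
```

Both respondents total 15, which is why an item-by-item look at the responses is worthwhile alongside the totals.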
The analysis of data from summated rating scales can be quite complex, yet is seldom discussed in books on research methodology or data analysis. It involves using a range of univariate, bivariate and, sometimes, multivariate techniques, which are considered in Chapters 4–6 of this book. For a specific discussion of an example of analysing such data, see Kent (2007: 323–8).
While derived measures create a single total for each case, multidimensional models, by contrast, allow for the possibility not only that there is more than one characteristic underlying a set of observations, but also that these cannot be summed or transposed into a derived score. One possibility is to generate a profile of each dimension which is described separately in order to present a more complete picture. Ratings can be used to calculate an average across cases separately for each item, so that, for Figure 1.3, for example, there would be an average score for ‘I get through very quickly’ and another for ‘I always get the right person’, and so on. There would be no attempt to add up scores for the five items. A more common way of obtaining a profile is to use a semantic differential. These measures were developed by Osgood et al. (1957) and were designed originally to investigate the underlying structure of words, but have subsequently been adapted to measure, for example, images of organizations or the services they offer. They present characteristics as a series of opposites, which may be either bipolar, like ‘sweet’ through to ‘sour’, or monopolar, like ‘sweet’ through to ‘not sweet’. Respondents may be asked to indicate, usually on a seven-value rating, where between the two extremes their views lie, as illustrated in Figure 1.4. The ratings are then given numerical values of 1–7 and treated as if they are metric, allowing an average to be calculated separately for each item across the respondents. It is then possible, for example, to compare profiles of two or more organizations in a ‘snake’ diagram as in Figure 1.5.
Figure 1.4 Profiling: a semantic differential
Figure 1.5 A snake diagram
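A profile of this kind amounts to one average per item, calculated across respondents and never summed across items. The sketch below assumes invented ratings on a 1 to 7 scale for two hypothetical organizations; the item labels and the `profile` helper are illustrative and are not taken from Figures 1.4 or 1.5.

```python
from statistics import mean


def profile(ratings):
    """Average each item across respondents; ratings are codes 1-7.

    ratings: one dict per respondent, mapping item label to code.
    """
    items = list(ratings[0].keys())
    return {item: mean([r[item] for r in ratings]) for item in items}


# Invented ratings from two respondents for each of two organizations.
org_a = [
    {"unfriendly-friendly": 6, "slow-fast": 5, "old-fashioned-modern": 2},
    {"unfriendly-friendly": 4, "slow-fast": 7, "old-fashioned-modern": 4},
]
org_b = [
    {"unfriendly-friendly": 3, "slow-fast": 4, "old-fashioned-modern": 6},
    {"unfriendly-friendly": 5, "slow-fast": 2, "old-fashioned-modern": 6},
]

profile_a = profile(org_a)
profile_b = profile(org_b)

# Item-by-item gaps between the two profiles: the quantities that a
# snake diagram displays as the distance between the two lines.
gaps = {item: profile_a[item] - profile_b[item] for item in profile_a}
print(profile_a)
print(profile_b)
print(gaps)
```

Plotting each profile as a line joining its item averages would give the two ‘snakes’ to compare.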
An alternative to profiling is to locate each case as a single point in multidimensional space. What is known as multidimensional scaling (often referred to as MDS for short, or as perceptual mapping) refers in fact to a series of techniques that help the researcher to identify key characteristics underlying respondents’ evaluations. Such techniques attempt to deduce the underlying dimensions from a series of similarity or preference judgements of objects, products, services, organizations, and so on made by respondents. MDS is explained in more detail in Chapter 6.
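As a rough illustration of the idea, the sketch below implements classical (Torgerson) metric scaling with NumPy, which is only one member of the MDS family and not necessarily the variant discussed in Chapter 6. It assumes the respondents' judgements have already been turned into a symmetric matrix of dissimilarities between objects; the four-object matrix is invented, and `classical_mds` is a hypothetical function name.

```python
import numpy as np


def classical_mds(dissimilarities, n_dims=2):
    """Place objects in n_dims dimensions from a dissimilarity matrix."""
    d = np.asarray(dissimilarities, dtype=float)
    n = d.shape[0]
    # Double-centre the squared dissimilarities.
    centering = np.eye(n) - np.ones((n, n)) / n
    b = -0.5 * centering @ (d ** 2) @ centering
    # eigh suits symmetric matrices and returns eigenvalues in ascending order.
    eigvals, eigvecs = np.linalg.eigh(b)
    order = np.argsort(eigvals)[::-1][:n_dims]
    # Clip small negative eigenvalues caused by rounding or non-Euclidean data.
    scale = np.sqrt(np.clip(eigvals[order], 0.0, None))
    return eigvecs[:, order] * scale


# Invented dissimilarities between four objects (for example, brands).
r2 = 2 ** 0.5
judged = [
    [0.0, 1.0, r2, 1.0],
    [1.0, 0.0, 1.0, r2],
    [r2, 1.0, 0.0, 1.0],
    [1.0, r2, 1.0, 0.0],
]
coords = classical_mds(judged, n_dims=2)

# Distances between the recovered points, for comparison with the input.
recovered = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
print(np.round(coords, 3))
```

The axes of the recovered map have no built-in meaning; interpreting them as underlying dimensions is left to the researcher.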
Table 1.1 The variables used in the alcohol marketing study