Handbook of Web Surveys - Jelke Bethlehem
1.3.1 BLAISE
The historic developments with respect to surveys described in the previous section also took place in the Netherlands. In particular, the rapid developments in computer technology had a major impact on the way Statistics Netherlands collected its data. Efforts to improve the collection and processing of survey data in terms of costs, timeliness, and quality led to a powerful software system called Blaise. This system emerged in the 1980s, and it has evolved over time so that it is now also able to conduct web surveys and mixed‐mode surveys. This section gives an overview of the developments at Statistics Netherlands leading to the Internet version of Blaise.
The advance of computer technology since the late 1940s led to many improvements in the way Statistics Netherlands conducted its surveys. For example, starting in 1947, Statistics Netherlands used probability samples to replace its complete enumerations for surveys on income statistics and agriculture. Implementing sophisticated sampling techniques such as stratification and systematic sampling is much easier and less labor intensive on a computer than by hand.
Collecting and processing statistical data was a time‐consuming and expensive process. Data editing was an important component of this work. The aim of these data editing activities was to detect and correct errors in the individual records, questionnaires, or forms, thereby improving the quality of the survey results. Since statistical offices attached much importance to this aspect of the survey process, a large part of their human and computer resources was spent on it.
To obtain more insight into the effectiveness of data editing, Statistics Netherlands carried out a Data Editing Research Project in 1984. Bethlehem (1987) describes how survey data were processed. The overall process included manual inspection of paper forms, preparation of the forms for high‐speed data entry including correcting obvious errors or following up with respondents, data entry, and further correction.
The Data Editing Research Project discovered a number of problems:
Many people from different departments dealt with the information: respondents, subject‐matter specialists, data typists, and computer programmers.
Transfer of material from one person/department to another could be a source of error, misunderstanding, and delay.
Different computer systems were involved, from mainframes to minicomputers to desktop computers under MS‐DOS. Transfer of files from one system to another caused delays, and incorrect specifications and documentation could produce errors.
Not all activities were aimed at quality improvement. Time was also spent merely preparing forms for data entry rather than correcting errors.
The cycle of data entry, automatic checking, and manual correction was in many cases repeated three times or more. Due to these cycles, data processing was very time consuming.
The structure of the data (the metadata) had to be specified in nearly every step of the data editing process. Although essentially the same, the “language” of this metadata specification could be completely different for every department or computer system involved.
The conclusions of the Data Editing Research Project led to a general redesign of the survey processes of Statistics Netherlands. The idea was to improve the handling of paper questionnaire forms by integrating the data entry and data editing tasks. The traditional batch‐oriented data editing activities, in which the complete data set was processed as a whole, were replaced by a record‐oriented process in which each record (form) was completely dealt with in one session.
More about the development of the Blaise system and its underlying philosophy can be found in Bethlehem and Hofman (2006).
The new group of activities was implemented in a so‐called CADI system. CADI stands for computer‐assisted data input. The CADI system was designed for use by the workers in the subject‐matter departments. Data could be processed in two ways by this system:
Heads‐up data entry. Subject‐matter employees worked through a pile of forms with a microcomputer, processing the forms one by one. First, they entered all data on a form, and then they activated the check option to test for all kinds of errors. Detected errors were reported on the screen. Errors could be corrected by consulting forms or by contacting the suppliers of the information. After elimination of all errors, a “clean” record was written to file. If employees could not produce a clean record, they could write the record to a separate file of “dirty” records to deal with later.
Heads‐down data entry. Data typists used the CADI system to enter data beforehand without much error checking. After completion, the CADI system checked all records in a batch run and flagged the incorrect ones. Then subject‐matter specialists handled these dirty records one by one and corrected the detected errors.
To be able to introduce CADI on a wide scale in the organization, a new standard package called Blaise was developed in 1986. The basis of the system was the Blaise language, which was used to create a formal specification of the structure and contents of the questionnaire.
The first version of the Blaise system ran on networks of microcomputers under MS‐DOS. It was intended for use by the people of the subject‐matter departments; therefore, no expert computer knowledge was needed to use the Blaise system.
In the Blaise philosophy, the first step in carrying out a survey was to design a questionnaire in the Blaise language. Such a specification of the questionnaire contains more information than a traditional paper questionnaire. It described not only the questions, possible answers, and conditions on the route through the questionnaire but also relationships between answers that had to be checked.
Figure 1.6 contains an example of a simple paper questionnaire. The questionnaire contains one route instruction: persons without a job are instructed to skip the questions about the type of job and income.
Figure 1.7 contains the specification of this questionnaire in the Blaise system. The first part of the questionnaire specification is the Fields section. It contains the definition of all questions that can be asked. A question consists of an identifying name, the text of the question as presented to the respondents, and a specification of valid answers. For example, the question about age has the name Age, the text of the question is “What is your age?” and the answer must be a number between 0 and 99.
Figure 1.6 A simple paper questionnaire
Figure 1.7 A simple Blaise questionnaire specification
The question JobDes requires a text not exceeding 20 characters. Income is a closed question. There are three possible answer options. Each option has a name (for example, Less20) and a text for the respondent (for example, “Less than 20,000”).
The second part of the Blaise specification is the Rules section. Here, the order of the questions is specified and the conditions under which they are asked. According to the rules section in Figure 1.7, every respondent must answer the questions SeqNum, Age, Sex, MarStat, and Job in this order. Only persons with a job (Job = Yes) have to answer the questions JobDes and Income.
The rules section can also contain checks on the answers of the questions. Figure 1.7 contains such a check. If people are younger than 15 years (Age < 15), then their marital status can only be not married (MarStat = NotMar). The check also contains texts that are used to display the error message on the screen (If respondent is younger than 15 then he/she is too young to be married!).
The rules section may also contain computations. Such computations could be necessary in complex routing instructions or checks or to derive new variables.
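Taken together, the fields and rules described above can be sketched in a Blaise‐like specification. This is an illustrative reconstruction based on the textual description of Figure 1.7, not the figure itself; the data model name, the SeqNum range, and the texts of the two higher income options are assumptions, and exact keywords vary between Blaise versions.

```
DATAMODEL Example            { name assumed for illustration }
FIELDS
  SeqNum  "What is the sequence number of the form?": 1..9999  { range assumed }
  Age     "What is your age?": 0..99
  Sex     "Are you male or female?": (Male, Female)
  MarStat "What is your marital status?": (Married, NotMar "Not married")
  Job     "Do you have a job?": (Yes, No)
  JobDes  "What kind of a job do you have?": STRING[20]
  Income  "What is your income?":
          (Less20  "Less than 20,000",
           Btw2040 "Between 20,000 and 40,000",   { option texts assumed }
           More40  "More than 40,000")
RULES
  SeqNum Age Sex MarStat Job
  IF Job = Yes THEN
    JobDes Income
  ENDIF
  IF Age < 15 THEN
    MarStat = NotMar
      "If respondent is younger than 15 then he/she is too young to be married!"
  ENDIF
ENDMODEL
```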
The first version of Blaise used the questionnaire specification to generate a CADI program. Figure 1.8 shows what the computer screen of this MS‐DOS program looked like for the Blaise questionnaire in Figure 1.7.
Figure 1.8 A Blaise CADI program
Since this program was used by subject‐matter specialists, only question names appear on the screen in Figure 1.8. Additional information could be displayed through special keys. Note that the input fields for the questions Age and MarStat contain error counters. These error indicators appeared because the answers to the questions Age (2) and MarStat (Married) did not pass the check.
After Blaise had been in use for a while, it was realized that such a system could be made much more powerful. The questionnaire specification in the Blaise system contained all knowledge about the questionnaire and the data needed for survey processing. Therefore, Blaise should also be capable of handling computer‐assisted interviewing (CAI).
Implementing CAI means that the paper questionnaire is replaced by a computer program containing the questions to be asked. The computer takes control of the interviewing process. It performs two important activities:
Route control. The computer program determines which question is to be asked next and displays that question on the screen. Such a decision may depend on the answers to previous questions. As a result, route errors are no longer possible.
Error checking. The computer program checks the answers as data are entered. Range checks are carried out immediately, as well as consistency checks after entry of all relevant answers. If an error is detected, the program produces an error message, and data must be corrected.
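These two activities can be mimicked in a few lines of code. The sketch below is hypothetical Python, not Blaise output; it only illustrates the mechanism: the question route is recomputed from the answers given so far, and range and consistency checks run as answers come in. The question names are borrowed from the Figure 1.7 example.

```python
# Hypothetical sketch of CAI route control and error checking
# (illustrative Python, not the Blaise implementation).

def route(answers):
    """Route control: yield the name of the next question to ask.

    Because the generator reads `answers` lazily, the route can depend
    on answers given earlier in the same interview.
    """
    yield "Age"
    yield "MarStat"
    yield "Job"
    if answers.get("Job") == "Yes":   # only people with a job get these
        yield "JobDes"
        yield "Income"

# Range checks, carried out immediately after an answer is entered.
RANGE_CHECKS = {"Age": lambda v: 0 <= v <= 99}

def consistency_errors(answers):
    """Consistency checks involving several answers."""
    errors = []
    if answers.get("Age", 99) < 15 and answers.get("MarStat") == "Married":
        errors.append("Younger than 15: too young to be married!")
    return errors

def interview(responses):
    """Conduct the interview with scripted responses; return answers and errors."""
    answers = {}
    for question in route(answers):
        value = responses[question]
        check = RANGE_CHECKS.get(question)
        if check is not None and not check(value):
            raise ValueError(f"{question}: answer out of range")
        answers[question] = value
    return answers, consistency_errors(answers)
```

With `Job = "No"` the interview never reaches JobDes or Income, so a route error simply cannot occur; the consistency check fires only for the inconsistent age/marital-status combination.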
Use of computer‐assisted data collection has three major advantages. First, it simplifies the work of the interviewer (for example, no more route control). Second, it improves the quality of the collected data. Third, data are entered into the computer during the interview, resulting in a complete and clean record.
Figure 1.9 A Blaise CAPI program
Version 2 of Blaise was completed in 1988. It implemented CAPI. This is a form of face‐to‐face interviewing in which interviewers use a laptop computer to conduct the interview.
Figure 1.9 shows an example of a screen of a CAPI program generated by Blaise. The screen was divided into two parts. The upper part contained the current question to be answered (What kind of a job do you have?). After an answer had been entered, this question was replaced by the next question on the route.
Displaying just one question at a time gave the interviewers only limited feedback on where they were in the questionnaire. Therefore, the lower part of the screen displayed (in a very compact way) the current page of the questionnaire.
Statistics Netherlands started full‐scale use of CAPI in regular surveys in 1987. The first CAPI survey was the Labor Force Survey. Each month, about 400 interviewers equipped with laptops visited 12,000 addresses. After a day of interviewing, the laptop was connected to a telephone modem. The data were transmitted to the office at night. In return, new addresses were sent to the interviewers. The next morning the laptop was prepared for a new day of interviewing.
CATI was introduced in 1990 on desktop computers. Interviewers called respondents from a central unit (call center) and conducted interviews by telephone. The interviewing program for CATI was the same as that for CAPI. An important new tool for CATI was a call scheduling system. This system took care of properly handling busy numbers (try again shortly), no answers (try again later), appointments, etc.
By the very early 1990s, nearly all household surveys of Statistics Netherlands had become CAPI or CATI surveys. Surveys using paper forms had almost become extinct. Table 1.3 lists all major and regular household surveys at that time together with their mode of interviewing.
Table 1.3 Household surveys carried out by Statistics Netherlands in the early 1990s
| Survey | Mode | Interviews per year |
| --- | --- | --- |
| Survey on Quality of Life | CAPI | 7,500 |
| Health Survey | CAPI | 6,200 |
| Day Recreation Survey | CAPI | 36,000 |
| Crime Victimisation Survey | CAPI | 8,000 |
| Labour Force Survey | CAPI | 150,000 |
| Car Use Panel | CATI | 8,500 |
| Consumer Sentiments Survey | CATI | 24,000 |
| Social‐Economic Panel | CATI | 5,500 |
| School Career Survey | CATI | 4,500 |
| Mobility Survey | CATI/CADI | 20,000 |
| Budget Survey | CADI | 2,000 |
In the middle of the 1990s, the MS‐DOS operating system on microcomputers was replaced by Windows. This marked the start of the use of graphical user interfaces. Early versions of the Internet browser Internet Explorer were included in this operating system.
Blaise 4, released in 1998, was the first production version of Blaise for Windows. As more and more people and companies were connected to the Internet, web surveys became a popular mode of data collection among researchers. The main reasons for this popularity were the high response speed, the possibility of providing feedback to respondents about the meaning of questions and possible errors, and the freedom for respondents to choose their own moment to fill in the questionnaire.
The graphical user interface offered many more possibilities for screen layout. Figure 1.10 gives an example of a screen of the Blaise 4 CAPI program.
Since respondents are familiar with browsers from all their other activities on the Internet, there was no need to explain the graphical user interface.
The possibility to conduct web surveys was included in version 4.6 of Blaise, released in 2003. The respondent completes the questionnaire online, allowing continuous interaction between the respondent's computer and the software on the Internet server.
The Internet questionnaire is divided into pages. Each page may contain one or more questions. After the respondent has answered all questions on a page, the answers are submitted to the Internet server. The answers are checked, and a new page is returned to the respondent. The contents of this page may depend on the answers to previous questions.
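This page‐by‐page round trip can be sketched as a tiny server‐side handler. The code below is a hypothetical Python illustration, not Blaise server code; the page contents and the validation rule are assumptions borrowed from the earlier example questionnaire.

```python
# Hypothetical sketch of the page-by-page web survey round trip
# (illustrative Python, not Blaise server code).

def next_page(answers):
    """Choose the next page of questions from the answers collected so far."""
    if "Job" not in answers:
        return ["Age", "Sex", "MarStat", "Job"]   # first page
    if answers["Job"] == "Yes" and "Income" not in answers:
        return ["JobDes", "Income"]               # only for respondents with a job
    return None                                   # questionnaire complete

def submit(answers, page_data):
    """One round trip: check a submitted page, then return the next page.

    Returns (updated answers, error message or None, next page or None).
    """
    if not 0 <= page_data.get("Age", 0) <= 99:
        return answers, "Age must be between 0 and 99", next_page(answers)
    answers = {**answers, **page_data}
    return answers, None, next_page(answers)
```

A respondent without a job is finished after one round trip; a respondent with a job is sent the follow‐up page, whose contents thus depend on an answer from the previous page.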
Figures 1.11 and 1.12 show an example of the same page of a web survey in Blaise 5. In this case, the page contains only one question. The first page is displayed when using a tablet, and the second when using a smartphone.
Figure 1.10 The screen of a CAPI program in Blaise 4
Figure 1.11 The screen of a Blaise 5 web survey on a tablet
Figure 1.12 The screen of a Blaise 5 web survey on a smartphone
The Blaise 5 system implements a number of source code features (Languages, Modes, Roles, and SpecialAnswers) that specifically address the challenges listed above. It also implements a cross‐platform layout designer, templates, and cross‐platform settings that handle presentation and operability issues. Finally, Blaise 5 allows an institute to combine these features in whatever ways suit its survey program and population.