Handbook of Web Surveys - Jelke Bethlehem
1.2 Theory
1.2.1 THE EVERLASTING DEMAND FOR STATISTICAL INFORMATION
The history of data collection for statistics goes back thousands of years. As far back as the Babylonian era, a census of agriculture was carried out. This already took place shortly after the invention of the art of writing. The same thing happened in China. This empire counted its people to determine the revenues and the military strength of its provinces. There are also accounts of statistical overviews compiled by Egyptian rulers long before Christ. Rome regularly took censuses of people and of property. The collected data were used to establish the political status of citizens and to assess their military and tax obligations to the state.
Censuses were rare in the Middle Ages. The most famous one was the census of England taken by order of William the Conqueror, King of England. The compilation of his Domesday Book started in the year 1086 AD. The book records a wealth of information about each manor and each village in the country. Information was collected about more than 13,000 places. More than 10,000 facts were recorded for each county.
To collect all this data, the country was divided into a number of regions. In each region, a group of commissioners was appointed from among the greater lords. Each county within a region was dealt with separately. Sessions were organized in each county town. The commissioners summoned all those required to appear before them. They had prepared a standard list of questions. For example, there were questions about the owner of the manor; the number of free men and slaves; the area of woodland, pasture, and meadow; the number of mills and fishponds; the total value; and the prospects of getting more profit. The Domesday Book still exists, and many county data files are available on CD‐ROM and the Internet.
Another interesting example from the history of official statistics is the Inca Empire, which existed between 1000 and 1500 AD. Each Inca tribe had its own statistician, called the quipucamayoc. This man kept records of the number of people, the number of houses, the number of llamas, the number of marriages, and the number of young men that could be recruited for the army. All these facts were recorded on quipus, a system of knots in colored ropes. A decimal system was used for this. At regular intervals, couriers brought the quipus to Cusco, the capital of the kingdom, where all regional statistics were compiled into national statistics. The system of quipucamayocs and quipus worked remarkably well. The system vanished with the fall of the empire.
An early census also took place in Canada in 1666. Jean Talon, the intendant of New France, ordered an official census of the colony to measure the increase in population since the founding of Quebec in 1608. Name, age, sex, marital status, and occupation were recorded for every person. It turned out that 3,215 people lived in New France.
The first censuses in Europe took place in the Nordic countries. The first census in Sweden–Finland took place in 1749. Not everyone welcomed the idea of a census. Religious people in particular believed that people should not be counted. They referred to the census ordered by King David in biblical times, which was interrupted by a terrible plague and never completed. Others said that a population count would reveal the strengths and weaknesses of a country to foreign enemies. Nevertheless, censuses took place in more and more countries. The first census in Denmark–Norway took place in 1769. In 1795, at the time of the Batavian Republic under Napoleon's influence, the first count of the population of the Netherlands took place. The new centralized administration wanted to gather quantitative information to devise a new system of electoral constituencies (see Den Dulk and Van Maarseveen, 1990).
In the period until the late 1880s, there were some applications of partial investigations. These were statistical inquiries in which only part of a complete human population was interviewed. The way the persons were selected from the population was generally unclear and undocumented.
In the second half of the 19th century, so‐called monograph studies became popular. They were based on Quetelet's idea of the average man. According to Quetelet, many physical and moral data have a natural variability. This variability can be described by a normal distribution around a fixed, true value. He assumed the existence of something called the true value. Quetelet introduced the concept of the average man (“l'homme moyen”) as a person whose characteristics were all equal to the true value (see Quetelet, 2010, 2012).
The period of the 18th and 19th centuries is also called the era of the Industrial Revolution. It led to important changes in society, science, and technology. Among many other things, urbanization started from industrialization and democratization. All these developments created new statistical demands. The foundations for many principles of modern statistics were laid. Several central statistical bureaus, statistical societies, conferences, and journals were established soon after this period. The first ideas about survey sampling emerged in the world of official statistics. If a starting year must be chosen, 1895 would be a good candidate. In this year Anders Kiaer, the founder and first director of Statistics Norway, started a fundamental discussion about the use of sampling methods. This discussion led to the development, acceptance, and application of sampling as a scientific method.
Anders Kiaer (1838–1919) was the founder and advocate of the survey method that is now widely applied in official statistics and social research. With the first publication of his ideas in 1895, he started the process that ended in the development of modern survey sampling theory and methods. This process is described in more detail in Bethlehem (2009).
There had been earlier examples of scientific investigations based on samples, but they lacked proper scientific foundations. The first known attempt to draw conclusions about a population using only information about part of it was made by the English merchant John Graunt (1662). He estimated the size of the population of London. Graunt surveyed families in a sample of parishes where the registers were well kept. He found that on average there were three burials per year in 11 families. Assuming this ratio to be more or less constant for all parishes and knowing the total number of burials per year in London to be about 13,000, he concluded that the total number of families was approximately 48,000. Putting the average family size at 8, he estimated the population of London to be 384,000. Since this approach lacked a proper scientific foundation, John Graunt could not say how accurate his estimates were.
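Graunt's arithmetic can be retraced in a few lines of code. The sketch below uses only the figures quoted above. The function and variable names are illustrative assumptions, not Graunt's own notation, and rounding the number of families to the nearest thousand is an assumption made here to reproduce the reported 48,000.

```python
# Sketch of Graunt's ratio estimate, using the figures quoted in the text.
# All names are illustrative; rounding to the nearest thousand is an assumption.

def estimate_families(burials_in_sample, families_in_sample, total_burials):
    """Scale the total number of burials by the families-per-burial ratio."""
    families_per_burial = families_in_sample / burials_in_sample
    return total_burials * families_per_burial


# Figures from the text: 3 burials per year in 11 families,
# about 13,000 burials per year in London, average family size 8.
families = estimate_families(burials_in_sample=3,
                             families_in_sample=11,
                             total_burials=13_000)

# Graunt worked with "approximately 48,000" families.
rounded_families = round(families, -3)

average_family_size = 8
population = rounded_families * average_family_size

print(f"Estimated families (unrounded): {families:,.0f}")
print(f"Estimated families (rounded):   {rounded_families:,.0f}")
print(f"Estimated population:           {population:,.0f}")
```

The calculation produces a single number and nothing else. It gives no margin of error, which is exactly the weakness noted above: without a probability model for how the parishes were selected, the accuracy of the estimate cannot be quantified.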
More than a century later, the French mathematician Pierre‐Simon Laplace realized that it was important to have some indication of the accuracy of his estimate of the French population. Laplace (1812) implemented an approach that was more or less similar to that of John Graunt. He selected 30 departments distributed over the area of France in such a way that all types of climate were represented. Moreover, he selected departments in which accurate population records were kept. Using the central limit theorem, Laplace proved that his estimator had a normal distribution. Unfortunately, he disregarded the fact that the sample was selected purposively, and not at random. This made application of the central limit theorem at least doubtful.
In 1895, Anders Kiaer (1895, 1997), the founder and first director of Statistics Norway, proposed his representative method. It was a partial inquiry in which a large number of persons were questioned. Persons were selected in such a way that a “miniature” of the population was obtained. Anders Kiaer stressed the importance of representativity. He argued that if a sample was representative with respect to variables for which the population distribution was known, it would also be representative with respect to the other survey variables. Example 1.1 describes Kiaer's experiment with the representative method.