Introduction to Python Programming for Business and Social Science Applications - Frederick Kaefer

General Social Survey (GSS) Data Set


The General Social Survey has over 5,000 variables collected over a period of more than 40 years. You can explore the data online using a data explorer or download the complete data sets (http://www.gss.norc.org/Get-The-Data). Table 1.4 presents a subset of fields from the GSS and their meaning as described in the GSS Codebook (Smith, Davern, Freese, & Hout, 1972–2016), which is available at http://gss.norc.org/get-documentation.

Table 1.4

Table 1.5

The sample data shown in Table 1.5 is in ascending order of the ID value for each record. Unlike the Trip_ID in the Taxi Trips data set, the ID value is not unique in the GSS data, as we can see from the duplication of both ID 10 and ID 16 in Table 1.5. In the GSS data, it is the combination of the YEAR and ID fields that is unique (a set of several fields that together uniquely identify a record in a data set is called a composite identifier or composite key). For example, the respondent with ID 10 in YEAR 1990 is not the same as the respondent with ID 10 in YEAR 1991. Another important difference is that the data in the GSS all appear to be numeric; however, the values are not all quantitative. For example, the values for HAPPY are coded responses to a survey question, where 1 = very happy, 2 = pretty happy, and 3 = not too happy. A further point is that the values for REALINC are not actually continuous (even though they might appear to be) but are discrete. These values correspond to the midpoints of income ranges specified in a survey, and the values prior to 1986 have been recoded in six-digit numbers and converted to 1986 dollars (Ligon, 1994; Smith et al., 1972–2016).
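The ideas of a composite key and of coded (non-quantitative) values can be illustrated with a minimal sketch in Python. The records below are invented purely for illustration and are not actual GSS values; the helper name `duplicated_keys` and the dictionary `HAPPY_LABELS` are likewise hypothetical. Only the field names (YEAR, ID, HAPPY) and the meaning of the HAPPY codes come from the text above.

```python
from collections import Counter

# Hypothetical records for illustration only; these are NOT actual GSS values.
records = [
    {"YEAR": 1990, "ID": 10, "HAPPY": 1},
    {"YEAR": 1991, "ID": 10, "HAPPY": 2},
    {"YEAR": 1990, "ID": 16, "HAPPY": 3},
    {"YEAR": 1991, "ID": 16, "HAPPY": 2},
    {"YEAR": 1990, "ID": 12, "HAPPY": 1},
]

# Meaning of the HAPPY codes, as described in the text.
HAPPY_LABELS = {1: "very happy", 2: "pretty happy", 3: "not too happy"}


def duplicated_keys(rows, fields):
    """Return the key values (as tuples) that occur in more than one row."""
    counts = Counter(tuple(row[f] for f in fields) for row in rows)
    return sorted(key for key, n in counts.items() if n > 1)


# ID alone repeats across survey years, so it cannot identify a record.
id_duplicates = duplicated_keys(records, ["ID"])

# The combination of YEAR and ID has no repeats: it is a composite key.
composite_duplicates = duplicated_keys(records, ["YEAR", "ID"])

print("Duplicated ID values:", id_duplicates)
print("Duplicated (YEAR, ID) pairs:", composite_duplicates)

# HAPPY is a code, not a quantity: translate it to a label before counting.
happy_counts = Counter(HAPPY_LABELS[row["HAPPY"]] for row in records)
for label, count in happy_counts.items():
    print(label, count)
```

Because HAPPY is a category code, counting how often each label occurs is meaningful, whereas averaging the numeric codes would not be.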

Lessons learned: In this section, we learned about the Chicago Taxi Trips and General Social Survey data sets, which we will use throughout the text.
