Administrative Records for Survey Methodology
2.3 Confidentiality Protection in Linked Data: Examples
To illustrate the application of new disclosure avoidance techniques, we describe three examples of linked data and the means by which confidentiality protection is applied to each. First, the Health and Retirement Study (HRS) links extensive survey information to respondents’ administrative data from the Social Security Administration (SSA) and the Centers for Medicare and Medicaid Services (CMS). To protect confidentiality in the linked HRS–SSA data, its data custodians use a combination of restrictive licensing agreements, physical security, and restrictions on model output. Our second example is the Census Bureau’s Survey of Income and Program Participation (SIPP), which has been linked to earnings data from the Internal Revenue Service (IRS) and benefit data from the SSA. The Census Bureau makes the linked data available to researchers as the SIPP Synthetic Beta File (SSB). Researchers can directly access synthetic data via a restricted server and, once their analysis is ready, request output based on the original harmonized confidential data via a validation server. Finally, the Longitudinal Employer-Household Dynamics Program (LEHD) at the Census Bureau links data provided by 51 state administrations to data from federal agencies and from surveys and censuses of businesses, households, and people conducted by the Census Bureau. Tabular summaries of LEHD data are published with greater detail than most business and demographic data. The LEHD microdata are accessible in restricted enclaves, and there are also restrictions on the output researchers can release. There are many other linked data sources. These three are each innovative in some fashion and allow us to illustrate the issues faced when devising disclosure avoidance methods for linked data.