Innovations in Digital Research Methods

2.1.2 What is Data?

Data is information or knowledge about an individual, object or event. Data can comprise numerical values, quantities of text, sounds or images, memories or perceptions. Often the concept of data suggests information that has a structure and has been through some kind of processing.

Many new types of data have very different, and sometimes unstructured, formats: for example, tweets or documents released under a Freedom of Information (FOI) request. To develop our understanding of the changing data environment, we outline below a typology of data types. The typology treats data as knowledge, and it also treats each data item as carrying implicit or explicit metadata, that is, data about the data item, such as its origin, ownership, terms of use and coverage. There are a variety of ways to consider the nature of data, but here we combine the key issues into a single framework. We draw on work by Elliot et al. (2010) on behalf of the Office for National Statistics (ONS) in the UK, which examined the nature of public data. That work compared information that is formally in the public domain, such as public administrative records (e.g., the Electoral Register, shareholdings and professional occupation lists), with data that is informally in the public domain, such as that posted on the Internet (e.g., via Facebook and blogs). For a related discussion of what they term datafication, the process of recording and quantifying behaviour and events for analysis, see Mayer-Schönberger and Cukier (2013: 73).
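The idea that each data item carries implicit or explicit metadata can be sketched as a small record type. This is an illustrative sketch only: the class names `Metadata` and `DataItem`, the field names, the use of `None` for metadata that has not been made explicit, and the example values are all assumptions made here, not part of the typology or of Elliot et al. (2010).

```python
from dataclasses import dataclass
from typing import Any, List, Optional


@dataclass(frozen=True)
class Metadata:
    """Data about a data item (hypothetical structure).

    The four fields mirror the examples given in the text. None marks
    metadata that is implicit or unknown rather than explicitly recorded.
    """
    origin: Optional[str] = None
    ownership: Optional[str] = None
    terms_of_use: Optional[str] = None
    coverage: Optional[str] = None

    def missing_fields(self) -> List[str]:
        """Return the names of metadata fields that are not explicit."""
        return [name for name, value in vars(self).items() if value is None]


@dataclass(frozen=True)
class DataItem:
    """A piece of content together with its metadata (hypothetical)."""
    content: Any
    metadata: Metadata


# Illustrative item: only the origin has been recorded explicitly.
tweet = DataItem(
    content="Example tweet text",
    metadata=Metadata(origin="posted on a social media platform"),
)

print(tweet.metadata.missing_fields())
```

Keeping the metadata in its own type makes it easy to see which aspects of an item, such as its ownership or terms of use, are still only implicit.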

We develop our approach here to focus on what can be termed the ‘metadata of origin’, rather than the actual type of data or whether the data is qualitative or quantitative. The issue of origin is interdependent with issues of data ownership, quality, access and use. A key aspect of this is the law and codes of practice around the recognition of what is ‘personal’ data. Under the UK Statistics and Registration Service Act (2007) (SRSA), personal information is defined as ‘information which relates to and identifies a particular person (including a body corporate)’. Information identifies a particular person if the identity of that person – ‘(a) is specified in the information, (b) can be deduced from the information, or (c) can be deduced from the information taken together with any other published information’. The disclosure of personal information by public bodies, such as the ONS, is a criminal offence. For further information see the UK Anonymization Network and also a recent report by the Information Commissioner (ICO, 2012).
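Condition (c) of the definition quoted above, deduction from the information ‘taken together with any other published information’, can be illustrated with a toy linkage example. This is a minimal sketch under stated assumptions: the function `link_to_published`, the record fields and every value are hypothetical, and exact matching on shared fields is only one simple way such a deduction might happen. It is not a legal test of whether information is personal.

```python
from typing import Dict, List, Optional

Record = Dict[str, str]


def link_to_published(
    record: Record, published: List[Record], keys: List[str]
) -> Optional[str]:
    """Return a name if exactly one published record agrees with
    `record` on every field in `keys`; otherwise return None."""
    matches = [
        other["name"]
        for other in published
        if all(k in record and other.get(k) == record[k] for k in keys)
    ]
    return matches[0] if len(matches) == 1 else None


# Hypothetical record with no name specified in it, so condition (a) fails.
survey_record = {"postcode": "AB1 2CD", "occupation": "dentist"}

# Hypothetical published information, e.g. a professional occupation list.
published_list = [
    {"name": "A. Example", "postcode": "AB1 2CD", "occupation": "dentist"},
    {"name": "B. Sample", "postcode": "AB1 2CD", "occupation": "teacher"},
    {"name": "C. Person", "postcode": "ZZ9 9ZZ", "occupation": "dentist"},
]

# Taken together, the two fields single out one published entry.
deduced = link_to_published(
    survey_record, published_list, ["postcode", "occupation"]
)

# Postcode alone matches two entries, so no unique identity follows.
ambiguous = link_to_published(survey_record, published_list, ["postcode"])
```

In this toy case the record names nobody, yet combining it with the published list points to a single entry, which is the situation condition (c) describes.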

In terms of the metadata of origin approach, we propose an eight-point typology based on the type of generation process involved. Given the complexity and changing nature of the data environment, it can be argued that mapping the data generation process is the only stable way of understanding the variety of data and of developing good practice around the use of different data types.
