Managing Data Quality - Tim King
The data asset
that a particular segment of the data is only partially complete. This knowledge should inform the analysis process, but it is also a trigger for the next step.
Improve data: Greater awareness of the quality of existing data or changes to business requirements can be the trigger to gather new data or improve existing data.
Synthesis: The activity of data exploitation can create new, synthesised data that warrant storage for future utilisation. For instance, this could be performance statistics for each day, which are stored to enable time-series analysis. Forms of synthesis can include inference and extrapolation to allow missing data to be determined; for example, estimating the age of a main water supply pipe based on the age of the properties on a particular street.
Archive: Some data are no longer required for immediate access but need to be retained for legal or regulatory compliance purposes. Various offline storage methods can be used to keep these data, accepting that there could be some delay between requesting access to the data and their becoming available.
Delete: Ultimately, some data will have no further purpose or benefit, so can be considered for permanent deletion. An example could be the full audit trail of all transactions on a system, which will no longer be required many years after the transactions occurred.
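The inference described under Synthesis can be illustrated with a minimal sketch. It assumes, purely for illustration, that a pipe was laid at about the time most properties on the street were built, so it takes the lower median of the known build years. The function name, the record layout and the sample street are hypothetical and do not come from the text.

```python
from statistics import median_low

def infer_pipe_install_year(property_build_years):
    """Estimate a missing pipe installation year from nearby property ages.

    Assumption (not from the text): the pipe was laid around the time most
    properties on the street were built, so the lower median is used.
    """
    known = [year for year in property_build_years if year is not None]
    if not known:
        return None  # nothing to infer from; leave the gap visible
    # Mark the value as inferred so later analysis can tell it from a record.
    return {"install_year": median_low(known), "source": "inferred"}

# Hypothetical street: three pre-war houses, one unknown, one later infill.
street = [1934, 1936, None, 1935, 1988]
estimate = infer_pipe_install_year(street)
print(estimate)
```

Storing the "inferred" marker alongside the value keeps the synthesised figure distinguishable from measured data, which matters when the quality of the data set is assessed later.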
There are many types of document that can exist in an organisation, with varying levels of importance and differing requirements for retention. These can include:
organisational policies, strategies and standards requiring formal approval and version control;
contracts and legal documents requiring retention until all possible consequences have been exhausted;
design, construction and maintenance documents requiring retention until the physical asset no longer exists;
personnel records requiring retention in line with legal and regulatory stipulations;
project and team working documents that require less rigorous control and management but are useful for day-to-day activity within the organisation.
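The event-based retention requirements in the list above could be recorded as a simple lookup, sketched below. The document type names and event labels are hypothetical, and the rule that a document type with no stated retention event is never disposed of automatically is an assumption made for the sketch, not a statement from the text.

```python
# Hypothetical mapping from document type to the event that ends retention.
# Types without a stated end event (for example policies) are left out.
RETENTION_EVENT = {
    "contract": "consequences_exhausted",
    "design_document": "asset_no_longer_exists",
    "personnel_record": "statutory_period_elapsed",
}

def can_dispose(doc_type, events_occurred):
    """Return True only if the type has a retention event and it has occurred."""
    event = RETENTION_EVENT.get(doc_type)
    # Assumption: unknown types are kept until someone decides otherwise.
    return event is not None and event in events_occurred

print(can_dispose("design_document", {"asset_no_longer_exists"}))
print(can_dispose("policy", {"asset_no_longer_exists"}))
```

Keeping the rule as data rather than code means a records manager can review or amend the retention events without touching the logic.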
The life cycle for documents (which can be referred to as semi-structured data) differs from the data life cycle in a number of areas, particularly for documents stored in a formal electronic document management system (EDMS). It consists of eight stages, as shown in Figure 1.3.
These life cycle stages are as follows:
Create: When a text-based document is created and stored in a document management system, a range of metadata will also be stored about the document; for example, the author, creation date and security classification.