Managing Data Quality - Tim King

In contrast, data in an enterprise context will often support multiple business processes. In such circumstances, an item of data will have to comply with multiple requirements simultaneously in order to be viewed as good quality data. For instance, the moment when an asset is formally commissioned needs to be known to the nearest year for long-term planning purposes, to the nearest week for maintenance planning purposes and to the nearest day for work management activities.
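The commissioning example can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Precision` scale, the `REQUIRED_PRECISION` table and both function names are hypothetical, chosen to mirror the three processes named above. The point it shows is that a single recorded value is good quality only if it satisfies every applicable requirement at once.

```python
from enum import IntEnum
from typing import List


class Precision(IntEnum):
    """Granularity to which a date is known; a higher value is finer."""
    YEAR = 1
    WEEK = 2
    DAY = 3


# Hypothetical requirements, taken from the commissioning example in the text.
REQUIRED_PRECISION = {
    "long-term planning": Precision.YEAR,
    "maintenance planning": Precision.WEEK,
    "work management": Precision.DAY,
}


def unmet_requirements(recorded: Precision) -> List[str]:
    """Return the processes whose precision requirement the recorded value fails."""
    return [
        process
        for process, needed in REQUIRED_PRECISION.items()
        if recorded < needed
    ]


def is_fit_for_purpose(recorded: Precision) -> bool:
    """The value is fit for purpose only if no requirement is left unmet."""
    return not unmet_requirements(recorded)
```

A commissioning date known only to the nearest week would satisfy the two planning processes but fail work management, so in this sketch it would not count as good quality data for the enterprise as a whole.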

So, given that fitness for purpose is specified by a set of applicable requirements, the key consideration becomes identifying which characteristics of data are covered by those requirements.

Data characteristics

There have been various attempts to specify all the relevant quality characteristics of data but, in fact, none of them covers the complete set. Part of the problem is that different specialists describe data requirements from different perspectives.

The end user is mainly concerned with the ultimate effect of the data, so, for example, accuracy and completeness are key considerations.

The data modeller wants to know which attributes are mandatory for each entity (i.e. must contain a value in each data set) and which are optional.

The database administrator thinks about a data set as the tables and columns in the database. For each table, the administrator needs to know, for example, which columns are foreign keys and which column in which table contains the target of the foreign key.
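The data modeller's and the database administrator's perspectives can both be expressed as database constraints. The following is a minimal sketch using Python's standard `sqlite3` module. The `site` and `asset` tables, their columns and the helper names are hypothetical. Mandatory attributes become `NOT NULL` columns, the optional attribute is left nullable, and the foreign key names the table and column that contain its target.

```python
import sqlite3


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with one mandatory foreign key."""
    conn = sqlite3.connect(":memory:")
    # SQLite does not enforce foreign keys unless this pragma is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(
        """
        CREATE TABLE site (
            site_id INTEGER PRIMARY KEY,
            name    TEXT NOT NULL                 -- mandatory attribute
        );
        CREATE TABLE asset (
            asset_id        INTEGER PRIMARY KEY,
            site_id         INTEGER NOT NULL
                            REFERENCES site(site_id),  -- foreign key and its target
            commissioned_on TEXT NOT NULL,        -- mandatory attribute
            notes           TEXT                  -- optional attribute
        );
        INSERT INTO site (site_id, name) VALUES (1, 'North depot');
        """
    )
    return conn


def try_insert_asset(conn, site_id, commissioned_on, notes=None) -> bool:
    """Attempt an insert; return False if a constraint rejects the row."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO asset (site_id, commissioned_on, notes) "
                "VALUES (?, ?, ?)",
                (site_id, commissioned_on, notes),
            )
        return True
    except sqlite3.IntegrityError:
        return False
```

In this sketch a row with no commissioning date, or one that points at a site that does not exist, is rejected by the database itself rather than by application code.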

These perspectives are brought together by ISO 8000-8, which builds on fundamental computer science to create a definitive overall framework for the characteristics and requirements of data. This framework identifies the three types of data quality as being:

syntactic (i.e. the correct format for the data);

semantic (i.e. the consistent common interpretation of the data);

pragmatic (i.e. the data will be useful to intended recipients).
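To make the three types less abstract, here is a loose Python illustration of the short descriptions given above. It is not a rendering of the checks that ISO 8000-8 itself defines. The record layout, the status vocabulary and every function name are assumptions made for the example.

```python
import re
from typing import Dict, Iterable

# Syntactic: the agreed format for dates is assumed to be YYYY-MM-DD.
ISO_DATE = re.compile(r"\d{4}-\d{2}-\d{2}")

# Semantic: a hypothetical shared vocabulary that all parties read the same way.
AGREED_STATUS_CODES = {"IN_SERVICE", "DECOMMISSIONED"}


def syntactic_ok(record: Dict[str, str]) -> bool:
    """Format check only; it does not test whether the date really exists."""
    value = record.get("commissioned_on") or ""
    return ISO_DATE.fullmatch(value) is not None


def semantic_ok(record: Dict[str, str]) -> bool:
    """The status is a code with one agreed interpretation."""
    return record.get("status") in AGREED_STATUS_CODES


def pragmatic_ok(record: Dict[str, str], needed_fields: Iterable[str]) -> bool:
    """The record carries a value for every field this recipient needs."""
    return all(record.get(field) not in (None, "") for field in needed_fields)


def assess(record: Dict[str, str], needed_fields: Iterable[str]) -> Dict[str, bool]:
    """Report the three types of quality separately for one record."""
    return {
        "syntactic": syntactic_ok(record),
        "semantic": semantic_ok(record),
        "pragmatic": pragmatic_ok(record, needed_fields),
    }
```

Reporting the three results separately reflects the idea that a record can be well formed and consistently interpreted yet still not useful to a particular recipient.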

These three types can appear to be abstract, so a more popular approach is to work with data quality dimensions. Again, many different lists of such dimensions exist and none is perfect, but we find this one most useful (DAMA UK 2013):

accuracy;

completeness;

consistency;

validity;

timeliness;

uniqueness.
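Dimensions lend themselves to simple measurement. As a sketch only, two of the six (completeness and uniqueness) might be scored as shown here. The formulas are one plausible reading and not the definitions published by DAMA UK. The function names and the row layout are hypothetical.

```python
from typing import Dict, List, Optional

Row = Dict[str, Optional[str]]


def _present(rows: List[Row], field: str) -> List[str]:
    """Values of the field that are neither missing nor empty."""
    return [row[field] for row in rows if row.get(field) not in (None, "")]


def completeness(rows: List[Row], field: str) -> float:
    """Share of rows that hold a value for the field (1.0 for no rows)."""
    if not rows:
        return 1.0
    return len(_present(rows, field)) / len(rows)


def uniqueness(rows: List[Row], field: str) -> float:
    """Share of recorded values that are distinct (1.0 for no values)."""
    values = _present(rows, field)
    if not values:
        return 1.0
    return len(set(values)) / len(values)
```

Scores like these are only indicators: a score of 1.0 for completeness says every row has a value, not that the values are accurate, which is why the dimensions are assessed together.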
