Managing Data Quality - Tim King
Data as a raw material
When using data, many individuals assume that they can use all the data ‘as is’. This assumption is flawed, as illustrated by the way artisans making furniture vary their approach according to the raw materials they are using:
For a raw material such as newly manufactured metal or plastic, the product will be highly consistent, with conformance or test certificates proving the quality of the material. This consistency means that all the material can be used, with the main challenge being how to minimise wastage.
For a raw material of seasoned wood, some pieces are likely to have knots in them, be slightly warped and/or include areas of woodworm or rot. In this case, the artisan assesses each piece of wood to determine the best way to use it in the furniture, perhaps rejecting pieces that are too rotten or warped, or choosing to use the knottier pieces for the back of the furniture, where they are less visible or critical.
When undertaking data exploitation, data are the raw material, but a raw material that can vary from being like metal through to being like wood. Before undertaking any analysis using data, it is important to consider whether the data are like wood, in which case you need to understand their quality to know which data should perhaps be ignored, which may need cleansing and which seem reliable. The aggregate assessment of the quality of the input data will help inform how you describe the confidence levels of the results derived from them.
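The triage described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a method taken from the book: the field names (`asset_id`, `install_year`), the plausible year range and the three labels are all hypothetical, and real rules would come from the organisation's own data requirements.

```python
from collections import Counter

# Hypothetical labels mirroring the text: data that seem reliable,
# data that may need cleansing, and data that should perhaps be ignored.
LABELS = ("reliable", "cleanse", "ignore")


def triage(record, min_year=1900, max_year=2100):
    """Classify one record, much as an artisan assesses one piece of wood.

    The rules below are illustrative assumptions only.
    """
    # Without an identifier the record cannot be tied to anything: ignore it.
    if not record.get("asset_id"):
        return "ignore"

    year = record.get("install_year")

    # A missing year might be recoverable from another source.
    if year is None:
        return "cleanse"

    # A year stored as text (e.g. " 1998 ") may be repairable if it is numeric.
    if isinstance(year, str):
        return "cleanse" if year.strip().isdigit() else "ignore"

    # A numeric year inside the plausible range is treated as reliable.
    if isinstance(year, int) and min_year <= year <= max_year:
        return "reliable"

    return "ignore"


def summarise(records):
    """Return the share of records under each label (the aggregate assessment)."""
    counts = Counter(triage(r) for r in records)
    total = sum(counts.values())
    if total == 0:
        return {label: 0.0 for label in LABELS}
    return {label: counts[label] / total for label in LABELS}


# Made-up sample records, for demonstration only.
sample = [
    {"asset_id": "A1", "install_year": 1998},
    {"asset_id": "A2", "install_year": " 2004 "},
    {"asset_id": "A3", "install_year": None},
    {"asset_id": "", "install_year": 2010},
]
print(summarise(sample))
```

The shares returned by `summarise` are one simple way to express the aggregate assessment: the lower the reliable share, the more cautiously any confidence level attached to the analysis should be worded.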
Ideally, organisations will improve the sourcing of their data in the longer term, and look to implement the quality control and certification that, in this analogy, make the data behave like the metals and plastics that enable consistent manufacturing processes.
The data machine: expectations vs reality
Acquiring, storing, managing and exploiting data within an organisation involves many activities and processes. These activities could be thought of as a machine powering the organisation. If you had to visualise the ‘data machine’ that represents your organisation, what sort of machine is it?
In a quarry, an excavator is a large (and expensive) piece of equipment that is essential for loading rock blasted from the rock face into dumper trucks for transporting to the processing plant. These are sophisticated and powerful machines that, if maintained correctly, should load rock effectively and efficiently for many years.
Implementation of an enterprise software solution can be likened to this excavator – it can provide a powerful and, arguably, efficient means to run business processes and acquire and analyse data. Similar to a major piece of equipment in a quarry, this will be a large investment for the organisation.
If you were to visualise the data processing activities of your organisation as a machine, would it be similar to this major item and be a single, efficient and effective entity? In reality, perhaps, there could be a few manual work-arounds to overcome deficiencies in certain