Computational Intelligence and Healthcare Informatics
2.2.3 Availability of Datasets
Deep learning models for identifying pathologies in chest radiographs are mostly trained on publicly available CXR datasets. The best known are the Indiana dataset [15], KIT dataset [54], MC dataset [29], Japanese Society of Radiological Technology (JSRT) dataset [59], ChestX-ray14 dataset [19], NIH Tuberculosis Chest X-ray [17], and Belarus Tuberculosis [6]. Each of these datasets has major limitations, some of which are addressed in a survey published in August 2018 [48]. ChestX-ray14 is among the most widely used of them. Its original release, ChestX-ray8, contains 108,948 X-ray images from 32,717 patients, and the expanded ChestX-ray14 release contains 112,120 images from 30,805 patients. Each image carries one or more diagnostic labels extracted from the radiology reports by natural language processing. Several recent studies, such as Wang et al. [70], Yao et al. [71], Rajpurkar et al. [49], and Guan et al. [4], have used ChestX-ray14; their models are trained and tested on this dataset and its annotations for 14 types of chest pathology. The number of images available for each pathology in ChestX-ray14 is given in Table 2.1 [7].
The number of available images is clearly imbalanced across the 14 chest pathologies. This imbalance is one of the factors affecting the performance of different deep models. Before existing models are analyzed, the 14 chest pathologies are described below and illustrated in Figure 2.2.
1 Atelectasis: A disorder in which the lung has no room to expand normally because its air sacs malfunction.
2 Cardiomegaly: A disorder of the heart in which the heart becomes enlarged because of stress or an underlying medical condition.
3 Consolidation: Occurs when the small airways of the lungs fill with fluid such as pus, water, or blood instead of air.
4 Edema: Occurs when excess fluid accumulates in the lungs.
5 Effusion: A disorder in which excess fluid fills the space between the chest wall and the lungs.
6 Emphysema: Occurs when the alveoli, the air sacs of the lungs, are damaged or weakened.
7 Fibrosis: A condition in which lung tissue becomes thickened or stiff, making it difficult for the lungs to work normally.
8 Hernia: A protrusion of thoracic contents beyond their normal location in the thorax, known as a thoracic hernia.
9 Infiltration: A trail of denser substance, such as pus, blood, or protein, within the parenchyma of the lungs, known as a pulmonary infiltration.
10 Mass: A tumor that grows in the mediastinum, the region of the chest that separates the lungs.
11 Nodule: A small mass of tissue in the lung.
12 Pleural Thickening: A condition in which exposure of the lungs to asbestos causes the lung tissue to scar.
13 Pneumonia: An infection of the air sacs in one or both lungs.
14 Pneumothorax: A condition in which air leaks from the lung into the space between the lung and the chest wall.
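Because an image may carry one or more of these 14 labels, the annotation for each image is naturally represented as a 14-element multi-hot vector. The following is a minimal sketch of that encoding. The function name `encode_labels`, the separator `|`, and the example label strings are assumptions made for illustration; the label format actually used by a given dataset release should be checked before adapting it.

```python
# The 14 pathology names, in the order used in the list above.
PATHOLOGIES = [
    "Atelectasis", "Cardiomegaly", "Consolidation", "Edema", "Effusion",
    "Emphysema", "Fibrosis", "Hernia", "Infiltration", "Mass", "Nodule",
    "Pleural Thickening", "Pneumonia", "Pneumothorax",
]
INDEX = {name: i for i, name in enumerate(PATHOLOGIES)}


def encode_labels(label_string, sep="|"):
    """Return a 14-element multi-hot list for a separator-joined label string.

    Hypothetical helper: the separator and the spelling of the label names
    are assumptions, not taken from the dataset documentation.
    """
    vector = [0] * len(PATHOLOGIES)
    for name in label_string.split(sep):
        name = name.strip()
        if name in INDEX:
            vector[INDEX[name]] = 1
        # Names outside the 14 classes (e.g. a "normal" marker) are ignored,
        # so an image without findings maps to the all-zero vector.
    return vector


if __name__ == "__main__":
    print(encode_labels("Cardiomegaly|Effusion"))
    print(encode_labels("No Finding"))
```

An all-zero vector then corresponds to an image with none of the 14 pathologies, which keeps the normal class out of the label space rather than treating it as a fifteenth label.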
Table 2.1 Details of ChestX-ray14 dataset.
Type of pathology | No. of images with label | Type of pathology | No. of images with label |
---|---|---|---|
Atelectasis | 11,559 | Consolidation | 4,667 |
Cardiomegaly | 2,776 | Edema | 2,303 |
Effusion | 13,317 | Emphysema | 2,516 |
Infiltration | 19,894 | Fibrosis | 1,686 |
Mass | 5,782 | Pleural thickening | 3,385 |
Nodule | 6,331 | Hernia | 227 |
Pneumonia | 1,431 | Normal chest x-ray | 60,412 |
Pneumothorax | 5,302 | | |
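The imbalance in Table 2.1 can be made concrete by comparing each pathology with the most frequent one. The sketch below computes inverse-frequency weights from the counts in the table. This is one simple, commonly used heuristic for weighting rare classes during training; it is shown only as an illustration and is not claimed to be the weighting used in any of the cited studies. The function name `inverse_frequency_weights` is hypothetical.

```python
# Image counts per pathology, copied from Table 2.1.
COUNTS = {
    "Atelectasis": 11559, "Cardiomegaly": 2776, "Effusion": 13317,
    "Infiltration": 19894, "Mass": 5782, "Nodule": 6331,
    "Pneumonia": 1431, "Pneumothorax": 5302, "Consolidation": 4667,
    "Edema": 2303, "Emphysema": 2516, "Fibrosis": 1686,
    "Pleural thickening": 3385, "Hernia": 227,
}


def inverse_frequency_weights(counts):
    """Weight each class by (largest class count) / (its own count).

    Illustrative heuristic: the most frequent class gets weight 1.0 and
    rarer classes get proportionally larger weights.
    """
    largest = max(counts.values())
    return {name: largest / n for name, n in counts.items()}


weights = inverse_frequency_weights(COUNTS)

if __name__ == "__main__":
    # Print classes from most to least frequent with their weights.
    for name, w in sorted(weights.items(), key=lambda kv: kv[1]):
        print(f"{name:20s} {COUNTS[name]:6d} {w:7.2f}")
```

With these counts, Infiltration receives weight 1.0, while Hernia, with only 227 images, receives a weight of more than 87, which shows how strongly the rarest class is under-represented.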
Figure 2.2 Types of chest pathologies.
Many researchers have addressed the detection of cardiomegaly because the disorder is spread over a large spatial region and is therefore comparatively easy to detect.