Bioinformatics and Medical Applications (group of authors), page 19
1.3.1 Description of Dataset
The data source is the Kaggle dataset for cardiovascular diseases, which contains 70,000 patient records. The attributes comprise objective information, subjective information, and results of medical examinations. Table 1.2 lists the 12 attributes.
A heatmap is a graphical representation of data in which values are shown as colors. Here it is used to give a clear view of the relationships between the features. The correlation coefficient is a statistical measure of the strength of the association between two variables, with values ranging from −1.0 to 1.0. A computed value greater than 1.0 or less than −1.0 indicates an error in the correlation calculation. Figure 1.1 shows the heatmap for the input attributes of the dataset.
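The correlation values behind such a heatmap can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes the Pearson coefficient (the text does not say which coefficient was used), the helper names `pearson` and `correlation_matrix` are hypothetical, and the sample values are made up for the example rather than taken from the dataset.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of deviations, and sums of squared deviations.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant column")
    r = cov / math.sqrt(var_x * var_y)
    # Clamp to absorb floating-point rounding only; mathematically r is
    # already within [-1.0, 1.0].
    return max(-1.0, min(1.0, r))


def correlation_matrix(columns):
    """Pairwise correlations for a dict mapping column name -> list of values."""
    names = list(columns)
    return {a: {b: pearson(columns[a], columns[b]) for b in names} for a in names}


# Made-up illustrative values, NOT real records from the dataset.
sample = {
    "ap_hi": [110, 120, 130, 140, 150],
    "ap_lo": [70, 80, 85, 90, 100],
    "active": [1, 1, 0, 1, 0],
}
matrix = correlation_matrix(sample)
for name, row in matrix.items():
    print(name, {other: round(value, 2) for other, value in row.items()})
```

Each cell of the resulting matrix corresponds to one colored cell of the heatmap. In practice, one would more likely compute the matrix with `pandas.DataFrame.corr` and draw it with `seaborn.heatmap`; the pure-Python version is used here only to make the calculation explicit.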
Table 1.2 Dataset attributes.
Feature name | Variable name | Value type |
Age | Age | No. of days |
Height | Height | Centimeters |
Weight | Weight | Kilograms |
Gender | Gender | Categories |
Systolic blood pressure | Ap_hi | Integer |
Diastolic blood pressure | Ap_lo | Integer |
Cholesterol | Cholesterol | 1: Standard; 2: Above standard; 3: Well above standard. |
Glucose | Glu | 1: Standard; 2: Above standard; 3: Well above standard. |
Smoking | Smoke | Binary |
Alcohol intake | Alco | Binary |
Physical activity | Active | Binary |
Presence or absence of CVDs | cardio | Binary |
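As an illustration of how one record with the attributes of Table 1.2 might be represented and sanity-checked, consider the sketch below. The class name `PatientRecord`, the lowercase field names, the 0/1 coding of the binary attributes, and the 365.25 days-per-year conversion are all assumptions made for this example; the table does not specify them, and the gender coding is left unchecked for that reason.

```python
from dataclasses import dataclass, replace
from typing import List

DAYS_PER_YEAR = 365.25  # assumption: average year length used for conversion


@dataclass
class PatientRecord:
    age: int          # in days, as in Table 1.2
    height: int       # centimeters
    weight: float     # kilograms
    gender: int       # categorical code; the coding is not given in Table 1.2
    ap_hi: int        # systolic blood pressure
    ap_lo: int        # diastolic blood pressure
    cholesterol: int  # 1: standard, 2: above standard, 3: well above standard
    glu: int          # same three levels as cholesterol
    smoke: int        # binary, assumed coded 0/1
    alco: int         # binary, assumed coded 0/1
    active: int       # binary, assumed coded 0/1
    cardio: int       # target: presence (1) or absence (0) of CVD, assumed coding

    @property
    def age_years(self) -> float:
        """Age converted from days to years."""
        return self.age / DAYS_PER_YEAR

    def problems(self) -> List[str]:
        """Return a list of human-readable validation problems (empty if none)."""
        found = []
        for name in ("age", "height", "weight"):
            if getattr(self, name) <= 0:
                found.append(f"{name} must be positive")
        for name in ("cholesterol", "glu"):
            if getattr(self, name) not in (1, 2, 3):
                found.append(f"{name} must be 1, 2, or 3")
        for name in ("smoke", "alco", "active", "cardio"):
            if getattr(self, name) not in (0, 1):
                found.append(f"{name} must be 0 or 1")
        return found


# Made-up example record, not taken from the dataset.
record = PatientRecord(age=18250, height=168, weight=62.0, gender=1,
                       ap_hi=120, ap_lo=80, cholesterol=1, glu=1,
                       smoke=0, alco=0, active=1, cardio=0)
bad_record = replace(record, cholesterol=4)

print(round(record.age_years, 1), record.problems())
print(bad_record.problems())
```

Storing age in days keeps the original precision; the conversion to years is only needed for display or binning.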
Figure 1.1 Heatmap of input attributes.
Figures 1.2, 1.3, 1.4, and 1.5 display the distributions of some of the attributes, namely age, gender, presence of cardiovascular disease, and cholesterol level.
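The counts behind distribution plots of this kind can be tabulated as in the following sketch. The function names `distribution` and `age_histogram`, the five-year bin width, and the 365.25 days-per-year conversion are assumptions for illustration, and the input values are made up rather than drawn from the dataset.

```python
from collections import Counter


def distribution(values):
    """Map each distinct value to (count, share of total), sorted by value."""
    counts = Counter(values)
    total = sum(counts.values())
    return {value: (counts[value], counts[value] / total) for value in sorted(counts)}


def age_histogram(ages_in_days, bin_width_years=5):
    """Count ages (given in days) per bin; keys are the lower bin edges in years."""
    bins = Counter()
    for days in ages_in_days:
        years = int(days // 365.25)  # whole years, assuming 365.25 days per year
        lower_edge = years - years % bin_width_years
        bins[lower_edge] += 1
    return dict(sorted(bins.items()))


# Made-up illustrative values, NOT real records from the dataset.
cholesterol_levels = [1, 1, 2, 3, 1]
ages = [14600, 18250, 18300]

print(distribution(cholesterol_levels))
print(age_histogram(ages))
```

Categorical attributes such as gender, cholesterol level, and the CVD label map directly onto bar charts of these counts, while age, recorded in days, needs binning first.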