Biomedical Data Mining for Information Retrieval, by a group of authors

1.3 Materials and Methods

1.3.1 Dataset


The dataset comes from the PhysioNet Challenge 2012, which consists of three sets, A, B and C [6], with 4,000 patient records each, for a total of 12,000 records. Only the 4,000 records of set A are used for the simulations in this chapter. Each record contains 41 variables: five general descriptors (age, gender, height, ICU type and initial weight) and 36 time series variables, which are described in Table 1.1.
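To make the split between general descriptors and time series variables concrete, the following minimal sketch parses a single patient record. It assumes each record is a text file with `Time,Parameter,Value` rows and that the general descriptors are recorded at time `00:00`; the parameter names (`Age`, `Gender`, `Height`, `ICUType`, `Weight`), the `parse_record` helper and the sample values are illustrative assumptions, not taken from this chapter.

```python
import csv
import io

# Assumed names of the five general descriptors (not confirmed by the chapter).
GENERAL_DESCRIPTORS = {"Age", "Gender", "Height", "ICUType", "Weight"}

def parse_record(text):
    """Split one record into general descriptors and time series.

    Assumes rows of the form Time,Parameter,Value with a header line.
    """
    general = {}
    series = {}
    for row in csv.DictReader(io.StringIO(text)):
        name, value = row["Parameter"], float(row["Value"])
        if name == "RecordID":
            continue  # identifier, not a measurement
        if name in GENERAL_DESCRIPTORS and row["Time"] == "00:00":
            general[name] = value  # recorded once, at admission
        else:
            # Keep (time, value) pairs in file order for later summarizing.
            series.setdefault(name, []).append((row["Time"], value))
    return general, series

# Invented sample record, for illustration only.
sample = """Time,Parameter,Value
00:00,RecordID,1
00:00,Age,54
00:00,Gender,0
00:00,Height,170
00:00,ICUType,4
00:00,Weight,80
00:07,HR,73
00:37,HR,77
01:37,Weight,80.5
"""

general, series = parse_record(sample)
print(general)
print(series)
```

Under these assumptions, a weight recorded at admission is treated as the initial weight, while later weight readings are kept as a time series.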

Of these 36 time series variables, only 15 are selected for mortality prediction; they are listed in Table 1.2.

For nine of these 15 variables, the first, last, highest, lowest and median values are computed and used as features. For four variables, only the first and last values are used. Set A also provides five outcome-related descriptors (SAPS score, SOFA score, length of stay, length of survival and in-hospital death). Of these, in-hospital death is taken as the target value, where 0 denotes a survivor and 1 denotes a patient who died in hospital.
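The summary features described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the helper names `summarize` and `extract_features`, the choice of `HR` and `GCS` as example variables, their assignment to the two groups and the readings themselves are all hypothetical.

```python
from statistics import median

def summarize(values, full):
    """Summarize one time-ordered list of measurements.

    full=True  -> first, last, highest, lowest and median
    full=False -> first and last only
    """
    feats = {"first": values[0], "last": values[-1]}
    if full:
        feats["highest"] = max(values)
        feats["lowest"] = min(values)
        feats["median"] = median(values)
    return feats

def extract_features(series, full_vars, first_last_vars):
    """Flatten per-variable summaries into one feature dict for a patient."""
    features = {}
    for name in list(full_vars) + list(first_last_vars):
        values = series.get(name)
        if not values:
            continue  # never measured; missing-value handling is out of scope here
        for stat, value in summarize(values, full=name in full_vars).items():
            features[f"{name}_{stat}"] = value
    return features

# Hypothetical grouping and invented readings, for illustration only.
full_vars = ["HR"]
first_last_vars = ["GCS"]
series = {"HR": [88, 95, 79, 102, 91], "GCS": [15, 14]}

features = extract_features(series, full_vars, first_last_vars)
print(features)
```

Each variable in the first group contributes five features and each variable in the second group contributes two, so the feature vector length follows directly from how the variables are grouped.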

