2.2.2 Performance Measuring Parameters
There are various parameters used by researchers to evaluate the performance of their models, namely, ROC, AUC, F1-score, recall, accuracy, specificity, and sensitivity. The choice of metric varies from application to application, and sometimes a single parameter does not fully characterize the behavior of a model. In such cases, a subset of parameters is used to evaluate model performance. For testing the accuracy of classification models, accuracy, precision, recall, F1-score, sensitivity, specificity, the ROC curve, and AUC are utilized, and comparison in the existing research is performed on that basis. These parameters are discussed as follows.
1 Accuracy: It is defined as the ratio of the number of correctly classified samples to the total number of available samples. If the dataset has 1,000 images, each showing some pathology or no pathology, and the model has correctly classified 820 pathology images and 10 normal cases, then the accuracy is (820 + 10) × 100/1,000 = 83%.
2 Precision: When the dataset is imbalanced with respect to the number of images per class, accuracy is not an acceptable parameter, because a model can score well simply by favoring the majority class. For example, if images belonging to the class nodule dominate a CXR dataset, predicting the most frequent class for every image will not serve the purpose. Therefore, a class-specific metric known as precision is needed: the fraction of images predicted as a given class that truly belong to that class, i.e., TP/(TP + FP).
3 Recall: It is also a class-specific parameter. It is the fraction of images of a given class that are correctly classified, i.e., TP/(TP + FN).
4 F1-Score: For critical applications, it is often unclear whether precision or recall is the more appropriate parameter, and sometimes both metrics are equally important. The F1-score therefore combines the two as their harmonic mean, calculated as: F1 = 2 × (Precision × Recall)/(Precision + Recall).
5 Sensitivity and Specificity: These metrics are generally used for medical applications and are calculated as follows: Sensitivity = TP/(TP + FN), the true-positive rate (identical to recall), and Specificity = TN/(TN + FP), the true-negative rate. A short code sketch after this list illustrates how all of these confusion-matrix metrics are computed.
6 ROC Curve: The Receiver Operating Characteristic (ROC) curve is generally used to measure the performance of a binary classifier. It plots the true-positive rate against the false-positive rate as the decision threshold is varied. It depicts the overall performance of the model and helps in selecting a good cut-off threshold for the model.
7 AUC: Area Under the Curve (AUC) is a binary classifier's aggregated performance metric over all possible threshold values (and is therefore threshold invariant). It measures the area under the ROC curve and hence lies between 0 and 1. One way to view AUC is as the probability that the model ranks a random positive example higher than a random negative example; the second sketch after this list illustrates this interpretation.
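The following minimal Python sketch, not taken from the book, shows how the confusion-matrix metrics of items 1 to 5 are computed. The counts are hypothetical: the true-positive and true-negative values echo the 1,000-image example from item 1, while the false-positive/false-negative split is an assumption chosen only so that the totals add up.

```python
# Minimal sketch of the confusion-matrix metrics from items 1-5.
# TP/TN echo the book's 1,000-image example; FP/FN are assumed values.
tp, fp = 820, 120   # pathology images: correctly flagged / wrongly flagged
tn, fn = 10, 50     # normal images: correctly passed / wrongly flagged
total = tp + fp + tn + fn

accuracy    = (tp + tn) / total              # correct predictions over all samples
precision   = tp / (tp + fp)                 # of predicted positives, how many are real
recall      = tp / (tp + fn)                 # of real positives, how many are found
f1          = 2 * precision * recall / (precision + recall)  # harmonic mean
sensitivity = recall                         # true-positive rate, same as recall
specificity = tn / (tn + fp)                 # true-negative rate

print(f"accuracy={accuracy:.3f} precision={precision:.3f} "
      f"recall={recall:.3f} f1={f1:.3f} specificity={specificity:.3f}")
```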
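As a companion sketch (again with purely illustrative scores and labels, not from the book), the ranking interpretation of AUC from item 7 can be verified directly: AUC equals the fraction of (positive, negative) pairs in which the positive example receives the higher score, with ties counted as half.

```python
import itertools

# Hypothetical model scores and true labels (1 = pathology, 0 = normal);
# the values are purely illustrative.
scores = [0.95, 0.80, 0.62, 0.55, 0.40, 0.30, 0.10]
labels = [1,    1,    0,    1,    0,    0,    0]

pos = [s for s, y in zip(scores, labels) if y == 1]
neg = [s for s, y in zip(scores, labels) if y == 0]

# AUC as a ranking probability: the fraction of (positive, negative) pairs
# where the positive example is scored higher (ties count as half a pair).
pairs = list(itertools.product(pos, neg))
auc = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p, n in pairs) / len(pairs)
print(f"AUC = {auc:.3f}")   # 1.0 means perfect ranking, 0.5 means chance level
```

Sweeping a threshold over the same scores and recording the true-positive and false-positive rates at each step would trace out the corresponding ROC curve point by point.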