Machine Learning Techniques and Analytics for Cloud Security (multiple authors)

3
Selection of Certain Cancer Mediating Genes Using a Hybrid Model Logistic Regression Supported by Principal Component Analysis (PC-LR)

Subir Hazra*, Alia Nikhat Khurshid and Akriti

Meghnad Saha Institute of Technology, Kolkata, India

Abstract

In recent times, selecting genes whose mutation is associated with certain cancers has become a promising research area. An important tool for progress in this work is the analysis of microarray gene expression data. A survey of the literature shows that various Machine Learning algorithms have been found effective in cancer classification and gene selection. The selected genes play a significant role in clinical decision support: they help in diagnosing cancer by identifying genes whose expression levels change significantly. Because microarray gene expression data contain a very large number of features, developing a gene selection algorithm with a Machine Learning approach incurs high computational complexity. Too many features can also cause overfitting and degrade the performance of the algorithm. In the present article, we develop a hybrid approach in which the number of features is first reduced using Principal Component Analysis (PCA) and a Logistic Regression model is then applied for the prediction of genes. The fitted Logistic Regression model is evaluated on test data using an accuracy score. Based on this accuracy score, the final set of candidate genes is selected, namely those whose expression levels are manifested disproportionately. The resulting sets of genes are identified as being correlated with certain cancers. The proposed method is demonstrated on two datasets, viz., colon and lung cancer. The results have been validated biologically using the NCBI database. The efficacy and robustness of the method have also been evaluated.

Keywords: Gene expression, PCA, Logistic Regression, dimensionality reduction, accuracy score, classification, F-score
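
The PCA-then-Logistic-Regression idea outlined in the abstract can be sketched roughly as follows. This is a minimal illustration, not the chapter's implementation: it assumes scikit-learn, uses a synthetic matrix in place of the colon and lung datasets, and the sample count, gene count, number of components, and the gene-ranking step (back-projecting the regression coefficients through the PCA loadings) are all assumptions made for the example.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Synthetic stand-in for a microarray matrix (hypothetical sizes):
# rows are tissue samples, columns are gene expression levels.
n_samples, n_genes = 60, 500
X = rng.normal(size=(n_samples, n_genes))
y = rng.integers(0, 2, size=n_samples)  # 0 = normal, 1 = tumour (assumed labels)
X[y == 1, :10] += 2.0  # make the first 10 genes differentially expressed

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y
)

# Reduce the gene dimension with PCA, then classify with Logistic Regression.
model = make_pipeline(
    StandardScaler(),
    PCA(n_components=10),  # assumed number of components
    LogisticRegression(max_iter=1000),
)
model.fit(X_train, y_train)

# Evaluate the fitted model on held-out test data with the accuracy score.
accuracy = accuracy_score(y_test, model.predict(X_test))
print(f"test accuracy: {accuracy:.2f}")

# Illustrative gene ranking (an assumption, not the chapter's stated criterion):
# map the coefficients from component space back to gene space.
pca = model.named_steps["pca"]
clf = model.named_steps["logisticregression"]
gene_weights = clf.coef_[0] @ pca.components_
top_genes = np.argsort(np.abs(gene_weights))[::-1][:10]
print("candidate gene indices:", top_genes)
```

Fitting the scaler and PCA inside the pipeline keeps the test samples out of the dimensionality reduction step, so the accuracy score reflects unseen data.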
