Framework for prediction of oncogenomic progression aiding personalized treatment of carcinoma
Date
2023
Publisher
UMT, Lahore
Abstract
Genes are made of DNA, and each gene has a unique sequence. A mutation is a permanent alteration of the nucleotide sequence of DNA that arises during recombination or replication, and some mutations cause cancer. Cancer is a disease in which cells in a particular part of the body proliferate and replicate uncontrollably. Most earlier research uses imaging to detect cancer after symptoms have begun to manifest, which is late detection. Numerous lives could be saved if cancer were discovered in its preliminary stages. This study proposes an Ensemble Learning (EL) model built from three deep learning models, Long Short-Term Memory (LSTM), Gated Recurrent Units (GRU), and Bi-directional LSTM (BLSTM), to detect mutations in gene sequences and thereby identify cancer progression at an early stage. The proposed model is applied to breast, thyroid, lower-grade glioma, sarcoma, and gastric cancer. The numbers of driver genes considered for breast, thyroid, lower-grade glioma, sarcoma, and gastric cancer are 99, 40, 38, 8, and 61, respectively. Different feature extraction algorithms are applied to these gene sequences. The learning approaches are validated and tested using three testing techniques: a self-consistency test, an independent set test, and a 10-fold cross-validation test. Model performance is then evaluated using accuracy, sensitivity, specificity, the Matthews correlation coefficient, the ROC curve, and the decision boundary. The highest accuracy, 99%, is obtained with BLSTM for the identification of breast adenocarcinoma, thyroid adenocarcinoma, lower-grade glioma, sarcoma, and gastric cancer, as shown in Tables XII, XIII, XIV, XV and XVI. None of the previous studies uses an ensemble learning approach for the identification of any type of cancer. The present study is the first to focus on this and provides state-of-the-art results.
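The evaluation measures named above can be illustrated with a minimal sketch. It assumes binary labels (1 for a mutated sequence, 0 for a normal one) and uses the standard confusion-matrix definitions of accuracy, sensitivity, specificity, and the Matthews correlation coefficient. The function name `binary_metrics` and the sample labels are hypothetical and are not taken from the study.

```python
import math


def binary_metrics(y_true, y_pred):
    """Compute accuracy, sensitivity, specificity and MCC for 0/1 labels.

    Assumption: 1 marks the positive (mutated) class, 0 the negative class.
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("label lists must be non-empty and of equal length")

    # Count the four confusion-matrix cells.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    accuracy = (tp + tn) / (tp + tn + fp + fn)
    # Sensitivity: fraction of true positives that were recovered.
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    # Specificity: fraction of true negatives that were recovered.
    specificity = tn / (tn + fp) if (tn + fp) else 0.0

    # MCC is defined as 0 here when any marginal count is zero.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0

    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "mcc": mcc,
    }


# Hypothetical labels, for illustration only.
example_true = [1, 1, 1, 1, 0, 0, 0, 0]
example_pred = [1, 1, 1, 0, 0, 0, 0, 1]
example_scores = binary_metrics(example_true, example_pred)
```

Unlike accuracy, the Matthews correlation coefficient uses all four confusion-matrix cells, which makes it informative when the mutated and normal classes are imbalanced.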
The ensemble learning approach achieves its highest accuracy of 96% for breast cancer, 93% for thyroid cancer, 97% for lower-grade glioma, 80% for sarcoma, and 96% for gastric cancer. This is the highest accuracy reported to date for the identification of these types of cancer. The results are compared with those of previous studies in Table XVII.
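The abstract does not state how the three base models are combined. One common option is soft voting, in which the per-sample probabilities from each model are averaged and then thresholded. The sketch below is an assumption for illustration, not the study's method. The function name `soft_vote`, the 0.5 threshold, and the sample probabilities are all hypothetical.

```python
def soft_vote(model_probs, threshold=0.5):
    """Average per-sample positive-class probabilities across models.

    model_probs: list of lists, one inner list per base model
                 (for example LSTM, GRU and BLSTM outputs).
    Returns (averaged_probabilities, predicted_labels).
    Assumption: soft voting with equal weights; the source does not
    specify the combination rule.
    """
    if not model_probs:
        raise ValueError("at least one model output is required")
    n_samples = len(model_probs[0])
    if any(len(p) != n_samples for p in model_probs):
        raise ValueError("all models must score the same number of samples")

    # zip(*...) groups the scores of every model for the same sample.
    averaged = [sum(col) / len(model_probs) for col in zip(*model_probs)]
    labels = [1 if a >= threshold else 0 for a in averaged]
    return averaged, labels


# Hypothetical probabilities from three base models for three sequences.
lstm_probs = [0.9, 0.2, 0.6]
gru_probs = [0.8, 0.4, 0.3]
blstm_probs = [0.7, 0.1, 0.5]
avg_probs, ensemble_labels = soft_vote([lstm_probs, gru_probs, blstm_probs])
```

Hard (majority) voting over predicted labels and a trained meta-classifier (stacking) are alternative combination rules. Which of these the study uses cannot be determined from the abstract.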