Cancer type driver classification accuracy using Spark ML technology

Abstract
In this paper, genes extracted from the body are analyzed to identify those that can be drivers of tumors, resulting in cancers of different types such as breast cancer. Motivated by BIGBIOCL, the Classifier with Alternative and Multiple Rule-based models (CAMUR) is the core algorithm applied here to dissect large datasets. To achieve this goal, Apache Spark and MLlib are used on a Hadoop stack in local mode. Experiments were performed with both a decision tree and a random forest. On the data used, the random forest showed better results in terms of F-measure and efficiency. To extract genes and other pertinent models, features are deleted using a modified version of the iterative algorithm proposed earlier by CAMUR. Finally, the extracted results are provided to biologists so they can analyze whether the extracted genes are related to, or can be drivers of, cancer.
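The iterative feature deletion described in the abstract can be sketched roughly as follows. This is a minimal illustration in plain Python, not the authors' Spark implementation. The single-feature threshold rule (`fit_stump`) is an assumed stand-in for the decision tree or random forest learner, and the names `f_measure`, `fit_stump`, `iterative_feature_elimination`, `min_f`, `max_iter` and the toy gene values are all hypothetical.

```python
from typing import Dict, List, Optional, Sequence, Tuple

Row = Dict[str, float]


def f_measure(y_true: Sequence[int], y_pred: Sequence[int], positive: int = 1) -> float:
    """Harmonic mean of precision and recall for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def fit_stump(rows: Sequence[Row], labels: Sequence[int],
              features: Sequence[str]) -> Tuple[float, Optional[str]]:
    """Toy learner (assumption): best rule of the form `feature > threshold`."""
    best_score, best_feature = -1.0, None
    for name in features:
        for threshold in sorted({row[name] for row in rows}):
            pred = [1 if row[name] > threshold else 0 for row in rows]
            score = f_measure(labels, pred)
            if score > best_score:
                best_score, best_feature = score, name
    return best_score, best_feature


def iterative_feature_elimination(rows: Sequence[Row], labels: Sequence[int],
                                  features: Sequence[str], min_f: float = 0.9,
                                  max_iter: int = 10) -> List[Tuple[str, float]]:
    """Fit, record the feature the model used, delete it, and refit until
    the F-measure drops below `min_f` or no features remain."""
    remaining = list(features)
    extracted: List[Tuple[str, float]] = []
    for _ in range(max_iter):
        if not remaining:
            break
        score, used = fit_stump(rows, labels, remaining)
        if used is None or score < min_f:
            break
        extracted.append((used, score))
        remaining.remove(used)  # the deletion step: drop the used feature
    return extracted


# Hypothetical expression values for three genes; label 1 = tumor, 0 = normal.
rows = [
    {"g1": 5.0, "g2": 4.0, "g3": 1.0},
    {"g1": 6.0, "g2": 5.0, "g3": 2.0},
    {"g1": 1.0, "g2": 1.0, "g3": 1.5},
    {"g1": 0.5, "g2": 2.0, "g3": 0.5},
]
labels = [1, 1, 0, 0]
extracted = iterative_feature_elimination(rows, labels, ["g1", "g2", "g3"])
print([gene for gene, _ in extracted])
```

On this toy data the loop records `g1` and `g2`, then stops because no rule on `g3` reaches the assumed threshold. In a Spark setting the stand-in learner would be replaced by an MLlib classifier and the recorded features would be those appearing in the trained model.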