Predictive model for lung cancer

Date
2024
Publisher
UMT, Lahore
Abstract
Lung cancer is one of the biggest health challenges of the modern era, affecting millions of people around the world. It is also the second most common disease in Pakistan. Effective treatment, which may involve radiation, chemotherapy, and surgery, depends on early detection, and prevention depends on identifying the risk factors and severity stage of the disease. Machine learning algorithms provide useful knowledge in a variety of fields by using patterns in data to predict future outcomes and support decisions about them. In this study, we use machine learning algorithms to analyze the risk factors of lung cancer as well as its severity. We obtained the dataset from an international hospital; it contains records for nearly five hundred healthy people and one thousand patients suffering from lung cancer. The dataset includes male and female patients at different stages of cancer. Using the random forest algorithm and neural networks, which are among the finest machine learning algorithms, we identify the risk factors of lung cancer. The results show that coughing of blood, obesity, and passive smoking are the most severe risk factors according to the random forest, with weights of 22%, 15%, and 12% respectively. The neural network results, in contrast, show that alcohol use has a load of 37%, while coughing of blood and air pollution have loads of 36% and 30% respectively. Using the same two methods, we also detect the severity of the disease among patients with high accuracy. Our model trained with random forest achieves 98.6% accuracy, 98.2% precision, 99.0% recall, and an F1 score of 0.987. The neural network model achieves 97.3% accuracy, 96.7% precision, 98.0% recall, and an F1 score of 0.972. Random forest achieves higher accuracy than the neural network because neural networks need larger datasets to achieve optimal results. Both models produce strong detection results and can be used to support better future decisions in healthcare.
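The F1 scores quoted in the abstract can be related to the quoted precision and recall through the standard definition of F1 as their harmonic mean. The following is a minimal sketch of that arithmetic, not the authors' evaluation code: the function name `f1_score` and the `reported` dictionary are illustrative, and the inputs are the rounded percentages from the abstract, so the recomputed values are expected to match the reported ones only approximately.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall, both given as fractions in [0, 1]."""
    if precision + recall == 0:
        return 0.0  # convention: F1 is 0 when both inputs are 0
    return 2 * precision * recall / (precision + recall)


# Figures as reported in the abstract, converted from percentages to fractions.
# These are already rounded, so small differences from the reported F1 are expected.
reported = {
    "random forest": {"precision": 0.982, "recall": 0.990, "f1": 0.987},
    "neural network": {"precision": 0.967, "recall": 0.980, "f1": 0.972},
}

for name, m in reported.items():
    recomputed = f1_score(m["precision"], m["recall"])
    print(f"{name}: recomputed F1 = {recomputed:.4f}, reported F1 = {m['f1']:.3f}")
```

Under these rounded inputs, the recomputed scores land within about 0.002 of the reported ones for both models, which is the size of gap that rounding precision and recall to one decimal place can produce.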