Sequence-Based Prediction of Multiple Lipid Modification Sites in Proteins by Integration of PseAAC and Statistical Moments

Date
2018
Publisher
University of Management & Technology
Abstract
Lipid modification of a protein, which can be co-translational or post-translational, is known to regulate various physiological factors, such as protein-membrane interactions, protein-protein interactions, protein stabilization and enzymatic functionality. Because lipid modification sites are associated with various diseases, their timely prediction can help in diagnosing and controlling the associated fatal diseases. Here, we present a method for predicting multiple lipid modification sites that integrates PseAAC with statistical moments. The aim of this study is to propose a new and more accurate predictor for lipid modification sites, based on the 5-step rule, to make it easier for experimental scientists to obtain the desired results. A benchmark dataset is collected and used in this study: 893 positive and 1093 negative samples for NMyristoylG-PseAAC, 90 positive and 100 negative samples for SFarnesylC-PseAAC, 74 positive and 100 negative samples for SGeranylgeranylC-PseAAC, and 436 positive and 500 negative samples for SPalmitoylC-PseAAC. For the feature vector, various position- and composition-relative features are calculated along with the statistical moments. A back-propagation neural network is then trained on the feature vectors, with scaled conjugate gradient descent with adaptive learning as the optimizer. Self-consistency testing and 10-fold cross-validation are performed to evaluate the predictors in terms of accuracy (Acc), specificity (Sp), sensitivity (Sn) and Matthews correlation coefficient (MCC).
For NMyristoylG-PseAAC, self-consistency testing gives 96.93% Acc, 97.09% Sp, 96.80% Sn and 0.94 MCC, whereas 10-fold cross-validation gives 94.41% Acc, 94.06% Sp, 94.70% Sn and 0.89 MCC. For SFarnesylC-PseAAC, self-consistency testing gives 95.79% Acc, 96.67% Sp, 95.00% Sn and 0.92 MCC, whereas 10-fold cross-validation gives 93.68% Acc, 95.56% Sp, 92.00% Sn and 0.87 MCC. For SGeranylgeranylC-PseAAC, self-consistency testing gives 95.91% Acc, 95.77% Sp, 96.00% Sn and 0.92 MCC, whereas 10-fold cross-validation gives 92.98% Acc, 92.96% Sp, 93.00% Sn and 0.86 MCC. For SPalmitoylC-PseAAC, self-consistency testing gives 98.08% Acc, 98.62% Sp, 97.60% Sn and 0.96 MCC, whereas 10-fold cross-validation gives 94.66% Acc, 96.79% Sp, 92.80% Sn and 0.89 MCC. Thus, the proposed predictor can help in predicting the targeted lipid modification sites efficiently and accurately.
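The four measures reported in the abstract (Acc, Sp, Sn, MCC) can all be derived from the counts of a binary confusion matrix. The following is a minimal sketch using the standard textbook definitions, not the authors' own evaluation code; the function name `binary_metrics` and the example counts are hypothetical and chosen only for illustration.

```python
import math


def binary_metrics(tp, tn, fp, fn):
    """Compute Acc, Sp, Sn and MCC from confusion-matrix counts.

    tp/fn count positive (modified-site) samples predicted correctly/incorrectly;
    tn/fp count negative samples predicted correctly/incorrectly.
    This helper is a hypothetical illustration, not code from the study.
    """
    total = tp + tn + fp + fn
    acc = (tp + tn) / total if total else 0.0
    sn = tp / (tp + fn) if (tp + fn) else 0.0  # sensitivity: recall on positives
    sp = tn / (tn + fp) if (tn + fp) else 0.0  # specificity: recall on negatives
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # MCC is undefined when any marginal is zero; 0.0 is a common convention.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"Acc": acc, "Sp": sp, "Sn": sn, "MCC": mcc}


# Hypothetical counts, not taken from the benchmark dataset.
example = binary_metrics(tp=90, tn=95, fp=5, fn=10)
for name, value in example.items():
    print(f"{name}: {value:.4f}")
```

Under 10-fold cross-validation, one common approach is to accumulate the four counts over all held-out folds and apply the same function once; whether the study pooled counts or averaged per-fold scores is not stated in the abstract.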
Description
Dr. Yaser Daanial Khan
Keywords
SPalmitoylC-PseAAC, NMyristoylG-PseAAC, MS