Prediction of Protein Solubility in Escherichia coli and Experimental Verification
Date
2017
Authors
Publisher
University of Management & Technology
Abstract
Soluble protein at an adequate concentration is essential for many experimental studies, including biophysical and structural work. Protein solubility can be estimated from the amino acid sequence. Obtaining soluble protein at high concentration is a major challenge: heterologously expressed proteins are often insoluble, and solubilizing them is largely a trial-and-error process with a low success rate, although very high overexpression in inclusion bodies is sometimes desirable because it yields clean protein. A new method is developed to predict the solubility of a protein on overexpression in E. coli. The method uses four classifiers: Multilayer Perceptron, Decision Tree, Random Forest, and Bayes Classifier. These classifiers were trained to predict recombinant protein solubility. The method uses many features, including the canonical variable (CV), surrounding hydrophobicity, solubility index composition, intrinsic aggregation propensity, intrinsic Z-scores for aggregation, tripeptide score, aliphatic index (AI), instability index (II), and the frequencies of occurrence of Asn (Fn), Thr (Ft), and Tyr (Fy). It is a simple and easy method for predicting recombinant protein solubility. To evaluate its validity, the method was tested on a dataset of 1500 proteins, of which 1000 are soluble and 500 are insoluble. Each classifier was trained for the prediction of 450 protein sequences. The method predicts protein solubility with an accuracy of about 95.9%. Its accuracy was also compared with that of previous methods. The results show that this method is more accurate and precise than previous work.
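As an illustration of the simpler sequence-derived features named in the abstract, the following minimal Python sketch computes Fn, Ft, Fy, and the aliphatic index from a one-letter amino acid sequence. It is not the thesis's implementation. It assumes that "frequency of occurrence" means the fraction of residues in the sequence, and it uses the standard Ikai definition of the aliphatic index; the function names and the example sequence are hypothetical.

```python
# Minimal sketch, not the thesis's code. Assumptions:
#  - "frequency of occurrence" = fraction of residues in the sequence
#  - aliphatic index follows Ikai's definition, using mole percentages:
#    AI = X(Ala) + 2.9 * X(Val) + 3.9 * (X(Ile) + X(Leu))


def residue_frequency(sequence: str, residue: str) -> float:
    """Fraction of positions in `sequence` occupied by `residue` (one-letter code)."""
    seq = sequence.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    return seq.count(residue.upper()) / len(seq)


def aliphatic_index(sequence: str) -> float:
    """Aliphatic index computed from mole percentages of Ala, Val, Ile and Leu."""
    mole_pct = {aa: 100.0 * residue_frequency(sequence, aa) for aa in "AVIL"}
    return mole_pct["A"] + 2.9 * mole_pct["V"] + 3.9 * (mole_pct["I"] + mole_pct["L"])


def sequence_features(sequence: str) -> dict:
    """Collect the four features into a dict keyed by the abstract's abbreviations."""
    return {
        "Fn": residue_frequency(sequence, "N"),  # Asn
        "Ft": residue_frequency(sequence, "T"),  # Thr
        "Fy": residue_frequency(sequence, "Y"),  # Tyr
        "AI": aliphatic_index(sequence),
    }


if __name__ == "__main__":
    # Hypothetical toy sequence, for demonstration only.
    print(sequence_features("AVILNTYG"))
```

A feature dictionary of this kind could then be turned into a numeric vector and passed to any of the four classifiers; the remaining features (for example the canonical variable or the aggregation Z-scores) need definitions that the abstract does not give.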
Description
Dr. Nouman Rasool
Keywords
Protein, Zwitterion, Soluble/insoluble protein, MS