Accuracy of Identical Subsequences Based Protein Secondary Structure Prediction

Faruk Berat Akcesme, Muhamed Adilovic, Mehmet Can

Abstract


Chou and Fasman developed the first empirical method for predicting the secondary structure of proteins from their amino acid sequences. The more sophisticated GOR method followed. Although these methods became very popular among biologists, their accuracy was only slightly better than random. ‘Second generation’ methods such as PHD, SAM-T98, and PSIPRED, which use information about sequence conservation, brought a significant improvement, with prediction accuracy above 70%. More recently, F. B. Akcesme developed a local similarity based method that reaches an accuracy above 90% in secondary structure prediction of any new protein. In this article we examine the possibility of predicting the secondary structure of proteins from sequence similarity. To this end, every protein in the PDB dataset is searched for identical subsequences in the other, larger proteins of the same dataset. Around 17% of the proteins in the PDB dataset have identical subsequences in other, larger proteins of the dataset. When the secondary structures of these proteins are assigned from the secondary structures of the corresponding identical parts in the larger proteins, the average prediction accuracy is 90.39%. We therefore conclude that an unknown protein has a 17% chance of having an identical subsequence in a larger protein in the Protein Data Bank (PDB), and that its secondary structure may then be predicted with around 90% accuracy by this method.
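The lookup and scoring described above can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes the simplest reading of the method, in which the whole query sequence occurs verbatim inside a larger protein, and it scores the result as the percentage of residues whose assigned state matches the observed one. The toy database, the function names `predict_from_identical_subsequence` and `per_residue_accuracy`, and the H/E/C state strings are all hypothetical and chosen only for illustration.

```python
from typing import Dict, Optional, Tuple

# Hypothetical toy database: id -> (amino acid sequence, secondary structure).
# H = helix, E = strand, C = coil. These entries are invented, not PDB data.
DATABASE: Dict[str, Tuple[str, str]] = {
    "toyA": ("MKTAYIAKQRQISFVKSH", "CCHHHHHHHHHCCEEEEC"),
    "toyB": ("GSSGSSG", "CCCCCCC"),
}


def predict_from_identical_subsequence(
    query: str, database: Dict[str, Tuple[str, str]]
) -> Optional[str]:
    """Copy the secondary structure of the first larger protein that
    contains the query as an exact, contiguous subsequence."""
    for sequence, structure in database.values():
        if len(sequence) <= len(query):
            continue  # only strictly larger proteins are considered
        start = sequence.find(query)
        if start != -1:
            return structure[start:start + len(query)]
    return None  # no identical subsequence found, so no prediction


def per_residue_accuracy(predicted: str, observed: str) -> float:
    """Percentage of positions where predicted and observed states agree."""
    if not observed or len(predicted) != len(observed):
        raise ValueError("strings must be non-empty and of equal length")
    matches = sum(p == o for p, o in zip(predicted, observed))
    return 100.0 * matches / len(observed)


if __name__ == "__main__":
    query = "AYIAKQRQ"      # hypothetical short protein
    observed = "HHHHHHCC"   # hypothetical observed structure of the query
    predicted = predict_from_identical_subsequence(query, DATABASE)
    if predicted is not None:
        print(predicted, per_residue_accuracy(predicted, observed))
```

A query with no exact match returns `None`, which corresponds to the proteins for which this approach gives no prediction.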

Keywords


Protein Secondary Structure Prediction; PDB; Sequence similarity


DOI: http://dx.doi.org/10.21533/scjournal.v6i1.134


Copyright (c) 2017 Faruk Berat Akcesme, Muhamed Adilovic, Mehmet Can

ISSN 2233-1859

Digital Object Identifier (DOI): 10.21533/scjournal

Creative Commons License
This work is licensed under a Creative Commons Attribution 4.0 International License