Protein Secondary Structure Prediction Based on Physicochemical Features and PSSM by KNN

Faruk Berat Akcesme

Abstract


In this paper, we propose a protein secondary structure prediction method based on the k-nearest neighbor (KNN) technique, using position-specific scoring matrix (PSSM) profiles, a propensity matrix of amino acids in three conformations (H, E, and C), and three physicochemical features: hydrophobicity, net charge, and side-chain mass. First, the optimal value of k for the KNN classifier is determined. Then, for each amino acid of a query protein, the Euclidean distance between its 26-dimensional feature vector and the feature vectors of the amino acids of all other proteins is computed. The conformations of the seven nearest amino acids are pooled, and the conformation with the majority of the pooled votes (H, E, or C) is assigned to the amino acid of the query protein. Finally, a filter is applied to refine the KNN predictions. After filtering, prediction accuracy reaches up to 90% for some proteins. This suggests that combining PSSM profiles, the propensity matrix, and physicochemical features can improve prediction performance.
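The voting step described in the abstract can be illustrated with a minimal sketch. It assumes each residue is already encoded as a numeric feature vector (26-dimensional in the paper; the sketch works for any length) and that a labeled reference set is available. The function names `knn_predict` and `smooth_isolated` are hypothetical, and the smoothing rule is only a stand-in: the abstract does not specify which filter the author used.

```python
import math
from collections import Counter


def knn_predict(query, train_vectors, train_labels, k=7):
    """Assign H, E, or C to one residue by majority vote of its k nearest neighbors."""
    # Euclidean distance from the query residue to every reference residue.
    scored = [
        (math.dist(query, vector), label)
        for vector, label in zip(train_vectors, train_labels)
    ]
    scored.sort(key=lambda pair: pair[0])
    # Pool the conformations of the k closest residues and take the majority.
    # On a tie, Counter keeps insertion order, so the label seen nearest wins.
    votes = Counter(label for _, label in scored[:k])
    return votes.most_common(1)[0][0]


def smooth_isolated(states):
    """Hypothetical filter: relabel a single residue flanked by two identical states.

    This is an assumed post-processing rule, not the filter from the paper.
    """
    out = list(states)
    for i in range(1, len(states) - 1):
        # Compare against the unfiltered string so corrections do not cascade.
        if states[i - 1] == states[i + 1] != states[i]:
            out[i] = states[i - 1]
    return "".join(out)
```

In use, `knn_predict` would be called once per residue of the query protein, and the resulting string of H, E, and C labels would then be passed through the filter. k = 7 mirrors the seven nearest amino acids mentioned above.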

Keywords


Protein secondary structure prediction; PSSM profiles; physicochemical features; PDB25 dataset

DOI: http://dx.doi.org/10.21533/scjournal.v4i1.89


Copyright (c) 2015 Faruk Berat Akcesme

ISSN 2233-1859

Digital Object Identifier (DOI): 10.21533/scjournal

Creative Commons License
This work is licensed under a Creative Commons Attribution 4.0 International License