Longest Common Subsequences in Bacteria Taxonomic Classification

Mehmet Can, Osman Gursoy

Abstract


In the 1980s, Carl Woese made a groundbreaking contribution to microbiology by using rRNA genes for phylogenetic classification. He used this approach not only to explore microbial diversity but also as a method for bacterial annotation. Today, rRNA-based analysis remains a central method in microbiology. Many researchers have followed this track and, using several new generations of artificial neural networks, obtained high accuracies on the datasets available at the time. Since then, the number of known bacteria has increased enormously. In this article we use the Longest Common Subsequence similarity measure to classify the 16S rRNA gene sequences of 1,820,414 bacteria in SILVA, 3,196,038 bacteria in RDP, and 198,509 bacteria in Greengenes. The last two taxonomies have six taxonomic levels (phylum, class, order, family, genus, and species), while SILVA has two additional levels, subclass and suborder, but lacks the species level. The majority of classifications (98%) were of high accuracy (98%).
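
As a rough illustration of the similarity measure named above, the following Python sketch computes the length of the longest common subsequence of two sequences with the standard dynamic-programming recurrence and uses it for a nearest-reference label lookup. The abstract does not specify how the score is normalized or how labels are assigned, so the normalization by the longer sequence length, the `classify` helper, and the toy sequences and labels are all assumptions made for illustration, not the authors' procedure.

```python
from typing import Iterable, Optional, Tuple


def lcs_length(a: str, b: str) -> int:
    """Length of the longest common subsequence of a and b.

    Classic dynamic programming, O(len(a) * len(b)) time,
    keeping only one previous row to save memory.
    """
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string as the row to limit memory
    prev = [0] * (len(b) + 1)
    for ch in a:
        curr = [0]
        for j, bj in enumerate(b, start=1):
            if ch == bj:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def lcs_similarity(a: str, b: str) -> float:
    """LCS length divided by the longer length (assumed normalization)."""
    if not a or not b:
        return 0.0
    return lcs_length(a, b) / max(len(a), len(b))


def classify(
    query: str, references: Iterable[Tuple[str, str]]
) -> Tuple[Optional[str], float]:
    """Return the label of the most similar reference (hypothetical helper)."""
    best_label, best_score = None, -1.0
    for seq, label in references:
        score = lcs_similarity(query, seq)
        if score > best_score:
            best_label, best_score = label, score
    return best_label, best_score


# Toy data only: real 16S rRNA sequences are roughly 1,500 bases long.
toy_references = [("ACGTACGT", "Firmicutes"), ("TTTTGGGG", "Proteobacteria")]
label, score = classify("ACGTACGA", toy_references)
```

At database scale a quadratic comparison against every reference would be expensive, so a practical pipeline would likely need some form of candidate filtering first; that step is outside what the abstract describes.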

Keywords


16S ribosomal RNA; gene segments; diagnosis; bacteria annotation


DOI: http://dx.doi.org/10.21533/scjournal.v7i2.166


Copyright (c) 2018 Mehmet Can, Osman Gursoy

ISSN 2233-1859

Digital Object Identifier DOI: 10.21533/scjournal

Creative Commons License
This work is licensed under a Creative Commons Attribution 4.0 International License