On the Accuracy of the 16S-rRNA Gene Conserved Regions

Osman Gursoy, Mehmet Can

Abstract


The study of microbial communities by sequencing the 16S rRNA gene with high-throughput sequencing technology has been a significant advance for the discipline. However, the short length of the resulting reads limits the taxonomic classification of bacteria and archaea. These short reads are amplified from DNA using primers. Although several researchers claim to have designed the best universal primers, no primer has been demonstrated to be truly universal. This suggests that the conserved regions of the 16S rRNA gene are not conserved enough. The aim of this study is to evaluate the degree of conservation of the conserved regions that separate the hypervariable regions of the 16S rRNA gene. Data from the Greengenes, SILVA, and RDP databases are used. Primers reported as matching each conserved region were assembled into fifteen contigs by Martinez-Porchas et al. (2017). Using the degenerate bases in the primers, these contigs are expanded to cover every possible combination of degenerate bases. The Greengenes database reports 198,510 non-redundant 16S rRNA genes; the corresponding numbers are 1,488,662 for SILVA and 1,350,270 for RDP. To analyze the level of conservation of a contig, a gene is selected from a database and the longest common subsequence between each of the 15 contigs and that gene is found. The length of the longest common subsequence is then divided by the length of the contig to give the percentage of conservation of that contig in that gene. This is repeated for each contig over each entire database. The averages reveal that the contigs are not as conserved as expected: 72% in Greengenes, 71% in SILVA, and 57% in RDP.
It is concluded that the conserved regions of the 16S rRNA gene exhibit considerable variation, which must be taken into account when these regions are used as the basis for primer design.
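
The scoring procedure described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the standard IUPAC ambiguity codes for degenerate bases, a DNA alphabet (no U), the classical dynamic-programming definition of the longest common subsequence, and that a degenerate contig is scored by its best-matching expansion. The function names `expand_degenerate`, `lcs_length`, and `conservation` are hypothetical, and the sequences in the usage note are toy data.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes (DNA alphabet assumed).
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def expand_degenerate(contig):
    """Return every concrete sequence encoded by a degenerate contig."""
    choices = [IUPAC[base] for base in contig.upper()]
    return ["".join(combo) for combo in product(*choices)]


def lcs_length(a, b):
    """Length of the longest common subsequence of a and b.

    Row-by-row dynamic programming: O(len(a) * len(b)) time,
    O(len(b)) memory.
    """
    prev = [0] * (len(b) + 1)
    for ch_a in a:
        curr = [0]
        for j, ch_b in enumerate(b, start=1):
            if ch_a == ch_b:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def conservation(contig, gene):
    """LCS length divided by contig length.

    Assumption: a degenerate contig is scored by the best of its
    expansions; the abstract does not say how expansions are combined.
    """
    best = max(lcs_length(v, gene) for v in expand_degenerate(contig))
    return best / len(contig)
```

For example, `expand_degenerate("AYG")` yields `["ACG", "ATG"]`, and `conservation("ACGT", "AGT")` gives 0.75 because three of the four contig bases appear in order in the gene. Averaging this ratio over every gene in a database would give the per-database figures reported above. The number of expansions grows exponentially with the number of degenerate positions, so a contig with many of them may need a different strategy.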

Keywords


Longest common subsequence; Biodiversity; Conserved regions 16S; Primer design


DOI: http://dx.doi.org/10.21533/scjournal.v8i1.169


Copyright (c) 2019 Osman Gursoy, Mehmet Can

ISSN 2233-1859

Digital Object Identifier DOI: 10.21533/scjournal

Creative Commons License
This work is licensed under a Creative Commons Attribution 4.0 International License