Publication Title

BMC genomics [electronic resource]


Annotation errors; Comparative genomic analysis; Gene annotation; Gene rearrangement; Mammalian mitochondrial genome


BACKGROUND: Although animal mitochondrial DNA sequences are known to evolve rapidly, their gene arrangements often remain unchanged over long periods of evolutionary time. Comparisons of mitochondrial genomes can therefore yield significant insights into the evolution of both organisms and genomes. Mammalian mitochondrial genomes recently published in the GenBank database of NCBI show numerous rearrangements in various regions of the genome, from which it might be inferred that the mammalian mitochondrial genome is more dynamic than expected. Alternatively, these apparent rearrangements may be annotation errors that are misleading our interpretations. To test this possibility, we performed a comparative genomic analysis of the mammalian mitochondrial genomes available in the NCBI database.

RESULTS: Using a combination of bioinformatics methods to carefully examine the mitochondrial gene arrangements in 304 mammalian species, we determined that there are only two gene arrangements: one shared by all of the marsupials and another shared by all of the monotremes and eutherians. The two arrangements differ only in the positions of tRNA genes in the region commonly designated "WANCY" for the genes it comprises. All of the 68 other reported cases of gene rearrangement are errors. We also found numerous other errors: genes annotated as impossibly short, genomes reported as complete that are actually missing portions of the sequence, and genes that are clearly present but were not annotated in these records.
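
The kind of arrangement comparison described above can be illustrated with a minimal sketch. It is not the authors' pipeline, which the abstract does not specify. It assumes each genome is reduced to an ordered list of gene names on a circular molecule, ignores strand, and uses hypothetical function names (`canonical_rotation`, `group_by_arrangement`) and made-up toy gene orders rather than real GenBank records.

```python
from collections import defaultdict


def canonical_rotation(gene_order, anchor):
    """Rotate a circular gene order so that it starts at `anchor`.

    Two records of the same circular genome can begin at different
    positions; rotating both to a shared anchor gene makes them comparable.
    """
    if anchor not in gene_order:
        raise ValueError(f"anchor gene {anchor!r} is not annotated")
    i = gene_order.index(anchor)
    return tuple(gene_order[i:] + gene_order[:i])


def group_by_arrangement(genomes, anchor):
    """Group genome names by their anchor-normalized gene order."""
    groups = defaultdict(list)
    for name, order in genomes.items():
        groups[canonical_rotation(list(order), anchor)].append(name)
    return dict(groups)


# Toy data (illustrative only, not taken from any real record).
genomes = {
    "species_A": ["ND2", "W", "A", "N", "C", "Y", "COX1"],
    # Same circle as species_A, but the record starts elsewhere.
    "species_B": ["N", "C", "Y", "COX1", "ND2", "W", "A"],
    # tRNA genes in a different order: a distinct arrangement.
    "species_C": ["ND2", "A", "C", "W", "N", "Y", "COX1"],
}

groups = group_by_arrangement(genomes, anchor="ND2")
for order, names in groups.items():
    print(" ".join(order), "->", sorted(names))
```

In this toy case species_A and species_B fall into one group and species_C into another. A record whose arrangement is shared with no close relative would then be a candidate for manual inspection of its annotation.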

CONCLUSIONS: We judge that the application of simple bioinformatic tools in the verification of gene annotation, particularly for organelle genomes, would be a very useful enhancement for the curation of genome sequences submitted to GenBank.
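
As one example of what such a simple check might look like, the sketch below flags annotations that are implausibly short and expected genes that are missing from a record. It is an illustration under stated assumptions, not a description of GenBank curation or of the authors' tools: the length thresholds in `MIN_LENGTH` are assumed values, `check_annotations` is a hypothetical helper, the coordinates are invented, and features that wrap around the origin of the circular genome are not handled.

```python
# Assumed minimum plausible lengths in base pairs (illustrative thresholds).
MIN_LENGTH = {"tRNA": 50, "rRNA": 500, "CDS": 150}


def check_annotations(features, expected_genes):
    """Return a list of human-readable problems for one genome record.

    `features` is a list of (gene, kind, start, end) tuples using
    1-based inclusive coordinates; `expected_genes` lists the gene
    names that a complete record should contain.
    """
    problems = []
    seen = set()
    for gene, kind, start, end in features:
        seen.add(gene)
        length = end - start + 1
        if length < MIN_LENGTH.get(kind, 0):
            problems.append(f"{gene}: {kind} only {length} bp")
    for gene in sorted(set(expected_genes) - seen):
        problems.append(f"{gene}: not annotated")
    return problems


# Invented example record: trnA is far too short and trnN is absent.
features = [
    ("ND2", "CDS", 3900, 4941),
    ("trnW", "tRNA", 4940, 5008),
    ("trnA", "tRNA", 5010, 5021),
]
expected = ["ND2", "trnW", "trnA", "trnN"]

problems = check_annotations(features, expected)
for line in problems:
    print(line)
```

A flagged record is only a candidate for review; deciding whether the annotation or the sequence is at fault would still require inspecting the underlying sequence.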


Institute for Systems Biology