Publication details

Category Text publication
Reference type Journal
DOI 10.3389/fgene.2023.1250907
License Creative Commons
Title (primary) Fully automated annotation of mitochondrial genomes using a cluster-based approach with de Bruijn graphs
Authors Fiedler, L.; Middendorf, M.; Bernt, M.
Source Frontiers in Genetics
Year of publication 2023
Department BIOINF
Volume 14
Page from art. 1250907
Language English
Topic T9 Healthy Planet
Data/software links https://doi.org/10.5281/zenodo.8101631
Supplements https://ndownloader.figstatic.com/files/41952981
Keywords annotation; gene prediction; mitochondria; genome; mitogenome; Metazoa; de Bruijn graph; clustering
Abstract A wide range of scientific fields, such as forensics, anthropology, medicine, and molecular evolution, benefit from the analysis of mitogenomic data. With the development of new sequencing technologies, the amount of mitochondrial sequence data to be analyzed has increased exponentially over the last few years. The accurate annotation of mitochondrial DNA is a prerequisite for any mitogenomic comparative analysis. To keep pace with the growth of the available mitochondrial sequence data, highly efficient automatic computational methods are needed. Automatic annotation methods are typically based on databases that contain information about already annotated (and often pre-curated) mitogenomes of different species. However, the existing approaches have several shortcomings: 1) they do not scale well with the size of the database; 2) they do not allow for a fast (and easy) update of the database; and 3) they can only be applied to a relatively small taxonomic subset of all species. Here, we present a novel approach that has none of the shortcomings 1), 2), and 3). The reference database of mitogenomes is represented as a richly annotated de Bruijn graph. To generate gene predictions for a new user-supplied mitogenome, the method applies a clustering routine to the mapping information of the provided sequence to this graph. The method is implemented in a software package called DeGeCI (De Bruijn graph Gene Cluster Identification). For a large set of mitogenomes for which expert-curated annotations are available, DeGeCI generates gene predictions of high conformity. In a comparative evaluation with MITOS2, a state-of-the-art annotation tool for mitochondrial genomes, DeGeCI shows better database scalability while still matching MITOS2 in terms of result quality and providing a fully automated means to update the underlying database. Moreover, unlike MITOS2, DeGeCI can be run in parallel on several processors to make use of modern multi-processor systems.
Permanent UFZ link https://www.ufz.de/index.php?en=20939&ufzPublicationIdentifier=27798
Fiedler, L., Middendorf, M., Bernt, M. (2023):
Fully automated annotation of mitochondrial genomes using a cluster-based approach with de Bruijn graphs
Front. Genet. 14, art. 1250907 10.3389/fgene.2023.1250907