      Is Open Access

      BUSCO Update: Novel and Streamlined Workflows along with Broader and Deeper Phylogenetic Coverage for Scoring of Eukaryotic, Prokaryotic, and Viral Genomes

      letter


          Abstract

          Methods for evaluating the quality of genomic and metagenomic data are essential to aid genome assembly procedures and to correctly interpret the results of subsequent analyses. BUSCO estimates the completeness and redundancy of processed genomic data based on universal single-copy orthologs. Here, we present new functionalities and major improvements of the BUSCO software, as well as the renewal and expansion of the underlying data sets in sync with the OrthoDB v10 release. Among the major novelties, BUSCO now enables phylogenetic placement of the input sequence to automatically select the most appropriate BUSCO data set for the assessment, allowing the analysis of metagenome-assembled genomes of unknown origin. A newly introduced genome workflow improves efficiency and runtimes, especially on large eukaryotic genomes. BUSCO is the only tool capable of assessing both eukaryotic and prokaryotic species, and can be applied to various data types, from genome assemblies and metagenomic bins to transcriptomes and gene sets.
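To make the completeness and redundancy estimate concrete, the following is a minimal sketch of how such scores can be derived from marker-gene counts. It assumes the four categories used in BUSCO's summary notation (complete single-copy, complete duplicated, fragmented, missing). The names `BuscoCounts` and `summarize` and the example counts are hypothetical and are not part of the BUSCO software.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BuscoCounts:
    """Hypothetical container for per-category marker counts."""
    single: int      # complete, found exactly once
    duplicated: int  # complete, found more than once
    fragmented: int  # only partially recovered
    missing: int     # not recovered at all

    @property
    def total(self) -> int:
        return self.single + self.duplicated + self.fragmented + self.missing


def summarize(counts: BuscoCounts) -> dict:
    """Return each category as a percentage of all markers searched."""
    n = counts.total
    if n == 0:
        raise ValueError("at least one marker gene is required")

    def pct(k: int) -> float:
        return round(100.0 * k / n, 1)

    return {
        # Completeness: markers recovered in full, once or several times.
        "complete": pct(counts.single + counts.duplicated),
        "single": pct(counts.single),
        # Redundancy: complete markers present in more than one copy.
        "duplicated": pct(counts.duplicated),
        "fragmented": pct(counts.fragmented),
        "missing": pct(counts.missing),
        "n": n,
    }


if __name__ == "__main__":
    # Invented counts for illustration only.
    s = summarize(BuscoCounts(single=180, duplicated=10, fragmented=6, missing=4))
    print(
        f"C:{s['complete']}%[S:{s['single']}%,D:{s['duplicated']}%],"
        f"F:{s['fragmented']}%,M:{s['missing']}%,n:{s['n']}"
    )
```

In this reading, a high duplicated share in a data set of single-copy orthologs may point to redundancy in the assembly, while a high missing share suggests incompleteness.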


                Author and article information

                Contributors
                Role: Associate Editor
                Journal
                Title: Molecular Biology and Evolution (Mol Biol Evol)
                Journal ID: molbev
                Publisher: Oxford University Press
                ISSN: 0737-4038, 1537-1719
                Issue date: October 2021
                Published online: 28 July 2021
                Volume: 38
                Issue: 10
                Pages: 4647-4654
                Affiliations
                [1 ]Department of Genetic Medicine and Development, University of Geneva , Geneva, Switzerland
                [2 ]Swiss Institute of Bioinformatics , Geneva, Switzerland
                Author notes

                Mosè Manni, Matthew R Berkeley, and Mathieu Seppey contributed equally to this work.

                Corresponding author: E-mail: evgeny.zdobnov@unige.ch.
                Author information
                https://orcid.org/0000-0002-4146-6523
                Article
                Publisher ID: msab199
                DOI: 10.1093/molbev/msab199
                PMCID: PMC8476166
                PMID: 34320186
                a0e34a59-afba-4129-a753-6cc584d0c057
                © The Author(s) 2021. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.

                This is an Open Access article distributed under the terms of the Creative Commons Attribution License ( https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.

                Page count
                Pages: 8
                Funding
                Funded by: Swiss National Science Foundation, DOI 10.13039/501100001711
                Award ID: 310030_189062
                Categories
                Resources
                AcademicSubjects/SCI01130
                AcademicSubjects/SCI01180

                Molecular biology
                Keywords: quality assessment, completeness, genome, transcriptome, prokaryotes, eukaryotes, viruses, microbes, metagenomes
