      The community-curated Pristionchus pacificus genome facilitates automated gene annotation improvement in related nematodes

      research-article
      BMC Genomics
      BioMed Central
      Comparative genomics, Evolution, Phylogeny, Parasite, Caenorhabditis elegans, BUSCO, PPCAC


          Abstract

          Background

          The nematode Pristionchus pacificus is an established model organism for comparative studies with Caenorhabditis elegans. Over the past years, it has developed into an independent animal model organism for elucidating the genetic basis of phenotypic plasticity. Community-based curations were recently employed to improve the quality of gene annotations of P. pacificus and to facilitate reverse genetic studies using candidate genes from C. elegans.

          Results

          Here, I demonstrate that the reannotation of phylogenomic data from nine related nematode species using the community-curated P. pacificus gene set as homology data substantially improves the quality of gene annotations. Benchmarking of universal single-copy orthologs (BUSCO) estimates a median completeness of 84%, which corresponds to a 9% increase over previous annotations. Nevertheless, the ability to infer gene models based on homology already drops beyond the genus level, reflecting the rapid evolution of nematode lineages. This also indicates that the highly curated C. elegans genome is not optimally suited for annotating non-Caenorhabditis genomes based on homology. Furthermore, comparative genomic analysis of apparently missing BUSCO genes indicates a failure of ortholog detection by the BUSCO pipeline due to the insufficient sample size and phylogenetic breadth of the underlying OrthoDB data set. As a consequence, the quality of multiple divergent nematode genomes might be underestimated.
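
          The summary statistic reported above (a median completeness across species and its gain over earlier annotations) can be illustrated with a minimal sketch. The abstract does not state whether the 9% increase is measured in percentage points or relative to the old median, so the sketch computes both. The function names, species labels, and all numeric values below are hypothetical placeholders for illustration and are not taken from the article.

```python
from statistics import median


def busco_completeness(complete: int, total: int) -> float:
    """Return the percentage of BUSCO groups recovered as complete."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * complete / total


def summarize(old: dict[str, float], new: dict[str, float]) -> dict[str, float]:
    """Compare median completeness (in percent) between two annotation sets.

    Only species present in both dictionaries are compared.
    """
    species = sorted(old.keys() & new.keys())
    if not species:
        raise ValueError("no species shared between old and new annotations")
    old_median = median(old[s] for s in species)
    new_median = median(new[s] for s in species)
    return {
        "old_median": old_median,
        "new_median": new_median,
        # Absolute gain, in percentage points.
        "gain_points": new_median - old_median,
        # Gain relative to the old median, in percent.
        "gain_relative_pct": 100.0 * (new_median - old_median) / old_median,
    }


# Hypothetical placeholder scores (percent complete); NOT data from the article.
old_scores = {"species_A": 60.0, "species_B": 70.0, "species_C": 80.0}
new_scores = {"species_A": 66.0, "species_B": 77.0, "species_C": 88.0}

summary = summarize(old_scores, new_scores)
print(summary)
```

          With these placeholder values the two ways of reporting the gain differ (7 percentage points versus 10% relative), which is why it helps to state the convention explicitly when comparing annotation sets.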

          Conclusions

          This study highlights the need for optimizing gene annotation protocols and demonstrates the benefit of a high-quality genome for phylogenomic data of related species.

          Supplementary Information

          The online version contains supplementary material available at 10.1186/s12864-021-07529-x.


                Author and article information

                Contributors
                christian.roedelsperger@tuebingen.mpg.de
                Journal
                BMC Genomics
                BioMed Central (London)
                ISSN: 1471-2164
                Published: 25 March 2021
                Volume: 22
                Article number: 216
                Affiliations
                GRID grid.419495.4, ISNI 0000 0001 1014 8330, Department for Integrative Evolutionary Biology, Max Planck Institute for Developmental Biology, Max-Planck-Ring 9, 72076 Tübingen, Germany
                Author information
                http://orcid.org/0000-0002-7905-9675
                Article
                Article ID: 7529
                DOI: 10.1186/s12864-021-07529-x
                PMC ID: 7992802
                PMID: 33765927
                e6ee707c-ed30-46ca-b8ce-e5be5af5a3a4
                © The Author(s) 2021

                Open Access. This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.

                History
                Received: 9 November 2020
                Accepted: 12 March 2021
                Funding
                Funded by: Max-Planck-Gesellschaft (DE)
                Funded by: Max Planck Institute for Developmental Biology (2)
                Categories
                Research Article

                Genetics
                comparative genomics,evolution,phylogeny,parasite,caenorhabditis elegans,busco,ppcac
