      Long-read HiFi sequencing correctly assembles repetitive heavy fibroin silk genes in new moth and caddisfly genomes

      research-article


          Abstract

          Insect silk is a versatile biomaterial. Lepidoptera and Trichoptera display some of the most diverse uses of silk, with varying strength, adhesive qualities, and elastic properties. Silk fibroin genes are long (>20 kbp) and contain many repetitive motifs, which makes them challenging to sequence. Most research thus far has focused on the conserved N- and C-terminal regions of fibroin genes because a full comparison of repetitive regions across taxa has not been possible. Using the PacBio Sequel II system and SMRT sequencing, we generated high-fidelity (HiFi) long-read genomic and transcriptomic sequences for the Indianmeal moth (Plodia interpunctella) and genomic sequences for the caddisfly Eubasilissa regina. Both genomes were highly contiguous (N50 = 9.7 Mbp/32.4 Mbp, L50 = 13/11) and complete (BUSCO complete = 99.3%/95.2%), with complete and contiguous recovery of silk heavy fibroin gene sequences. We show that HiFi long-read sequencing is helpful for understanding genes with long, repetitive regions.
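
          The contiguity statistics quoted in the abstract (N50 and L50) follow a standard definition: sort contigs from longest to shortest and accumulate lengths until at least half of the total assembly length is covered. N50 is the length of the contig that crosses that threshold, and L50 is the number of contigs needed to reach it. The sketch below illustrates that definition only; the function name `n50_l50` and the toy contig lengths are hypothetical and are not taken from the article or its pipeline.

```python
def n50_l50(contig_lengths):
    """Return (N50, L50) for an iterable of contig lengths.

    Illustrative helper (hypothetical name), not the authors' code.
    """
    lengths = sorted(contig_lengths, reverse=True)
    if not lengths:
        raise ValueError("at least one contig length is required")

    half_total = sum(lengths) / 2
    running = 0
    # Walk from the longest contig down until half the assembly is covered.
    for count, length in enumerate(lengths, start=1):
        running += length
        if running >= half_total:
            return length, count


# Toy example with made-up contig lengths (arbitrary units).
toy_lengths = [2, 2, 2, 3, 3, 4, 8, 8]
n50, l50 = n50_l50(toy_lengths)
print(f"N50 = {n50}, L50 = {l50}")
```

          A higher N50 together with a lower L50 indicates that fewer, longer contigs make up most of the assembly, which is the sense in which the abstract describes both genomes as highly contiguous.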


                Author and article information

                Contributors
                Roles: Conceptualization; Project administration; Supervision; Writing - original draft; Writing - review & editing
                Roles: Conceptualization; Formal analysis; Writing - original draft; Writing - review & editing
                Roles: Formal analysis; Investigation; Writing - original draft; Writing - review & editing
                Roles: Data curation; Formal analysis; Investigation; Writing - original draft; Writing - review & editing
                Roles: Formal analysis; Investigation; Writing - original draft
                Roles: Data curation; Writing - original draft; Writing - review & editing
                Roles: Visualization; Writing - original draft; Writing - review & editing
                Roles: Funding acquisition; Writing - original draft; Writing - review & editing
                Roles: Funding acquisition; Writing - original draft; Writing - review & editing
                Roles: Funding acquisition; Writing - original draft; Writing - review & editing
                Roles: Resources; Writing - original draft; Writing - review & editing
                Roles: Investigation; Writing - original draft; Writing - review & editing
                Roles: Resources; Writing - original draft; Writing - review & editing
                Roles: Funding acquisition; Writing - original draft; Writing - review & editing
                Roles: Resources; Writing - original draft; Writing - review & editing
                Roles: Conceptualization; Formal analysis; Resources; Writing - original draft; Writing - review & editing
                Journal
                GigaByte
                GigaScience Press (Sha Tin, New Territories, Hong Kong SAR )
                2709-4715
                30 June 2022
                : 2022
                : gigabyte64
                Affiliations
                [1] McGuire Center for Lepidoptera and Biodiversity, Florida Museum of Natural History, University of Florida, Gainesville, FL 32611, USA
                [2] Pacific Biosciences, 1305 O’Brien Dr., Menlo Park, CA 94025, USA
                [3] School of Natural Resources and the Environment, University of Florida, Gainesville, FL 32611, USA
                [4] LOEWE Centre for Translational Biodiversity Genomics (LOEWE-TBG), Frankfurt 60325, Germany
                [5] Department of Terrestrial Zoology, Senckenberg Research Institute and Natural History Museum Frankfurt, Frankfurt 60325, Germany
                [6] Department of Plant and Wildlife Sciences, Brigham Young University, Provo, UT 84602, USA
                [7] School of Biological Sciences, Washington State University, Pullman, WA, USA
                [8] Museum Conservation Institute, Smithsonian Institution, Suitland, MD 20746, USA
                [9] Data Science Lab, Office of the Chief Information Officer, Smithsonian Institution, Washington, DC 20002, USA
                [10] Department of Entomology, National Museum of Natural History, Smithsonian Institution, Washington, DC, USA
                [11] Graduate School of Science, Chiba University, Chiba 263-8522, Japan
                [12] Kanagawa Institute of Technology, Kanagawa 243-0292, Japan
                [13] Institute for Insect Biotechnology, Justus-Liebig-University, Gießen 35390, Germany
                [14] Department of Biomedical Engineering, University of Utah, Salt Lake City, UT 84112, USA
                [15] Department of Biology, Shinshu University, Matsumoto, Nagano 390-8621, Japan
                Author notes
                [*] Corresponding authors. E-mail: kawahara@flmnh.ufl.edu; paul_frandsen@byu.edu
                [†] Contributed equally.

                Author information
                https://orcid.org/0000-0002-3724-4610
                https://orcid.org/0000-0002-0349-0653
                https://orcid.org/0000-0001-8771-9154
                https://orcid.org/0000-0002-2339-655X
                https://orcid.org/0000-0002-5965-0986
                https://orcid.org/0000-0001-9198-2828
                https://orcid.org/0000-0003-4816-2909
                https://orcid.org/0000-0002-6353-0450
                https://orcid.org/0000-0002-6451-3425
                https://orcid.org/0000-0002-8389-8877
                https://orcid.org/0000-0002-9362-604X
                https://orcid.org/0000-0002-4801-7579
                Article
                DRR-202204-02 64
                10.46471/gigabyte.64
                9693786
                34cd5ca3-d034-44f8-94d1-15aef65b18ba
                © The Author(s) 2022.

                This is an Open Access article distributed under the terms of the Creative Commons Attribution License ( https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.

                History
                : 07 April 2022
                : 24 June 2022
                Funding
                This study was funded by the Smithsonian National Museum of Natural History Global Genome Initiative (GGI-Peer-2018-182) to TPC, RD, TD, and AYK, and by Smithsonian Museum Conservation Institute Federal and Trust funds to TPC and PBF. A University of Florida Research Opportunity Seed Fund internal award (number AWD06265) was granted to principal investigators AYK and CGS. The LOEWE Centre for Translational Biodiversity Genomics (TBG) is funded by the Hessen State Ministry of Higher Education, Research and the Arts (HMWK), which financially supported JH and SUP. SH was supported by National Science Foundation award #OPP-1906015.
                Categories
                Data Release
                Genetics and Genomics
                Animal Genetics
                Evolutionary Biology
