      Is Open Access

      Simple statistical identification and removal of contaminant sequences in marker-gene and metagenomics data

      research-article


          Abstract

          Background

          The accuracy of microbial community surveys based on marker-gene and metagenomic sequencing (MGS) suffers from the presence of contaminants, that is, DNA sequences not truly present in the sample. Contaminants come from various sources, including reagents. Appropriate laboratory practices can reduce contamination, but do not eliminate it. Here we introduce decontam (https://github.com/benjjneb/decontam), an open-source R package that implements a statistical classification procedure that identifies contaminants in MGS data based on two widely reproduced patterns: contaminants appear at higher frequencies in low-concentration samples and are often found in negative controls.
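The two patterns named above can be illustrated with a small sketch. It is written in Python for illustration only and is not decontam's implementation (decontam is an R package). The function names, the fixed-slope log-space models, and the use of a one-sided Fisher's exact test are assumptions made for this example.

```python
import math


def frequency_residual_ratio(freqs, concs):
    """Compare two log-space models for one sequence feature.

    Assumed contaminant model:     log(freq) = a - log(conc)  (slope -1)
    Assumed non-contaminant model: log(freq) = b              (slope 0)
    Returns SS_contaminant / SS_noncontaminant. Values well below 1 mean
    the inverse-concentration pattern fits the data better.
    """
    # Use only samples in which the feature was observed.
    pairs = [(math.log(f), math.log(c)) for f, c in zip(freqs, concs) if f > 0]
    if len(pairs) < 2:
        raise ValueError("need at least two samples with nonzero frequency")
    # With the slope fixed, the least-squares intercept is a simple mean.
    a = sum(lf + lc for lf, lc in pairs) / len(pairs)
    b = sum(lf for lf, _ in pairs) / len(pairs)
    ss_contam = sum((lf + lc - a) ** 2 for lf, lc in pairs)
    ss_noncontam = sum((lf - b) ** 2 for lf, _ in pairs)
    # A perfectly constant frequency gives no support to the contaminant model.
    return ss_contam / ss_noncontam if ss_noncontam > 0 else math.inf


def prevalence_p_value(present_in_controls, n_controls,
                       present_in_samples, n_samples):
    """One-sided Fisher's exact test on presence/absence counts.

    Small p-values mean the feature is present in negative controls more
    often than chance would explain, given its overall prevalence.
    """
    total = n_controls + n_samples
    present = present_in_controls + present_in_samples
    denom = math.comb(total, present)
    # Hypergeometric upper tail: P(X >= observed presences in controls).
    upper = min(n_controls, present)
    tail = sum(
        math.comb(n_controls, k) * math.comb(n_samples, present - k)
        for k in range(present_in_controls, upper + 1)
    )
    return tail / denom


# Hypothetical data: frequency halves each time DNA concentration doubles.
ratio = frequency_residual_ratio([0.08, 0.04, 0.02, 0.01], [1, 2, 4, 8])
# Hypothetical data: seen in 5 of 5 negative controls, 1 of 20 true samples.
p = prevalence_p_value(5, 5, 1, 20)
```

In both functions a smaller value is stronger evidence that the feature is a contaminant. The cutoff applied to either value is a judgment call and is not specified here.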

          Results

          Decontam classified amplicon sequence variants (ASVs) in a human oral dataset consistently with prior microscopic observations of the microbial taxa inhabiting that environment and previous reports of contaminant taxa. In metagenomics and marker-gene measurements of a dilution series, decontam substantially reduced technical variation arising from different sequencing protocols. The application of decontam to two recently published datasets corroborated and extended their conclusions that little evidence existed for an indigenous placenta microbiome and that some low-frequency taxa seemingly associated with preterm birth were contaminants.

          Conclusions

          Decontam improves the quality of metagenomic and marker-gene sequencing by identifying and removing contaminant DNA sequences. Decontam integrates easily with existing MGS workflows and allows researchers to generate more accurate profiles of microbial communities at little to no additional cost.
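The removal step can be sketched as a filter over a count table followed by renormalization. This is a minimal Python illustration, not decontam's code. The nested-dictionary layout and the name `remove_contaminants` are hypothetical.

```python
def remove_contaminants(counts, contaminant_ids):
    """Drop flagged features and return per-sample relative abundances.

    counts: {sample_name: {feature_id: read_count}}  (hypothetical layout)
    contaminant_ids: set of feature ids classified as contaminants
    """
    cleaned = {}
    for sample, features in counts.items():
        # Keep only the features that were not flagged as contaminants.
        kept = {f: n for f, n in features.items() if f not in contaminant_ids}
        total = sum(kept.values())
        # Renormalize so the remaining features sum to 1 in each sample.
        cleaned[sample] = (
            {f: n / total for f, n in kept.items()} if total > 0 else {}
        )
    return cleaned


# Hypothetical table: ASV3 has been flagged as a contaminant.
table = {"sample1": {"ASV1": 90, "ASV2": 10, "ASV3": 100}}
profile = remove_contaminants(table, {"ASV3"})
```

Renormalizing after removal matters most in low-biomass samples, where contaminant reads can make up a large share of the total.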

          Electronic supplementary material

          The online version of this article (10.1186/s40168-018-0605-2) contains supplementary material, which is available to authorized users.


                Author and article information

                Contributors
                (919) 515-8536, benjamin.j.callahan@gmail.com
                Journal
                Microbiome
                BioMed Central (London)
                ISSN: 2049-2618
                17 December 2018
                Volume: 6
                Article number: 226
                Affiliations
                [1] ISNI 0000 0004 1936 8956, GRID grid.168010.e, Department of Microbiology and Immunology, Stanford University School of Medicine, Stanford, CA 94305, USA
                [2] ISNI 0000 0004 1936 8956, GRID grid.168010.e, Department of Medicine, Stanford University School of Medicine, Stanford, CA 94305, USA
                [3] ISNI 0000 0001 2297 6811, GRID grid.266102.1, Department of Orofacial Sciences, University of California, San Francisco School of Dentistry, San Francisco, CA 94143, USA
                [4] ISNI 0000 0004 1936 8956, GRID grid.168010.e, Department of Statistics, Stanford University, Stanford, CA 94305, USA
                [5] ISNI 0000 0004 0419 2556, GRID grid.280747.e, Infectious Diseases Section, Veterans Affairs Palo Alto Health Care System, Palo Alto, CA 94304, USA
                [6] ISNI 0000 0001 2173 6074, GRID grid.40803.3f, Department of Population Health and Pathobiology, College of Veterinary Medicine, North Carolina State University, 456 Research Building, 1060 William Moore Drive, Raleigh, NC 27607, USA
                [7] ISNI 0000 0001 2173 6074, GRID grid.40803.3f, Bioinformatics Research Center, North Carolina State University, Raleigh, NC 27695, USA
                Author information
                http://orcid.org/0000-0002-8752-117X
                Article
                Publisher ID: 605
                DOI: 10.1186/s40168-018-0605-2
                PMCID: PMC6298009
                PMID: 30558668
                0ddedcef-6fdd-4839-aa86-b9174c293010
                © The Author(s) 2018

                Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.

                History
                Received: 25 July 2018
                Accepted: 25 November 2018
                Funding
                Funded by: FundRef http://dx.doi.org/10.13039/100000072, National Institute of Dental and Craniofacial Research
                Award ID: R01 DE023113
                Funded by: FundRef http://dx.doi.org/10.13039/100000060, National Institute of Allergy and Infectious Diseases
                Award ID: R01 AI112401
                Categories
                Methodology

                Keywords: microbiome, metagenomics, marker-gene, 16S rRNA gene, DNA contamination
