      A Species-Wide Inventory of NLR Genes and Alleles in Arabidopsis thaliana

      research-article


          Summary

          Infectious disease is both a major force of selection in nature and a prime cause of yield loss in agriculture. In plants, disease resistance is often conferred by nucleotide-binding leucine-rich repeat (NLR) proteins, intracellular immune receptors that recognize pathogen proteins and their effects on the host. Consistent with extensive balancing and positive selection, NLRs are encoded by one of the most variable gene families in plants, but the true extent of intraspecific NLR diversity has been unclear. Here, we define a nearly complete species-wide pan-NLRome in Arabidopsis thaliana based on sequence enrichment and long-read sequencing. The pan-NLRome largely saturates with approximately 40 well-chosen wild strains, with half of the pan-NLRome being present in most accessions. We chart NLR architectural diversity, identify new architectures, and quantify selective forces that act on specific NLRs and NLR domains. Our study provides a blueprint for defining pan-NLRomes.

          Highlights

          • Species-wide NLR diversity is high but not unlimited

          • A large fraction of NLR diversity is recovered with 40–50 accessions

          • Presence/absence variation in NLRs is widespread, resulting in a mosaic population

          • A high diversity of NLR-integrated domains favors known virulence targets
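
          The saturation statement in the highlights (most NLR diversity is recovered with a few dozen accessions) is the kind of result usually read off a pan-gene accumulation curve. The following is a minimal sketch of that idea, not the authors' pipeline: the function name `accumulation_curve`, the accession labels, and the toy presence/absence data are all hypothetical and chosen only for illustration.

```python
import random


def accumulation_curve(presence, n_permutations=100, seed=0):
    """Mean number of distinct gene clusters seen after sampling k accessions.

    presence: dict mapping an accession name to the set of cluster IDs it carries.
    Returns a list whose element k-1 is the mean pan-set size for k accessions,
    averaged over random sampling orders.
    """
    rng = random.Random(seed)
    accessions = sorted(presence)
    totals = [0] * len(accessions)
    for _ in range(n_permutations):
        order = accessions[:]
        rng.shuffle(order)
        seen = set()
        for k, acc in enumerate(order):
            seen |= presence[acc]   # add clusters contributed by this accession
            totals[k] += len(seen)  # pan-set size after k+1 accessions
    return [t / n_permutations for t in totals]


# Toy, made-up presence/absence data (hypothetical; not taken from the study).
toy = {
    "acc1": {"A", "B", "C"},
    "acc2": {"A", "B", "D"},
    "acc3": {"A", "C", "D"},
    "acc4": {"A", "B", "C", "E"},
}

curve = accumulation_curve(toy)
print(curve)
```

          A curve that flattens as more accessions are added indicates that further sampling contributes few new clusters, which is what "saturation" means in this context. Clusters present in every sampled accession (here, "A") would correspond to the shared part of the pan-set.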

          Abstract

          In plants, NLR proteins are important intracellular receptors with roles in innate immunity and disease resistance. This work provides a panoramic view of this diverse and complicated gene family in the model species A. thaliana and provides a foundation for the identification and functional study of disease-resistance genes in agronomically important species with complex genomes.

                Author and article information

                Journal: Cell
                Publisher: Cell Press
                ISSN: 0092-8674, 1097-4172
                Published: 22 August 2019
                Volume: 178
                Issue: 5
                Pages: 1260-1272.e14
                Affiliations
                [1 ]Department of Molecular Biology, Max Planck Institute for Developmental Biology, 72076 Tübingen, Germany
                [2 ]Howard Hughes Medical Institute, Chevy Chase, MD 20815, USA
                [3 ]Department of Biology, University of North Carolina, Chapel Hill, NC 27599-3280, USA
                [4 ]Center for Research in Agricultural Genomics (CRAG), CSIC-IRTA-UAB-UB, 08193 Barcelona, Spain
                [5 ]Department of Biology, Colorado State University, Fort Collins, CO 80523, USA
                [6 ]The Sainsbury Laboratory, University of East Anglia, Norwich Research Park, Norwich NR4 7UH, UK
                [7 ]Milner Centre for Evolution & Department of Biology and Biochemistry, University of Bath, Bath BA2 7AY, UK
                Author notes
                [∗] Corresponding author: jonathan.jones@tsl.ac.uk
                [∗∗] Corresponding author: dangl@email.unc.edu
                [∗∗∗] Corresponding author: weigel@weigelworld.org
                [8] These authors contributed equally
                [9] Lead contact

                Article
                PII: S0092-8674(19)30837-2
                DOI: 10.1016/j.cell.2019.07.038
                PMCID: PMC6709784
                PMID: 31442410
                ScienceOpen record ID: 7a80fcc2-c94d-4715-b7c8-6a118b67e822
                © 2019 The Author(s)

                This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).

                History
                Received: 5 March 2019
                Revised: 13 June 2019
                Accepted: 19 July 2019
                Categories
                Article type: Article
                Subject: Cell biology
                Keywords: NLR, innate immunity, plant immunity, disease resistance genes, SMRT sequencing, RenSeq, sequence capture, targeted enrichment, genomics, integrated domains
