      Is Open Access

      Pedigree reconstruction from SNP data: parentage assignment, sibship clustering and beyond

      research-article


          Abstract

          Data on hundreds or thousands of single nucleotide polymorphisms (SNPs) provide detailed information about the relationships between individuals, but currently few tools can turn this information into a multigenerational pedigree. I present the R package sequoia, which assigns parents, clusters half‐siblings sharing an unsampled parent and assigns grandparents to half‐sibships. Assignments are made after consideration of the likelihoods of all possible first‐, second‐ and third‐degree relationships between the focal individuals, as well as the traditional alternative of being unrelated. This careful exploration of the local likelihood surface is implemented in a fast, heuristic hill‐climbing algorithm. Distinction between the various categories of second‐degree relatives is possible when likelihoods are calculated conditional on at least one parent of each focal individual. Performance was tested on simulated data sets with realistic genotyping error rate and missingness, based on three different large pedigrees (N = 1000–2000). This included a complex pedigree with overlapping generations, occasional close inbreeding and some unknown birth years. Parentage assignment was highly accurate down to about 100 independent SNPs (error rate <0.1%) and fast (<1 min) as most pairs can be excluded from being parent–offspring based on opposite homozygosity. For full pedigree reconstruction, 40% of parents were assumed nongenotyped. Reconstruction resulted in low error rates (<0.3%), high assignment rates (>99%) in limited computation time (typically <1 h) when at least 200 independent SNPs were used. In three empirical data sets, relatedness estimated from the inferred pedigree was strongly correlated to genomic relatedness.
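          The exclusion step mentioned in the abstract can be illustrated with a minimal sketch. Barring genotyping error or mutation, a parent and its offspring share at least one allele at every locus, so they cannot be homozygous for opposite alleles. The Python below is not sequoia's implementation; the 0/1/2 allele-count coding, the -9 missing-data code, the function names and the `max_mismatches` tolerance (included so that a few genotyping errors do not exclude a true parent) are all assumptions made for illustration.

```python
from itertools import combinations

MISSING = -9  # assumed code for a missing genotype call


def opposite_homozygous_count(geno_a, geno_b):
    """Count loci where one individual is coded 0 and the other 2."""
    count = 0
    for a, b in zip(geno_a, geno_b):
        if a == MISSING or b == MISSING:
            continue  # a missing call carries no exclusion information
        if (a == 0 and b == 2) or (a == 2 and b == 0):
            count += 1
    return count


def plausible_parent_offspring_pairs(genotypes, max_mismatches=1):
    """Return ID pairs with at most max_mismatches opposite-homozygous loci."""
    return [
        (id_a, id_b)
        for id_a, id_b in combinations(genotypes, 2)
        if opposite_homozygous_count(genotypes[id_a], genotypes[id_b]) <= max_mismatches
    ]


# Toy data: copies of the minor allele per SNP (hypothetical individuals).
genotypes = {
    "parent": [0, 1, 2, 2, 0, 1],
    "offspring": [0, 0, 1, 2, 1, -9],
    "unrelated": [2, 1, 0, 0, 2, 1],
}
print(plausible_parent_offspring_pairs(genotypes))
# prints [('parent', 'offspring')]
```

          Pairs that survive this cheap filter would still need a likelihood comparison against the alternative relationships; the filter only shrinks the set of pairs for which that calculation is needed.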


          Most cited references (33)


          Statistical confidence for likelihood-based paternity inference in natural populations.

          Paternity inference using highly polymorphic codominant markers is becoming common in the study of natural populations. However, multiple males are often found to be genetically compatible with each offspring tested, even when the probability of excluding an unrelated male is high. While various methods exist for evaluating the likelihood of paternity of each nonexcluded male, interpreting these likelihoods has hitherto been difficult, and no method takes account of the incomplete sampling and error-prone genetic data typical of large-scale studies of natural systems. We derive likelihood ratios for paternity inference with codominant markers taking account of typing error, and define a statistic delta for resolving paternity. Using allele frequencies from the study population in question, a simulation program generates criteria for delta that permit assignment of paternity to the most likely male with a known level of statistical confidence. The simulation takes account of the number of candidate males, the proportion of males that are sampled and gaps and errors in genetic data. We explore the potentially confounding effect of relatives and show that the method is robust to their presence under commonly encountered conditions. The method is demonstrated using genetic data from the intensively studied red deer (Cervus elaphus) population on the island of Rum, Scotland. The Windows-based computer program, CERVUS, described in this study is available from the authors. CERVUS can be used to calculate allele frequencies, run simulations and perform parentage analysis using data from all types of codominant markers.

            COLONY: a program for parentage and sibship inference from multilocus genotype data.

            Pedigrees, depicting genealogical relationships between individuals, are important in several research areas. Molecular markers allow inference of pedigrees in wild species where relationship information is impossible to collect by observation. Marker data are analysed statistically using methods based on Mendelian inheritance rules. There are numerous computer programs available to conduct pedigree analysis, but most software is inflexible, both in terms of assumptions and data requirements. Most methods only accommodate monogamous diploid species using codominant markers without genotyping error. In addition, most commonly used methods use pairwise comparisons rather than a full-pedigree likelihood approach, which considers the likelihood of the entire pedigree structure and allows the simultaneous inference of parentage and sibship. Here, we describe COLONY, a computer program implementing full-pedigree likelihood methods to simultaneously infer sibship and parentage among individuals using multilocus genotype data. COLONY can be used for both diploid and haplodiploid species; it can use dominant and codominant markers, and can accommodate, and estimate, genotyping error at each locus. In addition, COLONY can carry out these inferences for both monoecious and dioecious species. The program is available as a Microsoft Windows version, which includes a graphical user interface, and a Macintosh version, which uses an R-based interface. © 2009 Blackwell Publishing Ltd.

              Sibship reconstruction from genetic data with typing errors.

              Likelihood methods have been developed to partition individuals in a sample into full-sib and half-sib families using genetic marker data without parental information. They invariably make the critical assumption that marker data are free of genotyping errors and mutations and are thus completely reliable in inferring sibships. Unfortunately, however, this assumption is rarely tenable for virtually all kinds of genetic markers in practical use and, if violated, can severely bias sibship estimates as shown by simulations in this article. I propose a new likelihood method with simple and robust models of typing error incorporated into it. Simulations show that the new method can be used to infer full- and half-sibships accurately from marker data with a high error rate and to identify typing errors at each locus in each reconstructed sib family. The new method also improves previous ones by adopting a fresh iterative procedure for updating allele frequencies with reconstructed sibships taken into account, by allowing for the use of parental information, and by using efficient algorithms for calculating the likelihood function and searching for the maximum-likelihood configuration. It is tested extensively on simulated data with a varying number of marker loci, different rates of typing errors, and various sample sizes and family structures and applied to two empirical data sets to demonstrate its usefulness.

                Author and article information

                Contributors
                jisca.huisman@ed.ac.uk
                Journal
                Mol Ecol Resour
                10.1111/(ISSN)1755-0998
                MEN
                Molecular Ecology Resources
                John Wiley and Sons Inc. (Hoboken)
                1755-098X
                1755-0998
                06 April 2017
                September 2017
                Volume: 17
                Issue: 5 (doiID: 10.1111/men.2017.17.issue-5)
                Pages: 1009-1024
                Affiliations
                [ 1 ] Ashworth Laboratories School of Biological Sciences Institute for Evolutionary Biology University of Edinburgh Edinburgh EH9 3FL UK
                Author notes
                [*] Correspondence: Jisca Huisman, E‐mail: jisca.huisman@ed.ac.uk
                Author information
                http://orcid.org/0000-0002-9744-7196
                Article
                MEN12665
                10.1111/1755-0998.12665
                6849609
                28271620
                2b85cc09-a133-4baa-a5f3-b12907d33cc3
                © 2017 The Authors. Molecular Ecology Resources Published by John Wiley & Sons Ltd.

                This is an open access article under the terms of the http://creativecommons.org/licenses/by/4.0/ License, which permits use, distribution and reproduction in any medium, provided the original work is properly cited.

                History
                Received: 05 September 2016
                Revised: 02 December 2016
                Accepted: 24 February 2017
                Page count
                Figures: 10, Tables: 5, Pages: 16, Words: 10014
                Funding
                Funded by: ERC , open-funder-registry 10.13039/501100000781;
                Award ID: ERC‐2009‐AdG
                Funded by: NERC , open-funder-registry 10.13039/501100000270;
                Award ID: NE/L00688X/1
                Categories
                Resource Article
                RESOURCE ARTICLES
                Molecular and Statistical Advances

                Ecology
                parentage assignment, pedigree, sequoia, sibship clustering, single nucleotide polymorphism
