87
views
0
recommends
+1 Recommend
0 collections
    0
    shares
      • Record: found
      • Abstract: found
      • Article: not found

      An Evaluation of the Performance of Tag SNPs Derived from HapMap in a Caucasian Population

      research-article

      Read this article at

      ScienceOpenPublisherPMC
      Bookmark
          There is no author summary for this article yet. Authors can add summaries to their articles on ScienceOpen to make them more accessible to a non-specialist audience.

          Abstract

          The Haplotype Map (HapMap) project recently generated genotype data for more than 1 million single-nucleotide polymorphisms (SNPs) in four population samples. The main application of the data is in the selection of tag single-nucleotide polymorphisms (tSNPs) to use in association studies. The usefulness of this selection process needs to be verified in populations outside those used for the HapMap project. In addition, it is not known how well the data represent the general population, as only 90–120 chromosomes were used for each population and since the genotyped SNPs were selected so as to have high frequencies. In this study, we analyzed more than 1,000 individuals from Estonia. The population of this northern European country has been influenced by many different waves of migrations from Europe and Russia. We genotyped 1,536 randomly selected SNPs from two 500-kbp ENCODE regions on Chromosome 2. We observed that the tSNPs selected from the CEPH (Centre d'Etude du Polymorphisme Humain) from Utah (CEU) HapMap samples (derived from US residents with northern and western European ancestry) captured most of the variation in the Estonia sample. (Between 90% and 95% of the SNPs with a minor allele frequency of more than 5% have an r 2 of at least 0.8 with one of the CEU tSNPs.) Using the reverse approach, tags selected from the Estonia sample could almost equally well describe the CEU sample. Finally, we observed that the sample size, the allelic frequency, and the SNP density in the dataset used to select the tags each have important effects on the tagging performance. Overall, our study supports the use of HapMap data in other Caucasian populations, but the SNP density and the bias towards high-frequency SNPs have to be taken into account when designing association studies.

          Synopsis

          The recent completion of the Haplotype Map (HapMap) project of the human genome provides considerable information on the patterns of variation in the genome of four populations. One of the applications is a description of a set of tags that act as proxies for many other surrounding variants. This will greatly help researchers in their quest to find complex disease genes by reducing the number of genetic variants to test in association studies. To evaluate its usefulness, several aspects of the map, including its transferability to other populations, still needed to be verified experimentally. Using genomic regions where variants had been thoroughly documented in Caucasian samples from Estonia, the researchers found that the transferability of tags is extremely good. The researchers also found that variants with low frequency in the general population (i.e., less than 5%) could not be accurately captured with tags, and that the regional density of variants in the HapMap project had a major impact on the performance of the tags. This research indicates that the HapMap project will be useful, but that careful consideration of hypotheses and study design will be essential for the success of association studies.

          Related collections

          Most cited references17

          • Record: found
          • Abstract: found
          • Article: not found

          Meta-analysis of genetic association studies supports a contribution of common variants to susceptibility to common disease.

          Association studies offer a potentially powerful approach to identify genetic variants that influence susceptibility to common disease, but are plagued by the impression that they are not consistently reproducible. In principle, the inconsistency may be due to false positive studies, false negative studies or true variability in association among different populations. The critical question is whether false positives overwhelmingly explain the inconsistency. We analyzed 301 published studies covering 25 different reported associations. There was a large excess of studies replicating the first positive reports, inconsistent with the hypothesis of no true positive associations (P < 10(-14)). This excess of replications could not be reasonably explained by publication bias and was concentrated among 11 of the 25 associations. For 8 of these 11 associations, pooled analysis of follow-up studies yielded statistically significant replication of the first report, with modest estimated genetic effects. Thus, a sizable fraction (but under half) of reported associations have strong evidence of replication; for these, false negative, underpowered studies probably contribute to inconsistent replication. We conclude that there are probably many common variants in the human genome with modest but real effects on common disease risk, and that studies using large samples will convincingly identify such variants.
            Bookmark
            • Record: found
            • Abstract: found
            • Article: not found

            Selecting a maximally informative set of single-nucleotide polymorphisms for association analyses using linkage disequilibrium.

            Common genetic polymorphisms may explain a portion of the heritable risk for common diseases. Within candidate genes, the number of common polymorphisms is finite, but direct assay of all existing common polymorphism is inefficient, because genotypes at many of these sites are strongly correlated. Thus, it is not necessary to assay all common variants if the patterns of allelic association between common variants can be described. We have developed an algorithm to select the maximally informative set of common single-nucleotide polymorphisms (tagSNPs) to assay in candidate-gene association studies, such that all known common polymorphisms either are directly assayed or exceed a threshold level of association with a tagSNP. The algorithm is based on the r(2) linkage disequilibrium (LD) statistic, because r(2) is directly related to statistical power to detect disease associations with unassayed sites. We show that, at a relatively stringent r(2) threshold (r2>0.8), the LD-selected tagSNPs resolve >80% of all haplotypes across a set of 100 candidate genes, regardless of recombination, and tag specific haplotypes and clades of related haplotypes in nonrecombinant regions. Thus, if the patterns of common variation are described for a candidate gene, analysis of the tagSNP set can comprehensively interrogate for main effects from common functional variation. We demonstrate that, although common variation tends to be shared between populations, tagSNPs should be selected separately for populations with different ancestries.
              Bookmark
              • Record: found
              • Abstract: not found
              • Article: not found

              The new genomics: global views of biology.

              E S Lander (1996)
                Bookmark

                Author and article information

                Contributors
                Role: Editor
                Journal
                PLoS Genet
                pgen
                PLoS Genetics
                Public Library of Science (San Francisco, USA )
                1553-7390
                1553-7404
                March 2006
                10 March 2006
                26 January 2006
                : 2
                : 3
                : e27
                Affiliations
                [1 ] McGill University and Genome Quebec Innovation Centre, Montreal, Quebec, Canada
                [2 ] Institute of Molecular and Cell Biology of the University of Tartu, Tartu, Estonia,
                [3 ] Estonian Biocentre, Tartu, Estonia
                [4 ] Wellcome Trust Centre for Human Genetics, University of Oxford, Oxford, United Kingdom
                [5 ] The Estonian Genome Project Foundation, Tartu, Estonia
                University of Alabama at Birmingham, United States of America
                Author notes
                * To whom correspondence should be addressed. E-mail: andres@ 123456ebc.ee
                Article
                05-PLGE-RA-0317R3 plge-02-03-03
                10.1371/journal.pgen.0020027
                1391920
                16532062
                dd0638da-f597-4453-ae0d-a3f112991d01
                Copyright: © 2006 Montpetit et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
                History
                : 6 October 2005
                : 23 January 2006
                Page count
                Pages: 9
                Categories
                Research Article
                Genetics/Population Genetics
                Genetics/Genome Projects
                Genetics/Genetics of Disease
                Genetics/Complex Traits
                Homo (Human)
                Custom metadata
                Montpetit A, Nelis M, Laflamme P, Magi R, Ke X, et al. (2006) An evaluation of the performance of tag SNPs derived from HapMap in a Caucasian population. PLoS Genet 2(3): e27.

                Genetics
                Genetics

                Comments

                Comment on this article

                scite_
                0
                0
                0
                0
                Smart Citations
                0
                0
                0
                0
                Citing PublicationsSupportingMentioningContrasting
                View Citations

                See how this article has been cited at scite.ai

                scite shows how a scientific paper has been cited by providing the context of the citation, a classification describing whether it supports, mentions, or contrasts the cited claim, and a label indicating in which section the citation was made.

                Similar content10

                Cited by16

                Most referenced authors931