Abstract
The HapMap project has facilitated the selection of tagging single nucleotide polymorphisms (tagSNPs) for genome-wide association studies (GWAS) under the assumption that linkage disequilibrium (LD) in the HapMap populations is similar to that in the populations under investigation. Earlier reports support this assumption, although most of these studies evaluated only a few loci. We compared pair-wise LD and LD block structure across the autosomes between the Dutch population and the CEU-HapMap reference panel. The impact of sampling variation on the estimation of LD blocks was studied by bootstrapping. We found a high Pearson correlation (0.93 genome-wide) between pair-wise r² in the Dutch and the CEU populations, indicating that tagSNPs from the CEU-HapMap panel capture common variation in the Dutch population. However, some genomic regions exhibited significantly lower correlation than the genome-wide estimate. This might decrease the validity of HapMap tagSNPs in these regions and reduce the power of GWAS. The LD block structure differed considerably between the Dutch and CEU-HapMap populations. This was not explained by demographic differences between the CEU and Dutch samples, as tests for population stratification were not significant. The bootstrapping analysis also showed that sampling variation had a large effect on the estimation of LD blocks. Thus, in small samples, most of the observed differences in LD blocks between populations are likely the result of sampling variation. This poor concordance in LD block structure suggests that large samples are required for robust estimation of local LD block structure in populations.
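The comparison described in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: it assumes phased haplotypes coded as 0/1 alleles (the study data may have been unphased genotypes), and every function name below is hypothetical. It shows the three ingredients the abstract names: pair-wise r² within one sample, the Pearson correlation between the r² values of two samples, and bootstrap resampling to gauge sampling variation.

```python
import random
from statistics import mean


def r_squared(hap_a, hap_b):
    """LD r^2 between two biallelic SNPs, given 0/1 alleles per haplotype."""
    n = len(hap_a)
    p_a = sum(hap_a) / n                      # allele frequency at SNP A
    p_b = sum(hap_b) / n                      # allele frequency at SNP B
    p_ab = sum(a & b for a, b in zip(hap_a, hap_b)) / n  # freq. of the 1-1 haplotype
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        return 0.0  # assumption: a monomorphic SNP is scored as r^2 = 0
    d = p_ab - p_a * p_b                      # disequilibrium coefficient D
    return d * d / denom


def pairwise_r2(haplotypes):
    """r^2 for every SNP pair; haplotypes is a list of rows of 0/1 alleles."""
    n_snps = len(haplotypes[0])
    cols = [[row[j] for row in haplotypes] for j in range(n_snps)]
    return [r_squared(cols[i], cols[j])
            for i in range(n_snps) for j in range(i + 1, n_snps)]


def pearson(x, y):
    """Pearson correlation of two equal-length sequences with nonzero variance."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def bootstrap_r2(haplotypes, n_rep, rng):
    """Resample haplotypes with replacement and recompute pair-wise r^2 each time."""
    reps = []
    for _ in range(n_rep):
        sample = [rng.choice(haplotypes) for _ in haplotypes]
        reps.append(pairwise_r2(sample))
    return reps
```

With two haplotype matrices typed at the same SNPs, `pearson(pairwise_r2(pop1), pairwise_r2(pop2))` gives the kind of between-population correlation the abstract reports, and the spread of the `bootstrap_r2` replicates shows how much pair-wise LD estimates move under resampling of a single sample.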
Original language | English |
---|---|
Pages (from-to) | 802-810 |
Number of pages | 9 |
Journal | European Journal of Human Genetics |
Volume | 17 |
Issue number | 6 |
Publication status | Published - 7 Jan 2009 |
Externally published | Yes |
Bibliographical note
Funding Information: We would like to acknowledge support from NWO: genetic basis of anxiety and depression (904-61-090); resolving cause and effect in the association between exercise and well-being (904-61-193); twin-family database for behavior genomics studies (480-04-004); twin research focusing on behavior (400-05-717); Center for Medical Systems Biology (NWO Genomics); Spinozapremie (SPI 56-464-14192); NWO-VI016-065-318; Centre for Neurogenomics and Cognitive Research (CNCR-VU); genomewide analyses of European twin and population cohorts (EU/QLRT-2001-01254); genome scan for neuroticism (NIMH R01 MH059160); Geestkracht program of ZonMW (10-000-1002); matching funds from universities and mental health care institutes involved in NESDA (GGZ Buitenamstel-Geestgronden, Rivierduinen, University Medical Center Groningen, GGZ Lentis, GGZ Friesland, GGZ Drenthe). Major funding for this project is from the Genetic Association Information Network of the Foundation for the US National Institutes of Health, a public–private partnership between the NIH and Pfizer Inc., Affymetrix Inc. and Abbott Laboratories. Genetic Cluster Computer is financially supported by the Netherlands Scientific Organization (NWO 480-05-003). Stichting Nationale Computerfaciliteiten – NCF (SH-104-08 Grant) is also acknowledged.