An evaluation of the MiDCoP method for imputing allele frequency in genome wide association studies

Yadu Gautam, Carl Lee, Chin I. Cheng, Carl Langefeld

Research output: Contribution to journal › Article › peer-review

Abstract

A genome-wide association study requires genotyping the DNA sequence of a large sample of individuals with and without the specific disease of interest. Current genotyping technologies cover only a limited portion of each individual's DNA sequence. As a result, a large fraction of single nucleotide polymorphisms (SNPs) are not genotyped. Existing imputation methods rely on individual-level data and are often time-consuming and costly. A new method, the Minimum Deviation of Conditional Probability (MiDCoP), was recently developed to impute the allele frequencies of missing SNPs from the allele frequencies of neighboring SNPs, without using individual-level SNP information. This article evaluates the performance of the MiDCoP approach through association analysis based on imputed allele frequencies, using the GAIN Schizophrenia data. The results indicate that the choice of reference sets has a strong impact on performance. Imputation accuracy improves when the case and control data sets are each imputed using a separate, better-matched reference set.

Original language: English
Pages (from-to): 57-67
Number of pages: 11
Journal: Studies in Computational Intelligence
Volume: 569
State: Published - 2015

Keywords

  • Association Tests
  • Conditional Probability
  • Imputation
  • Minimum Deviation
  • Multilocus Information Measure
  • Single Nucleotide Polymorphisms
