The allele frequencies of single nucleotide polymorphisms (SNPs) are important summary statistics in case-control genome-wide association studies (GWASs): they are used to compute test statistics and allelic odds ratios, which in turn are used to identify significant SNPs or to perform meta-analysis. Because of time and cost constraints, a large fraction of known SNPs are not genotyped on the platforms used in most GWASs. Methods that impute untyped SNPs from individual-level genotype data are powerful tools. However, these methods are computationally expensive and cannot be applied when only summary-level information, such as allele frequencies, is available. In this study, we propose an approach that imputes the allele frequency of an untyped SNP in the sample using only the allele frequencies of the most informative pair of SNPs. We apply and compare five multilocus information measures for selecting that most informative pair. Our approach is simple, yet highly accurate in estimating the allele frequencies of untyped SNPs.
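To illustrate why allele frequencies alone are useful summary information, the following minimal sketch computes an allelic odds ratio and a standard 1-degree-of-freedom Pearson chi-square statistic from case and control allele frequencies. It is not the imputation method proposed in this study; the function names, the sample sizes, and the assumption of two alleles per individual (diploid samples) are illustrative choices rather than details taken from the paper.

```python
def allelic_odds_ratio(p_case, p_control):
    """Odds of carrying the allele in cases divided by the odds in controls."""
    return (p_case / (1.0 - p_case)) / (p_control / (1.0 - p_control))


def allelic_chi_square(p_case, p_control, n_case, n_control):
    """Pearson chi-square (1 df) on the 2x2 table of allele counts.

    n_case and n_control are numbers of individuals; each individual is
    assumed to contribute two alleles (an assumption of this sketch).
    """
    a = 2 * n_case * p_case                # allele count in cases
    b = 2 * n_case * (1.0 - p_case)        # other allele in cases
    c = 2 * n_control * p_control          # allele count in controls
    d = 2 * n_control * (1.0 - p_control)  # other allele in controls
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


# Hypothetical frequencies for a single SNP, for illustration only.
odds_ratio = allelic_odds_ratio(0.6, 0.5)
chi_square = allelic_chi_square(0.6, 0.5, n_case=100, n_control=100)
```

Because both quantities depend only on allele frequencies and sample sizes, an imputed allele frequency for an untyped SNP could in principle be passed to the same functions.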
|Title of host publication||Proceedings - 3rd International Conference on Applied Computing and Information Technology and 2nd International Conference on Computational Science and Intelligence, ACIT-CSI 2015|
|Publisher||Institute of Electrical and Electronics Engineers Inc.|
|State||Published - Nov 23 2015|