A study on studies: Exploring the metadata associated with dbGaP studies

Karen Truong, Mike Conway

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

1 Scopus citation

Abstract

The database of Genotypes and Phenotypes (dbGaP) was developed by the National Heart, Lung, and Blood Institute (NHLBI) to archive genome-wide association study (GWAS) data. As of July 17, 2012, dbGaP contained 305 top-level studies. The metadata for each study (available from the dbGaP website) are organized into distinct sections, including a study description, inclusion/exclusion criteria, policies for authorized access requests, MeSH terms, PubMed identifiers, study histories, and the names of principal and co-investigators. Here we tabulate the salient characteristics of dbGaP metadata as part of the Phenotype Discoverer (PhD) project, a research project at the University of California, San Diego Division of Biomedical Informatics that aims to enhance the "searchability" of the current dbGaP website by aligning phenotypes to a standard information model. In particular, we are interested in using the extracted metadata (PubMed identifiers, principal investigator names, associated journal names, etc.) as input to a statistical text classification algorithm, which will allow us to assign new dbGaP studies to predetermined classes (e.g. heart, lung, blood, sleep) and improve the dbGaP user experience. This abstract reports the results of our analysis of current dbGaP metadata.

Original language: English
Title of host publication: Proceedings - 2012 IEEE 2nd Conference on Healthcare Informatics, Imaging and Systems Biology, HISB 2012
Pages: 126
Number of pages: 1
State: Published - 2012
Externally published: Yes
Event: 2012 IEEE 2nd Conference on Healthcare Informatics, Imaging and Systems Biology, HISB 2012 - San Diego, CA, United States
Duration: Sep 27 2012 → Sep 28 2012

Publication series

Name: Proceedings - 2012 IEEE 2nd Conference on Healthcare Informatics, Imaging and Systems Biology, HISB 2012

Conference

Conference: 2012 IEEE 2nd Conference on Healthcare Informatics, Imaging and Systems Biology, HISB 2012
Country/Territory: United States
City: San Diego, CA
Period: 09/27/12 → 09/28/12
