Identifying synonymy between SNOMED clinical terms of varying length using distributional analysis of electronic health records.

Aron Henriksson, Mike Conway, Martin Duneld, Wendy W. Chapman

Research output: Contribution to journal › Article › peer-review

23 Scopus citations

Abstract

Medical terminologies and ontologies are important tools for natural language processing of health record narratives. To account for the variability of language use, synonyms need to be stored in a semantic resource as textual instantiations of a concept. Developing such resources manually is, however, prohibitively expensive and likely to result in low coverage. To facilitate and expedite the process of lexical resource development, distributional analysis of large corpora provides a powerful data-driven means of (semi-)automatically identifying semantic relations, including synonymy, between terms. In this paper, we demonstrate how distributional analysis of a large corpus of electronic health records - the MIMIC-II database - can be employed to extract synonyms of SNOMED CT preferred terms. A distinctive feature of our method is its ability to identify synonymous relations between terms of varying length.
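As a rough illustration of the general idea in the abstract, the sketch below ranks candidate synonyms by comparing the contexts in which terms occur, and handles terms of varying length by merging known multiword terms into single tokens first. It is not the authors' method: it uses plain co-occurrence counts and cosine similarity, and the toy sentences, the term list, and every function name are assumptions made for this example rather than MIMIC-II or SNOMED CT data.

```python
from collections import Counter, defaultdict
from math import sqrt


def merge_terms(tokens, terms):
    """Join known multiword terms into single tokens, longest match first."""
    max_len = max((len(t) for t in terms), default=1)
    out, i = [], 0
    while i < len(tokens):
        for n in range(max_len, 0, -1):
            cand = tuple(tokens[i:i + n])
            # A single token always matches, so the scan always advances.
            if len(cand) == n and (n == 1 or cand in terms):
                out.append("_".join(cand))
                i += n
                break
    return out


def context_vectors(sentences, window=2):
    """Count, for each token, the tokens seen within `window` positions."""
    vectors = defaultdict(Counter)
    for sent in sentences:
        for i, word in enumerate(sent):
            lo, hi = max(0, i - window), min(len(sent), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    vectors[word][sent[j]] += 1
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[k] * v[k] for k in u if k in v)
    norm = sqrt(sum(x * x for x in u.values())) * sqrt(sum(x * x for x in v.values()))
    return dot / norm if norm else 0.0


def nearest(term, vectors, k=3):
    """Return the k tokens whose context vectors are closest to `term`."""
    target = vectors.get(term, Counter())
    scores = [(cosine(target, vec), other)
              for other, vec in vectors.items() if other != term]
    return sorted(scores, reverse=True)[:k]


# Hypothetical toy data, invented for this sketch.
terms = {("heart", "attack"), ("myocardial", "infarction")}
raw = [
    "patient had a heart attack last year",
    "patient had a myocardial infarction last year",
    "patient had a fever last night",
]
sentences = [merge_terms(s.split(), terms) for s in raw]
vectors = context_vectors(sentences)
top = nearest("heart_attack", vectors, k=3)
for score, candidate in top:
    print(f"{candidate}\t{score:.2f}")
```

On this toy input the two multiword terms share identical contexts, so each ranks as the other's closest neighbour. On a real clinical corpus the ranked list would only be a set of candidates for manual review.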

Original language: English
Pages (from-to): 600-609
Number of pages: 10
Journal: AMIA ... Annual Symposium proceedings. AMIA Symposium
Volume: 2013
State: Published - 2013
Externally published: Yes
