Location bias of identifiers in clinical narratives.

Title: Location bias of identifiers in clinical narratives.
Publication Type: Journal Article
Year of Publication: 2013
Authors: Hanauer, DA; Mei, Q; Malin, B; Zheng, K
Journal: AMIA Annu Symp Proc
Date Published: 2013
Keywords: Computer Security; Confidentiality; Electronic Health Records; Health Insurance Portability and Accountability Act; Humans; Medical Records Systems, Computerized; Narration; United States

Scrubbing identifying information from narrative clinical documents is a critical first step in preparing the data for secondary uses such as translational research. Evidence suggests that the differential distribution of protected health information (PHI) in clinical documents could be used as an additional feature to improve the performance of automated de-identification algorithms or toolkits. However, there has been little investigation into the extent to which such phenomena occur in practice. To empirically assess this issue, we identified the location of PHI in 140,000 clinical notes from an electronic health record system and characterized the distribution as a function of location in a document. In addition, we calculated the 'word proximity' of nearby PHI elements to determine their co-occurrence rates. The PHI elements were found to have non-random distribution patterns. Location within a document and proximity between PHI elements might therefore be used to help de-identification systems better label PHI.
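
The two measurements named in the abstract, PHI location within a document and word proximity between PHI elements, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the binning scheme, the `max_distance` window, and the example note are all assumptions made for illustration, and the PHI annotations are assumed to be already available as (token index, PHI type) pairs.

```python
from collections import Counter


def relative_positions(num_tokens, phi_indices, bins=10):
    """Histogram PHI token positions into equal-width bins of document length.

    Hypothetical helper: bin 0 is the start of the note, bin `bins - 1` the end.
    """
    counts = [0] * bins
    for i in phi_indices:
        # Map token index i in [0, num_tokens) to a bin in [0, bins).
        counts[min(i * bins // num_tokens, bins - 1)] += 1
    return counts


def proximity_counts(phi_spans, max_distance=5):
    """Count PHI type pairs whose tokens lie within `max_distance` words.

    `phi_spans` is a list of (token_index, phi_type) tuples. The window size
    is an assumed parameter, not a value taken from the paper.
    """
    pairs = Counter()
    ordered = sorted(phi_spans)
    for a in range(len(ordered)):
        pos_a, type_a = ordered[a]
        for b in range(a + 1, len(ordered)):
            pos_b, type_b = ordered[b]
            if pos_b - pos_a > max_distance:
                break  # spans are sorted, so later ones are even farther away
            pairs[(type_a, type_b)] += 1
    return pairs


# Invented example note with two hand-labeled PHI tokens.
tokens = "Seen by Dr. Smith on 01/02/2013 for follow-up at clinic today .".split()
phi = [(3, "NAME"), (5, "DATE")]

location_hist = relative_positions(len(tokens), [i for i, _ in phi], bins=4)
nearby = proximity_counts(phi, max_distance=5)
```

Aggregating such histograms and pair counts over a whole corpus would give the kind of location and co-occurrence statistics the abstract describes, which could then be supplied as features to a de-identification model.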

Alternate Journal: AMIA Annu Symp Proc
PubMed ID: 24551358
PubMed Central ID: PMC3900199
Grant List: 1R01LM011366 / LM / NLM NIH HHS / United States
1U01HG006385 / HG / NHGRI NIH HHS / United States
UL1TR000433 / TR / NCATS NIH HHS / United States
David Hanauer
University of Michigan Comprehensive Cancer Center at North Campus Research Complex
1600 Huron Parkway, Bldg 100, Rm 100 
Mailing Address: 2800 Plymouth Rd, NCRC 100-1004
Ann Arbor, MI 48109-2800 
Ph. (734) 764-8848 Fax. (734) 615-0517
Please acknowledge the Cancer Center Support Grant (P30 CA046592) when publishing manuscripts or abstracts that utilized the services of the University of Michigan's Comprehensive Cancer Center's Shared Resource: Cancer Informatics.
Suggested language: "Research reported in this [publication/press release] was supported by the National Cancer Institute of the National Institutes of Health under award number P30CA046592."