J Biomed Inform - Text de-identification for privacy protection: a study of its impact on clinical text information content.

Topics

{ extract(1171) text(1153) clinic(932) }
{ clinic(1479) use(1117) guidelin(835) }
{ algorithm(1844) comput(1787) effici(935) }
{ visual(1396) interact(850) tool(830) }
{ inform(2794) health(2639) internet(1427) }
{ chang(1828) time(1643) increas(1301) }
{ process(1125) use(805) approach(778) }
{ system(1976) rule(880) can(841) }
{ studi(1410) differ(1259) use(1210) }
{ patient(2315) diseas(1263) diabet(1191) }
{ system(1050) medic(1026) inform(1018) }
{ ehr(2073) health(1662) electron(1139) }
{ sampl(1606) size(1419) use(1276) }
{ concept(1167) ontolog(924) domain(897) }
{ data(1714) softwar(1251) tool(1186) }
{ import(1318) role(1303) understand(862) }
{ record(1888) medic(1808) patient(1693) }
{ research(1218) medic(880) student(794) }
{ high(1669) rate(1365) level(1280) }
{ survey(1388) particip(1329) question(1065) }
{ activ(1452) weight(1219) physic(1104) }
{ imag(1057) registr(996) error(939) }
{ take(945) account(800) differ(722) }
{ surgeri(1148) surgic(1085) robot(1054) }
{ problem(2511) optim(1539) algorithm(950) }
{ general(901) number(790) one(736) }
{ data(3963) clinic(1234) research(1004) }
{ perform(1367) use(1326) method(1137) }
{ studi(1119) effect(1106) posit(819) }
{ patient(2837) hospit(1953) medic(668) }
{ age(1611) year(1155) adult(843) }
{ medic(1828) order(1363) alert(1069) }
{ gene(2352) biolog(1181) express(1162) }
{ first(2504) two(1366) second(1323) }
{ use(976) code(926) identifi(902) }
{ result(1111) use(1088) new(759) }
{ model(3404) distribut(989) bayesian(671) }
{ can(774) often(719) complex(702) }
{ imag(1947) propos(1133) code(1026) }
{ data(1737) use(1416) pattern(1282) }
{ measur(2081) correl(1212) valu(896) }
{ bind(1733) structur(1185) ligand(1036) }
{ sequenc(1873) structur(1644) protein(1328) }
{ method(1219) similar(1157) match(930) }
{ featur(3375) classif(2383) classifi(1994) }
{ imag(2830) propos(1344) filter(1198) }
{ network(2748) neural(1063) input(814) }
{ imag(2675) segment(2577) method(1081) }
{ studi(2440) review(1878) systemat(933) }
{ motion(1329) object(1292) video(1091) }
{ assess(1506) score(1403) qualiti(1306) }
{ treatment(1704) effect(941) patient(846) }
{ framework(1458) process(801) describ(734) }
{ error(1145) method(1030) estim(1020) }
{ learn(2355) train(1041) set(1003) }
{ method(1557) propos(1049) approach(1037) }
{ design(1359) user(1324) use(1319) }
{ control(1307) perform(991) simul(935) }
{ model(2220) cell(1177) simul(1124) }
{ care(1570) inform(1187) nurs(1089) }
{ method(984) reconstruct(947) comput(926) }
{ search(2224) databas(1162) retriev(909) }
{ featur(1941) imag(1645) propos(1176) }
{ case(1353) use(1143) diagnosi(1136) }
{ howev(809) still(633) remain(590) }
{ risk(3053) factor(974) diseas(938) }
{ perform(999) metric(946) measur(919) }
{ research(1085) discuss(1038) issu(1018) }
{ model(2341) predict(2261) use(1141) }
{ compound(1573) activ(1297) structur(1058) }
{ blood(1257) pressur(1144) flow(957) }
{ spatial(1525) area(1432) region(1030) }
{ health(3367) inform(1360) care(1135) }
{ model(3480) simul(1196) paramet(876) }
{ monitor(1329) mobil(1314) devic(1160) }
{ state(1844) use(1261) util(961) }
{ model(2656) set(1616) predict(1553) }
{ data(2317) use(1299) case(1017) }
{ signal(2180) analysi(812) frequenc(800) }
{ cost(1906) reduc(1198) effect(832) }
{ group(2977) signific(1463) compar(1072) }
{ data(3008) multipl(1320) sourc(1022) }
{ intervent(3218) particip(2042) group(1664) }
{ activ(1138) subject(705) human(624) }
{ time(1939) patient(1703) rate(768) }
{ patient(1821) servic(1111) care(1106) }
{ use(2086) technolog(871) perceiv(783) }
{ can(981) present(881) function(850) }
{ analysi(2126) use(1163) compon(1037) }
{ health(1844) social(1437) communiti(874) }
{ structur(1116) can(940) graph(676) }
{ cancer(2502) breast(956) screen(824) }
{ use(1733) differ(960) four(931) }
{ drug(1928) target(777) effect(648) }
{ implement(1333) system(1263) develop(1122) }
{ estim(2440) model(1874) function(577) }
{ decis(3086) make(1611) patient(1517) }
{ method(1969) cluster(1462) data(1082) }
{ method(2212) result(1239) propos(1039) }
{ detect(2391) sensit(1101) algorithm(908) }

Abstract

As more electronic clinical information becomes easier to access for secondary uses such as clinical research, approaches that enable faster and more collaborative research while protecting patient privacy and confidentiality are becoming more important. Clinical text de-identification offers such advantages but is typically a tedious manual process. Automated Natural Language Processing (NLP) methods can alleviate this process, but their impact on subsequent uses of the automatically de-identified clinical narratives has barely been investigated. In the context of a larger project to develop and investigate automated text de-identification for Veterans Health Administration (VHA) clinical notes, we studied the impact of automated text de-identification on clinical information in a stepwise manner. Our approach started with a high-level assessment of clinical note informativeness and formatting, and ended with a detailed study of the overlap between select clinical information types and Protected Health Information (PHI). To investigate the informativeness (i.e., document type information, select clinical data types, and interpretation or conclusion) of VHA clinical notes, we used five different existing text de-identification systems. The informativeness was only minimally altered by these systems, while formatting was modified by only one system. To examine the impact of de-identification on clinical information extraction, we compared counts of SNOMED-CT concepts found by an open source information extraction application in the original (i.e., not de-identified) version of a corpus of VHA clinical notes, and in the same corpus after de-identification. Only about 1.2-3% fewer SNOMED-CT concepts were found in de-identified versions of our corpus, and many of these concepts were PHI that was erroneously identified as clinical information.
To study this impact in more detail and assess how generalizable our findings were, we examined the overlap between select clinical information annotated in the 2010 i2b2 NLP challenge corpus and automatic PHI annotations from our best-of-breed VHA clinical text de-identification system (nicknamed 'BoB'). Overall, only 0.81% of the clinical information exactly overlapped with PHI, and 1.78% partly overlapped. We conclude that the impact of automated text de-identification on clinical information is small, but not negligible, and that improved disambiguation of clinical acronyms and eponyms could significantly reduce this impact.
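
The exact and partial overlap figures above could be computed along the following lines. This is a minimal sketch, not the authors' method: the abstract does not define "exact" or "partly" overlapped, so the code assumes annotations are half-open character-offset spans, that "exact" means identical boundaries, and that "partial" means any other intersection. The names `classify_overlap` and `overlap_rates` and the sample spans are hypothetical.

```python
from typing import Dict, List, Tuple

# Assumption: an annotation is a (start, end) character span, end exclusive.
Span = Tuple[int, int]


def classify_overlap(clinical: Span, phi_spans: List[Span]) -> str:
    """Return 'exact', 'partial', or 'none' for one clinical annotation."""
    result = "none"
    for start, end in phi_spans:
        if (start, end) == clinical:
            # Identical boundaries: the strongest category, so stop here.
            return "exact"
        # Two half-open intervals intersect when each starts before the other ends.
        if start < clinical[1] and clinical[0] < end:
            result = "partial"
    return result


def overlap_rates(clinical_spans: List[Span], phi_spans: List[Span]) -> Dict[str, float]:
    """Percentage of clinical annotations in each overlap category."""
    counts = {"exact": 0, "partial": 0, "none": 0}
    for span in clinical_spans:
        counts[classify_overlap(span, phi_spans)] += 1
    total = len(clinical_spans)
    return {k: (100.0 * v / total if total else 0.0) for k, v in counts.items()}


# Hypothetical toy data: four clinical annotations, two PHI annotations.
clinical_example = [(0, 5), (10, 20), (30, 40), (50, 60)]
phi_example = [(0, 5), (15, 25)]
rates = overlap_rates(clinical_example, phi_example)
```

On the toy data, one clinical span matches a PHI span exactly, one intersects a PHI span without matching it, and two are untouched. A real evaluation would also need to compare spans only within the same document, which this sketch leaves out.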

Cleaned Abstract

electron clinic inform becom easier access secondari use clinic research approach enabl faster collabor research protect patient privaci confidenti becom import clinic text deidentif offer advantag typic tedious manual process autom natur languag process nlp method can allevi process impact subsequ use automat deidentifi clinic narrat bare investig context larger project develop investig autom text deidentif veteran health administr vha clinic note studi impact autom text deidentif clinic inform stepwis manner approach start highlevel assess clinic note inform format end detail studi overlap select clinic inform type protect health inform phi investig inform ie document type inform select clinic data type interpret conclus vha clinic note use five differ exist text deidentif system inform minim alter system format modifi one system examin impact deidentif clinic inform extract compar count snomedct concept found open sourc inform extract applic origin ie deidentifi version corpus vha clinic note corpus deidentif less snomedct concept found deidentifi version corpus mani concept phi erron identifi clinic inform studi impact detail assess generaliz find examin overlap select clinic inform annot ib nlp challeng corpus automat phi annot bestofbre vha clinic text deidentif system nicknam bob overal clinic inform exact overlap phi part overlap conclud autom text deidentif impact clinic inform small neglig improv clinic acronym eponym disambigu signific reduc impact

Similar Abstracts

J Biomed Inform - Evaluating the effects of machine pre-annotation and an interactive annotation interface on manual de-identification of clinical text. ( 0,926807135572703 )
J Am Med Inform Assoc - Syntactic parsing of clinical text: guideline and corpus development with handling ill-formed sentences. ( 0,900765399158391 )
J Am Med Inform Assoc - Anaphoric relations in the clinical narrative: corpus creation. ( 0,878768151529098 )
J. Med. Internet Res. - Web 2.0-based crowdsourcing for high-quality gold standard development in clinical natural language processing. ( 0,868760427744565 )
Int J Med Inform - Detecting temporal expressions in medical narratives. ( 0,86403251067257 )
J Am Med Inform Assoc - MedXN: an open source medication extraction and normalization tool for clinical text. ( 0,845718419662335 )
Appl Clin Inform - Representation of information about family relatives as structured data in electronic health records. ( 0,838861362393742 )
J Am Med Inform Assoc - Combining rules and machine learning for extraction of temporal expressions and events from clinical narratives. ( 0,835359250988428 )
J Biomed Inform - Lexical patterns, features and knowledge resources for coreference resolution in clinical notes. ( 0,833689890693513 )
AMIA Annu Symp Proc - Inter-annotator reliability of medical events, coreferences and temporal relations in clinical narratives by annotators with varying levels of clinical expertise. ( 0,83338138541213 )
J Am Med Inform Assoc - An end-to-end system to identify temporal relation in discharge summaries: 2012 i2b2 challenge. ( 0,829525858012776 )
J Am Med Inform Assoc - Eventual situations for timeline extraction from clinical reports. ( 0,823661361171618 )
J Biomed Inform - Anaphoric reference in clinical reports: characteristics of an annotated corpus. ( 0,822591241390052 )
J Biomed Inform - A concept-driven biomedical knowledge extraction and visualization framework for conceptualization of text corpora. ( 0,820319952622933 )
AMIA Annu Symp Proc - Throw the bath water out, keep the baby: keeping medically-relevant terms for text mining. ( 0,819169620818336 )
AMIA Annu Symp Proc - Automatically pairing measured findings across narrative abdomen CT reports. ( 0,816848737487872 )
J Am Med Inform Assoc - A sense inventory for clinical abbreviations and acronyms created using clinical notes and medical dictionary resources. ( 0,814410920025151 )
J Biomed Inform - MedTime: a temporal information extraction system for clinical narratives. ( 0,814125545984542 )
J Biomed Inform - Development and evaluation of RapTAT: a machine learning system for concept mapping of phrases from medical narratives. ( 0,812501133744107 )
Artif Intell Med - Biomedical events extraction using the hidden vector state model. ( 0,808995484045598 )
J Am Med Inform Assoc - Knowledge-based biomedical word sense disambiguation: an evaluation and application to clinical document classification. ( 0,808530085797859 )
J Am Med Inform Assoc - Assessing the role of a medication-indication resource in the treatment relation extraction from clinical text. ( 0,80774636884298 )
AMIA Annu Symp Proc - Natural language processing for lines and devices in portable chest x-rays. ( 0,806579229925375 )
J Biomed Inform - Semi-automatic semantic annotation of PubMed queries: a study on quality, efficiency, satisfaction. ( 0,805553459500638 )
J Biomed Inform - Towards generating a patient's timeline: extracting temporal relationships from clinical notes. ( 0,802549340495921 )
J Biomed Inform - Annotating temporal information in clinical narratives. ( 0,801843986576768 )
J Am Med Inform Assoc - A hybrid system for temporal information extraction from clinical text. ( 0,801631755727582 )
Comput Math Methods Med - Ranking biomedical annotations with annotator's semantic relevancy. ( 0,798895418415507 )
J Am Med Inform Assoc - Evaluating temporal relations in clinical text: 2012 i2b2 Challenge. ( 0,795227790622233 )
AMIA Annu Symp Proc - Automated illustration of patients instructions. ( 0,794891801186743 )
J Am Med Inform Assoc - Towards comprehensive syntactic and semantic annotations of the clinical narrative. ( 0,791501411063643 )
J Biomed Inform - Identifying non-elliptical entity mentions in a coordinated NP with ellipses. ( 0,790072413419569 )
J Med Syst - Redactable signatures for signed CDA Documents. ( 0,788700170272107 )
AMIA Annu Symp Proc - Towards a semantic lexicon for clinical natural language processing. ( 0,788582384830103 )
J Am Med Inform Assoc - Automatic discourse connective detection in biomedical text. ( 0,788205777152205 )
AMIA Annu Symp Proc - Semantic processing to identify adverse drug event information from black box warnings. ( 0,787653285261839 )
J Am Med Inform Assoc - Assisted annotation of medical free text using RapTAT. ( 0,786991798878617 )
Brief. Bioinformatics - A survey on annotation tools for the biomedical literature. ( 0,785693477273741 )
AMIA Annu Symp Proc - Mapping annotations with textual evidence using an scLDA model. ( 0,783101593605181 )
J Biomed Inform - The DDI corpus: an annotated corpus with pharmacological substances and drug-drug interactions. ( 0,777935253748098 )
AMIA Annu Symp Proc - Building gold standard corpora for medical natural language processing tasks. ( 0,773695077558248 )
J Am Med Inform Assoc - Evaluating the impact of pre-annotation on annotation speed and potential bias: natural language processing gold standard development for clinical named entity recognition in clinical trial announcements. ( 0,772355074143178 )
Comput Methods Programs Biomed - BioAnnote: a software platform for annotating biomedical documents with application in medical learning environments. ( 0,770842462043559 )
AMIA Annu Symp Proc - Pharmacovigilance on twitter? Mining tweets for adverse drug reactions. ( 0,768962813639331 )
J Integr Bioinform - Automatic extraction of microorganisms and their habitats from free text using text mining workflows. ( 0,768210451769055 )
AMIA Annu Symp Proc - Voice-dictated versus typed-in clinician notes: linguistic properties and the potential implications on natural language processing. ( 0,766409358913287 )
J Biomed Inform - NCBI disease corpus: a resource for disease name recognition and concept normalization. ( 0,765825672362393 )
AMIA Annu Symp Proc - Extracting Concepts Related to Homelessness from the Free Text of VA Electronic Medical Records. ( 0,764208983891428 )
J Am Med Inform Assoc - MITRE system for clinical assertion status classification. ( 0,763694685336377 )
J Biomed Inform - Extraction of events and temporal expressions from clinical narratives. ( 0,761669153335278 )
J Biomed Inform - UMLS content views appropriate for NLP processing of the biomedical literature vs. clinical text. ( 0,761573431159598 )
J Am Med Inform Assoc - 2010 i2b2/VA challenge on concepts, assertions, and relations in clinical text. ( 0,758943057023465 )
J Biomed Inform - A new clustering method for detecting rare senses of abbreviations in clinical notes. ( 0,754554741058574 )
J Biomed Inform - Automatically extracting information needs from complex clinical questions. ( 0,753814945699077 )
AMIA Annu Symp Proc - Natural language processing to extract follow-up provider information from hospital discharge summaries. ( 0,752483640238224 )
J Am Med Inform Assoc - A flexible framework for recognizing events, temporal expressions, and temporal relations in clinical text. ( 0,751904622649603 )
Int J Med Inform - Detection of infectious symptoms from VA emergency department and primary care clinical documentation. ( 0,750378786715517 )
J Am Med Inform Assoc - The effect of word familiarity on actual and perceived text difficulty. ( 0,748183246769717 )
J Am Med Inform Assoc - Using rule-based natural language processing to improve disease normalization in biomedical text. ( 0,748129741529019 )
Telemed J E Health - Information extraction for tracking liver cancer patients' statuses: from mixture of clinical narrative report types. ( 0,746316563237028 )
J Biomed Inform - Desiderata for ontologies to be used in semantic annotation of biomedical documents. ( 0,745517619445594 )
Sci Data - Building the graph of medicine from millions of clinical narratives. ( 0,745030474514786 )
J Biomed Inform - Lessons learnt from the DDIExtraction-2013 Shared Task. ( 0,742404892671693 )
AMIA Annu Symp Proc - Critical finding capture in the impression section of radiology reports. ( 0,742120799403709 )
Int J Med Inform - Bootstrapping a de-identification system for narrative patient records: cost-performance tradeoffs. ( 0,740984055080178 )
AMIA Annu Symp Proc - Sophia: A Expedient UMLS Concept Extraction Annotator. ( 0,739702080862075 )
J Am Med Inform Assoc - Large-scale evaluation of automated clinical note de-identification and its impact on information extraction. ( 0,738618765138401 )
J Am Med Inform Assoc - A study of machine-learning-based approaches to extract clinical entities and their assertions from discharge summaries. ( 0,735653995934651 )
AMIA Annu Symp Proc - EpiDEA: extracting structured epilepsy and seizure information from patient discharge summaries for cohort identification. ( 0,735365531109339 )
BMC Med Inform Decis Mak - Text summarization as a decision support aid. ( 0,734332615689182 )
J Am Med Inform Assoc - A knowledge discovery and reuse pipeline for information extraction in clinical notes. ( 0,733183400643201 )
J Am Med Inform Assoc - Extracting drug indication information from structured product labels using natural language processing. ( 0,733084012516714 )
J Med Syst - Experiences with a PDA-based documentation system in clinical research. ( 0,731523893331306 )
AMIA Annu Symp Proc - Developing a section labeler for clinical documents. ( 0,730446536668961 )
J Biomed Inform - Relation mining experiments in the pharmacogenomics domain. ( 0,729804321152563 )
J Biomed Inform - The EU-ADR corpus: annotated drugs, diseases, targets, and their relationships. ( 0,726175548507925 )
J Biomed Inform - Text summarization in the biomedical domain: a systematic review of recent research. ( 0,724613070968952 )
AMIA Annu Symp Proc - Extracting semantic lexicons from discharge summaries using machine learning and the C-Value method. ( 0,724183875011316 )
J Am Med Inform Assoc - Comparison of a semi-automatic annotation tool and a natural language processing application for the generation of clinical statement entries. ( 0,722278569205259 )
AMIA Annu Symp Proc - Detecting abbreviations in discharge summaries using machine learning methods. ( 0,71956655851883 )
J Biomed Inform - Semantator: semantic annotator for converting biomedical text to linked data. ( 0,718314562122839 )
AMIA Annu Symp Proc - TagLine: Information Extraction for Semi-Structured Text in Medical Progress Notes. ( 0,716420657887797 )
AMIA Annu Symp Proc - A comparative study of current Clinical Natural Language Processing systems on handling abbreviations in discharge summaries. ( 0,716252846295649 )
AMIA Annu Symp Proc - Qualitative analysis of workflow modifications used to generate the reference standard for the 2010 i2b2/VA challenge. ( 0,715105875926062 )
J Chem Inf Model - Automated extraction of information on chemical-P-glycoprotein interactions from the literature. ( 0,713810804456317 )
Int J Med Inform - A methodology to enhance spatial understanding of disease outbreak events reported in news articles. ( 0,711731621424033 )
AMIA Annu Symp Proc - It's about this and that: a description of anaphoric expressions in clinical text. ( 0,710492963610038 )
AMIA Annu Symp Proc - Semantic processing to identify adverse drug event information from black box warnings. ( 0,709883702578905 )
J Biomed Inform - Secondary use of electronic health records for building cohort studies through top-down information extraction. ( 0,707361315610959 )
J Am Med Inform Assoc - Functional evaluation of out-of-the-box text-mining tools for data-mining tasks. ( 0,707212075392821 )
AMIA Annu Symp Proc - A cloud-based approach to medical NLP. ( 0,706944246143977 )
AMIA Annu Symp Proc - Generalizability and comparison of automatic clinical text de-identification methods and resources. ( 0,706299921759505 )
J Biomed Inform - Ontology modularization to improve semantic medical image annotation. ( 0,703027537229552 )
Perspect Health Inf Manag - A comparison of two approaches to text processing: facilitating chart reviews of radiology reports in electronic medical records. ( 0,702560427004221 )
J Biomed Inform - Using an ensemble system to improve concept extraction from clinical records. ( 0,702081673172654 )
AMIA Annu Symp Proc - Active Learning-based corpus annotation--the PathoJen experience. ( 0,701021784484892 )
J Biomed Inform - Evaluating measures of semantic similarity and relatedness to disambiguate terms in biomedical text. ( 0,700309160506823 )
J. Med. Internet Res. - Evaluating a web-based clinical decision support system for language disorders screening in a nursery school. ( 0,700232723714149 )
J Am Med Inform Assoc - Temporal reasoning over clinical text: the state of the art. ( 0,700009602310115 )
AMIA Annu Symp Proc - Extracting patient demographics and personal medical information from online health forums. ( 0,699427335243548 )