Int J Med Inform - Bootstrapping a de-identification system for narrative patient records: cost-performance tradeoffs.

Topics

{ extract(1171) text(1153) clinic(932) }
{ control(1307) perform(991) simul(935) }
{ care(1570) inform(1187) nurs(1089) }
{ model(3404) distribut(989) bayesian(671) }
{ data(3008) multipl(1320) sourc(1022) }
{ can(981) present(881) function(850) }
{ health(1844) social(1437) communiti(874) }
{ framework(1458) process(801) describ(734) }
{ studi(1410) differ(1259) use(1210) }
{ perform(999) metric(946) measur(919) }
{ first(2504) two(1366) second(1323) }
{ time(1939) patient(1703) rate(768) }
{ use(1733) differ(960) four(931) }
{ measur(2081) correl(1212) valu(896) }
{ surgeri(1148) surgic(1085) robot(1054) }
{ concept(1167) ontolog(924) domain(897) }
{ system(1050) medic(1026) inform(1018) }
{ activ(1138) subject(705) human(624) }
{ structur(1116) can(940) graph(676) }
{ process(1125) use(805) approach(778) }
{ activ(1452) weight(1219) physic(1104) }
{ method(1969) cluster(1462) data(1082) }
{ featur(3375) classif(2383) classifi(1994) }
{ problem(2511) optim(1539) algorithm(950) }
{ algorithm(1844) comput(1787) effici(935) }
{ general(901) number(790) one(736) }
{ method(984) reconstruct(947) comput(926) }
{ howev(809) still(633) remain(590) }
{ risk(3053) factor(974) diseas(938) }
{ research(1085) discuss(1038) issu(1018) }
{ compound(1573) activ(1297) structur(1058) }
{ blood(1257) pressur(1144) flow(957) }
{ spatial(1525) area(1432) region(1030) }
{ state(1844) use(1261) util(961) }
{ patient(2837) hospit(1953) medic(668) }
{ model(2656) set(1616) predict(1553) }
{ result(1111) use(1088) new(759) }
{ decis(3086) make(1611) patient(1517) }
{ can(774) often(719) complex(702) }
{ imag(1947) propos(1133) code(1026) }
{ data(1737) use(1416) pattern(1282) }
{ inform(2794) health(2639) internet(1427) }
{ system(1976) rule(880) can(841) }
{ imag(1057) registr(996) error(939) }
{ bind(1733) structur(1185) ligand(1036) }
{ sequenc(1873) structur(1644) protein(1328) }
{ method(1219) similar(1157) match(930) }
{ imag(2830) propos(1344) filter(1198) }
{ network(2748) neural(1063) input(814) }
{ imag(2675) segment(2577) method(1081) }
{ patient(2315) diseas(1263) diabet(1191) }
{ take(945) account(800) differ(722) }
{ studi(2440) review(1878) systemat(933) }
{ motion(1329) object(1292) video(1091) }
{ assess(1506) score(1403) qualiti(1306) }
{ treatment(1704) effect(941) patient(846) }
{ error(1145) method(1030) estim(1020) }
{ chang(1828) time(1643) increas(1301) }
{ learn(2355) train(1041) set(1003) }
{ clinic(1479) use(1117) guidelin(835) }
{ method(1557) propos(1049) approach(1037) }
{ data(1714) softwar(1251) tool(1186) }
{ design(1359) user(1324) use(1319) }
{ model(2220) cell(1177) simul(1124) }
{ search(2224) databas(1162) retriev(909) }
{ featur(1941) imag(1645) propos(1176) }
{ case(1353) use(1143) diagnosi(1136) }
{ data(3963) clinic(1234) research(1004) }
{ import(1318) role(1303) understand(862) }
{ model(2341) predict(2261) use(1141) }
{ visual(1396) interact(850) tool(830) }
{ perform(1367) use(1326) method(1137) }
{ studi(1119) effect(1106) posit(819) }
{ record(1888) medic(1808) patient(1693) }
{ health(3367) inform(1360) care(1135) }
{ model(3480) simul(1196) paramet(876) }
{ monitor(1329) mobil(1314) devic(1160) }
{ ehr(2073) health(1662) electron(1139) }
{ research(1218) medic(880) student(794) }
{ data(2317) use(1299) case(1017) }
{ age(1611) year(1155) adult(843) }
{ medic(1828) order(1363) alert(1069) }
{ signal(2180) analysi(812) frequenc(800) }
{ cost(1906) reduc(1198) effect(832) }
{ group(2977) signific(1463) compar(1072) }
{ sampl(1606) size(1419) use(1276) }
{ gene(2352) biolog(1181) express(1162) }
{ intervent(3218) particip(2042) group(1664) }
{ patient(1821) servic(1111) care(1106) }
{ use(2086) technolog(871) perceiv(783) }
{ analysi(2126) use(1163) compon(1037) }
{ high(1669) rate(1365) level(1280) }
{ cancer(2502) breast(956) screen(824) }
{ use(976) code(926) identifi(902) }
{ drug(1928) target(777) effect(648) }
{ implement(1333) system(1263) develop(1122) }
{ survey(1388) particip(1329) question(1065) }
{ estim(2440) model(1874) function(577) }
{ method(2212) result(1239) propos(1039) }
{ detect(2391) sensit(1101) algorithm(908) }

Abstract

PURPOSE: We describe an experiment to build a de-identification system for clinical records using the open source MITRE Identification Scrubber Toolkit (MIST). We quantify the human annotation effort needed to produce a system that de-identifies at high accuracy.

METHODS: Using two types of clinical records (history and physical notes, and social work notes), we iteratively built statistical de-identification models by annotating 10 notes, training a model, applying the model to another 10 notes, correcting the model's output, and training from the resulting larger set of annotated notes. This was repeated for 20 rounds of 10 notes each, then an additional 6 rounds of 20 notes each, and a final round of 40 notes. At each stage, we measured precision, recall, and F-score, and compared these to the amount of annotation time needed to complete the round.

RESULTS: After the initial 10-note round (33 min of annotation time) we achieved an F-score of 0.89. After just over 8 h of annotation time (round 21) we achieved an F-score of 0.95. The number of annotation actions needed, as well as the time needed, decreased in later rounds as model performance improved. Accuracy on history and physical notes exceeded that on social work notes, suggesting that the wider variety and contexts for protected health information (PHI) in social work notes are more difficult to model.

CONCLUSIONS: It is possible, with modest effort, to build a functioning de-identification system de novo using the MIST framework. The resulting system achieved performance comparable to other high-performing de-identification systems.
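The round schedule and the per-round metrics described in the abstract can be sketched in a few lines of Python. This is only an illustration, not the authors' code and not the MIST API: the function names `round_schedule`, `cumulative_notes`, and `precision_recall_f` are hypothetical, the F-score is assumed to be the balanced F1 (the abstract does not say which variant was used), and the initial 10-note round is assumed to count as the first of the 20 ten-note rounds.

```python
from itertools import accumulate


def round_schedule():
    """Notes per round as stated in the abstract:
    20 rounds of 10 notes, 6 rounds of 20 notes, 1 round of 40 notes.
    Assumption: the initial 10-note round is round 1 of the first 20."""
    return [10] * 20 + [20] * 6 + [40]


def cumulative_notes(schedule):
    """Total number of annotated notes available after each round."""
    return list(accumulate(schedule))


def precision_recall_f(tp, fp, fn):
    """Precision, recall, and balanced F-score (F1) from counts of
    true positives, false positives, and false negatives.
    Assumption: F-score means F1; zero denominators yield 0.0."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f_score = 2 * precision * recall / (precision + recall)
    return precision, recall, f_score


if __name__ == "__main__":
    schedule = round_schedule()
    totals = cumulative_notes(schedule)
    print("rounds:", len(schedule), "total notes:", totals[-1])
    # Hypothetical counts, for illustration only (not from the study).
    print(precision_recall_f(tp=90, fp=10, fn=10))
```

Under these assumptions the schedule amounts to 27 rounds and 360 notes in total. The counts passed to `precision_recall_f` above are made up; the study's own figures are the F-scores reported in the abstract.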

Cleaned Abstract

purpos describ experi build deidentif system clinic record use open sourc mitr identif scrubber toolkit mist quantifi human annot effort need produc system deidentifi high accuraci method use two type clinic record histori physic note social work note iter built statist deidentif model annot note train model appli model anoth note correct model output train result larger set annot note repeat round note addit round note final round note stage measur precis recal fscore compar amount annot time need complet round result initi note round min annot time achiev fscore just h annot time round achiev fscore number annot action need well time need decreas later round model perform improv accuraci histori physic note exceed social work note suggest wider varieti context protect health inform phi social work note difficult model conclus possibl modest effort build function deidentif system de novo use mist framework result system achiev perform compar highperform deidentif system

Similar Abstracts

J Biomed Inform - MedTime: a temporal information extraction system for clinical narratives. ( 0,870821336448604 )
J Am Med Inform Assoc - Assisted annotation of medical free text using RapTAT. ( 0,870644352640113 )
J Am Med Inform Assoc - An end-to-end system to identify temporal relation in discharge summaries: 2012 i2b2 challenge. ( 0,860647621443251 )
AMIA Annu Symp Proc - Inter-annotator reliability of medical events, coreferences and temporal relations in clinical narratives by annotators with varying levels of clinical expertise. ( 0,856511831358591 )
J Am Med Inform Assoc - Combining rules and machine learning for extraction of temporal expressions and events from clinical narratives. ( 0,832263734860631 )
J Biomed Inform - Identifying non-elliptical entity mentions in a coordinated NP with ellipses. ( 0,827695011983688 )
J Biomed Inform - Semi-automatic semantic annotation of PubMed queries: a study on quality, efficiency, satisfaction. ( 0,827001791016776 )
J Biomed Inform - Towards generating a patient's timeline: extracting temporal relationships from clinical notes. ( 0,826559167571495 )
J Am Med Inform Assoc - Eventual situations for timeline extraction from clinical reports. ( 0,824310123368129 )
J Med Syst - Redactable signatures for signed CDA Documents. ( 0,824089443269245 )
J Am Med Inform Assoc - A hybrid system for temporal information extraction from clinical text. ( 0,822168071092438 )
J Biomed Inform - Development and evaluation of RapTAT: a machine learning system for concept mapping of phrases from medical narratives. ( 0,820363845972211 )
Appl Clin Inform - Representation of information about family relatives as structured data in electronic health records. ( 0,818957835077487 )
J Am Med Inform Assoc - Assessing the role of a medication-indication resource in the treatment relation extraction from clinical text. ( 0,817574103399962 )
J Biomed Inform - Anaphoric reference in clinical reports: characteristics of an annotated corpus. ( 0,815837857043009 )
Artif Intell Med - Biomedical events extraction using the hidden vector state model. ( 0,811590167756995 )
J Am Med Inform Assoc - Anaphoric relations in the clinical narrative: corpus creation. ( 0,809578554000275 )
J Biomed Inform - Lessons learnt from the DDIExtraction-2013 Shared Task. ( 0,807130813860602 )
J Am Med Inform Assoc - A sense inventory for clinical abbreviations and acronyms created using clinical notes and medical dictionary resources. ( 0,806360547704906 )
J Am Med Inform Assoc - Evaluating temporal relations in clinical text: 2012 i2b2 Challenge. ( 0,805846073599295 )
AMIA Annu Symp Proc - Throw the bath water out, keep the baby: keeping medically-relevant terms for text mining. ( 0,804210871347124 )
J. Med. Internet Res. - Web 2.0-based crowdsourcing for high-quality gold standard development in clinical natural language processing. ( 0,802427603126896 )
J Am Med Inform Assoc - Knowledge-based biomedical word sense disambiguation: an evaluation and application to clinical document classification. ( 0,801920807746459 )
J Biomed Inform - A new clustering method for detecting rare senses of abbreviations in clinical notes. ( 0,800066631159388 )
AMIA Annu Symp Proc - Mapping annotations with textual evidence using an scLDA model. ( 0,799789154308331 )
J Biomed Inform - A concept-driven biomedical knowledge extraction and visualization framework for conceptualization of text corpora. ( 0,793946182493092 )
J Biomed Inform - UMLS content views appropriate for NLP processing of the biomedical literature vs. clinical text. ( 0,793540982158118 )
AMIA Annu Symp Proc - A cloud-based approach to medical NLP. ( 0,793388187127885 )
J Am Med Inform Assoc - Evaluating the impact of pre-annotation on annotation speed and potential bias: natural language processing gold standard development for clinical named entity recognition in clinical trial announcements. ( 0,792927879535152 )
J Am Med Inform Assoc - MedXN: an open source medication extraction and normalization tool for clinical text. ( 0,789969369128519 )
AMIA Annu Symp Proc - Semantic processing to identify adverse drug event information from black box warnings. ( 0,788227699767184 )
J Biomed Inform - Desiderata for ontologies to be used in semantic annotation of biomedical documents. ( 0,78761966161519 )
AMIA Annu Symp Proc - Automatically pairing measured findings across narrative abdomen CT reports. ( 0,786646681029924 )
J Integr Bioinform - Automatic extraction of microorganisms and their habitats from free text using text mining workflows. ( 0,785485775085574 )
J Am Med Inform Assoc - Hybrid methods for improving information access in clinical documents: concept, assertion, and relation identification. ( 0,78527165246298 )
AMIA Annu Symp Proc - Natural language processing for lines and devices in portable chest x-rays. ( 0,781774373570259 )
Int J Med Inform - Detecting temporal expressions in medical narratives. ( 0,779499533209628 )
J Am Med Inform Assoc - Automatic discourse connective detection in biomedical text. ( 0,779303291073806 )
J Am Med Inform Assoc - A flexible framework for recognizing events, temporal expressions, and temporal relations in clinical text. ( 0,779097626559145 )
J Biomed Inform - Extraction of events and temporal expressions from clinical narratives. ( 0,778875810403158 )
Comput Methods Programs Biomed - BioAnnote: a software platform for annotating biomedical documents with application in medical learning environments. ( 0,778838064568287 )
AMIA Annu Symp Proc - Pharmacovigilance on twitter? Mining tweets for adverse drug reactions. ( 0,778686655800459 )
IEEE Trans Pattern Anal Mach Intell - Toward Integrated Scene Text Reading. ( 0,775689214786085 )
AMIA Annu Symp Proc - Voice-dictated versus typed-in clinician notes: linguistic properties and the potential implications on natural language processing. ( 0,775376777586553 )
Comput Math Methods Med - Ranking biomedical annotations with annotator's semantic relevancy. ( 0,774317302029794 )
AMIA Annu Symp Proc - Natural language processing to extract follow-up provider information from hospital discharge summaries. ( 0,770330252639103 )
J Biomed Inform - Enhancing clinical concept extraction with distributional semantics. ( 0,76971350358454 )
J Biomed Inform - Evaluating the effects of machine pre-annotation and an interactive annotation interface on manual de-identification of clinical text. ( 0,769341460932208 )
J Biomed Inform - Semantator: semantic annotator for converting biomedical text to linked data. ( 0,767542246119958 )
J Biomed Inform - Lexical patterns, features and knowledge resources for coreference resolution in clinical notes. ( 0,766828952470288 )
J Am Med Inform Assoc - Comparison of a semi-automatic annotation tool and a natural language processing application for the generation of clinical statement entries. ( 0,764486723213292 )
J Am Med Inform Assoc - Using rule-based natural language processing to improve disease normalization in biomedical text. ( 0,763910159250817 )
Int J Med Inform - A methodology to enhance spatial understanding of disease outbreak events reported in news articles. ( 0,761223310027611 )
Int J Med Inform - Detection of infectious symptoms from VA emergency department and primary care clinical documentation. ( 0,760662077350104 )
J Am Med Inform Assoc - A study of machine-learning-based approaches to extract clinical entities and their assertions from discharge summaries. ( 0,75955033838459 )
AMIA Annu Symp Proc - Extracting patient demographics and personal medical information from online health forums. ( 0,75939971056921 )
BMC Med Inform Decis Mak - Text summarization as a decision support aid. ( 0,758492784648684 )
J Am Med Inform Assoc - MITRE system for clinical assertion status classification. ( 0,757715976956784 )
AMIA Annu Symp Proc - Extracting Concepts Related to Homelessness from the Free Text of VA Electronic Medical Records. ( 0,757706321549685 )
J Biomed Inform - The DDI corpus: an annotated corpus with pharmacological substances and drug-drug interactions. ( 0,756311572712743 )
J Am Med Inform Assoc - 2010 i2b2/VA challenge on concepts, assertions, and relations in clinical text. ( 0,75618708085061 )
Brief. Bioinformatics - A survey on annotation tools for the biomedical literature. ( 0,755391928375538 )
AMIA Annu Symp Proc - Building gold standard corpora for medical natural language processing tasks. ( 0,754293651028305 )
J Am Med Inform Assoc - ccML, a new mark-up language to improve ISO/EN 13606-based electronic health record extracts practical edition. ( 0,752603580126552 )
AMIA Annu Symp Proc - Towards a semantic lexicon for clinical natural language processing. ( 0,751993629394956 )
AMIA Annu Symp Proc - Automated illustration of patients instructions. ( 0,751597332255633 )
J Biomed Inform - Automatically extracting information needs from complex clinical questions. ( 0,749070895559318 )
J Am Med Inform Assoc - Large-scale evaluation of automated clinical note de-identification and its impact on information extraction. ( 0,748604728616594 )
AMIA Annu Symp Proc - Combining corpus-derived sense profiles with estimated frequency information to disambiguate clinical abbreviations. ( 0,746609009634855 )
J Biomed Inform - NCBI disease corpus: a resource for disease name recognition and concept normalization. ( 0,745529224831935 )
J Biomed Inform - Degree centrality for semantic abstraction summarization of therapeutic studies. ( 0,745512052734462 )
Appl Clin Inform - Comparing the effectiveness of computerized adverse drug event monitoring systems to enhance clinical decision support for hospitalized patients. ( 0,744945507161474 )
J Biomed Inform - Common data model for natural language processing based on two existing standard information models: CDA+GrAF. ( 0,743537196740218 )
J Biomed Inform - Evaluating measures of semantic similarity and relatedness to disambiguate terms in biomedical text. ( 0,742981883169818 )
Sci Data - Building the graph of medicine from millions of clinical narratives. ( 0,742969575127407 )
AMIA Annu Symp Proc - Extracting semantic lexicons from discharge summaries using machine learning and the C-Value method. ( 0,741292473018039 )
J Biomed Inform - Text de-identification for privacy protection: a study of its impact on clinical text information content. ( 0,740984055080178 )
J Am Med Inform Assoc - Towards comprehensive syntactic and semantic annotations of the clinical narrative. ( 0,740489741946395 )
AMIA Annu Symp Proc - Detecting abbreviations in discharge summaries using machine learning methods. ( 0,740220816834602 )
AMIA Annu Symp Proc - Critical finding capture in the impression section of radiology reports. ( 0,737699147477732 )
J Biomed Inform - Annotating temporal information in clinical narratives. ( 0,736421904591726 )
J Am Med Inform Assoc - Automatic abstraction of imaging observations with their characteristics from mammography reports. ( 0,735095156876058 )
AMIA Annu Symp Proc - TagLine: Information Extraction for Semi-Structured Text in Medical Progress Notes. ( 0,73334297201307 )
J Biomed Inform - Approaches to verb subcategorization for biomedicine. ( 0,730222202388113 )
J Biomed Inform - Text summarization in the biomedical domain: a systematic review of recent research. ( 0,728691930702382 )
AMIA Annu Symp Proc - Developing a section labeler for clinical documents. ( 0,728353267357107 )
J Am Med Inform Assoc - The effect of word familiarity on actual and perceived text difficulty. ( 0,727134537139765 )
Appl Clin Inform - Clinical communication in diagnostic imaging studies: mixed-method study of pre- and post-implementation of a hospital information system. ( 0,726886725485822 )
AMIA Annu Symp Proc - Discovering peripheral arterial disease cases from radiology notes using natural language processing. ( 0,723317894672831 )
J Am Med Inform Assoc - A la Recherche du Temps Perdu: extracting temporal relations from medical text in the 2012 i2b2 NLP challenge. ( 0,722060662617201 )
AMIA Annu Symp Proc - Qualitative analysis of workflow modifications used to generate the reference standard for the 2010 i2b2/VA challenge. ( 0,721164923391169 )
AMIA Annu Symp Proc - Generalizability and comparison of automatic clinical text de-identification methods and resources. ( 0,716780599834759 )
AMIA Annu Symp Proc - A comparative study of current Clinical Natural Language Processing systems on handling abbreviations in discharge summaries. ( 0,716041437657756 )
Perspect Health Inf Manag - A comparison of two approaches to text processing: facilitating chart reviews of radiology reports in electronic medical records. ( 0,714524258259071 )
J Am Med Inform Assoc - Automated clinical trial eligibility prescreening: increasing the efficiency of patient identification for clinical trials in the emergency department. ( 0,71209275532773 )
J. Med. Internet Res. - Evaluating a web-based clinical decision support system for language disorders screening in a nursery school. ( 0,711750260842149 )
AMIA Annu Symp Proc - Document clustering of clinical narratives: a systematic study of clinical sublanguages. ( 0,711328426282888 )
AMIA Annu Symp Proc - A machine learning approach for identifying anatomical locations of actionable findings in radiology reports. ( 0,710929971955625 )
AMIA Annu Symp Proc - Semantic processing to identify adverse drug event information from black box warnings. ( 0,710511429900364 )
J Am Med Inform Assoc - Syntactic parsing of clinical text: guideline and corpus development with handling ill-formed sentences. ( 0,708548894205487 )