AMIA Annu Symp Proc - Extracting semantic lexicons from discharge summaries using machine learning and the C-Value method.

Topics

{ extract(1171) text(1153) clinic(932) }
{ concept(1167) ontolog(924) domain(897) }
{ patient(2837) hospit(1953) medic(668) }
{ featur(3375) classif(2383) classifi(1994) }
{ perform(999) metric(946) measur(919) }
{ can(774) often(719) complex(702) }
{ process(1125) use(805) approach(778) }
{ method(2212) result(1239) propos(1039) }
{ patient(2315) diseas(1263) diabet(1191) }
{ gene(2352) biolog(1181) express(1162) }
{ analysi(2126) use(1163) compon(1037) }
{ treatment(1704) effect(941) patient(846) }
{ general(901) number(790) one(736) }
{ use(976) code(926) identifi(902) }
{ result(1111) use(1088) new(759) }
{ decis(3086) make(1611) patient(1517) }
{ system(1976) rule(880) can(841) }
{ method(1219) similar(1157) match(930) }
{ imag(2675) segment(2577) method(1081) }
{ motion(1329) object(1292) video(1091) }
{ problem(2511) optim(1539) algorithm(950) }
{ method(1557) propos(1049) approach(1037) }
{ design(1359) user(1324) use(1319) }
{ case(1353) use(1143) diagnosi(1136) }
{ howev(809) still(633) remain(590) }
{ import(1318) role(1303) understand(862) }
{ model(2341) predict(2261) use(1141) }
{ studi(1119) effect(1106) posit(819) }
{ record(1888) medic(1808) patient(1693) }
{ ehr(2073) health(1662) electron(1139) }
{ state(1844) use(1261) util(961) }
{ age(1611) year(1155) adult(843) }
{ data(3008) multipl(1320) sourc(1022) }
{ activ(1138) subject(705) human(624) }
{ structur(1116) can(940) graph(676) }
{ implement(1333) system(1263) develop(1122) }
{ method(1969) cluster(1462) data(1082) }
{ model(3404) distribut(989) bayesian(671) }
{ imag(1947) propos(1133) code(1026) }
{ data(1737) use(1416) pattern(1282) }
{ inform(2794) health(2639) internet(1427) }
{ measur(2081) correl(1212) valu(896) }
{ imag(1057) registr(996) error(939) }
{ bind(1733) structur(1185) ligand(1036) }
{ sequenc(1873) structur(1644) protein(1328) }
{ imag(2830) propos(1344) filter(1198) }
{ network(2748) neural(1063) input(814) }
{ take(945) account(800) differ(722) }
{ studi(2440) review(1878) systemat(933) }
{ assess(1506) score(1403) qualiti(1306) }
{ surgeri(1148) surgic(1085) robot(1054) }
{ framework(1458) process(801) describ(734) }
{ error(1145) method(1030) estim(1020) }
{ chang(1828) time(1643) increas(1301) }
{ learn(2355) train(1041) set(1003) }
{ clinic(1479) use(1117) guidelin(835) }
{ algorithm(1844) comput(1787) effici(935) }
{ data(1714) softwar(1251) tool(1186) }
{ control(1307) perform(991) simul(935) }
{ model(2220) cell(1177) simul(1124) }
{ care(1570) inform(1187) nurs(1089) }
{ method(984) reconstruct(947) comput(926) }
{ search(2224) databas(1162) retriev(909) }
{ featur(1941) imag(1645) propos(1176) }
{ data(3963) clinic(1234) research(1004) }
{ studi(1410) differ(1259) use(1210) }
{ risk(3053) factor(974) diseas(938) }
{ research(1085) discuss(1038) issu(1018) }
{ system(1050) medic(1026) inform(1018) }
{ visual(1396) interact(850) tool(830) }
{ compound(1573) activ(1297) structur(1058) }
{ perform(1367) use(1326) method(1137) }
{ blood(1257) pressur(1144) flow(957) }
{ spatial(1525) area(1432) region(1030) }
{ health(3367) inform(1360) care(1135) }
{ model(3480) simul(1196) paramet(876) }
{ monitor(1329) mobil(1314) devic(1160) }
{ research(1218) medic(880) student(794) }
{ model(2656) set(1616) predict(1553) }
{ data(2317) use(1299) case(1017) }
{ medic(1828) order(1363) alert(1069) }
{ signal(2180) analysi(812) frequenc(800) }
{ cost(1906) reduc(1198) effect(832) }
{ group(2977) signific(1463) compar(1072) }
{ sampl(1606) size(1419) use(1276) }
{ first(2504) two(1366) second(1323) }
{ intervent(3218) particip(2042) group(1664) }
{ time(1939) patient(1703) rate(768) }
{ patient(1821) servic(1111) care(1106) }
{ use(2086) technolog(871) perceiv(783) }
{ can(981) present(881) function(850) }
{ health(1844) social(1437) communiti(874) }
{ high(1669) rate(1365) level(1280) }
{ cancer(2502) breast(956) screen(824) }
{ use(1733) differ(960) four(931) }
{ drug(1928) target(777) effect(648) }
{ survey(1388) particip(1329) question(1065) }
{ estim(2440) model(1874) function(577) }
{ activ(1452) weight(1219) physic(1104) }
{ detect(2391) sensit(1101) algorithm(908) }

Abstract

Semantic lexicons that link words and phrases to specific semantic types, such as diseases, are valuable assets for clinical natural language processing (NLP) systems. Although terms with predefined semantic types can be generated easily from existing knowledge bases such as the Unified Medical Language System (UMLS), such term lists are often limited and do not provide good coverage of narrative clinical text. In this study, we developed a method for building semantic lexicons from a clinical corpus. It extracts candidate semantic terms using a conditional random field (CRF) classifier and then selects terms using the C-Value algorithm. We applied the method to a corpus containing 10 years of discharge summaries from Vanderbilt University Hospital (VUH) and extracted 44,957 new terms for three semantic groups: Problem, Treatment, and Test. A manual analysis of 200 randomly selected terms not found in the UMLS demonstrated that 59% of them were meaningful new clinical concepts and 25% were lexical variants of existing concepts in the UMLS. Furthermore, we compared the effectiveness of corpus-derived and UMLS-derived semantic lexicons in the concept extraction task of the 2010 i2b2 clinical NLP challenge. Our results showed that the classifier with corpus-derived semantic lexicons as features achieved better performance (F-score 82.52%) than the classifier with UMLS-derived semantic lexicons as features (F-score 82.04%). We conclude that such corpus-based methods are effective for generating semantic lexicons, which may improve named entity recognition tasks and may aid in augmenting synonymy within existing terminologies.
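The abstract names the C-Value algorithm as the term-selection step but does not give its details. As a rough illustration only, the sketch below scores candidate terms with the commonly cited C-Value formula: log2 of the term length in words, multiplied by the term frequency, minus the mean frequency of longer candidates that contain the term. The function names and the toy frequencies are hypothetical and are not taken from the paper; the authors' actual implementation and filtering thresholds may differ.

```python
import math
from typing import Dict, Tuple

Term = Tuple[str, ...]  # a candidate term as a tuple of tokens


def contains(longer: Term, shorter: Term) -> bool:
    """True if `shorter` occurs as a contiguous sub-sequence of a strictly longer term."""
    n, m = len(longer), len(shorter)
    return m < n and any(longer[i:i + m] == shorter for i in range(n - m + 1))


def c_value(freq: Dict[Term, int]) -> Dict[Term, float]:
    """Score every candidate term with the basic C-Value formula.

    Note: log2(1) == 0, so single-word terms score 0 under this basic form;
    some variants use log2(len + 1) instead (an assumption to revisit).
    """
    scores = {}
    for a, f_a in freq.items():
        # Frequencies of longer candidates in which `a` is nested.
        nested_in = [f_b for b, f_b in freq.items() if contains(b, a)]
        if nested_in:
            adjusted = f_a - sum(nested_in) / len(nested_in)
        else:
            adjusted = float(f_a)
        scores[a] = math.log2(len(a)) * adjusted
    return scores


# Hypothetical candidate frequencies, e.g. counts of phrases tagged by a CRF.
freq = {
    ("chest", "pain"): 10,
    ("acute", "chest", "pain"): 4,
    ("chest", "pain", "syndrome"): 2,
}
scores = c_value(freq)
# ("chest", "pain") -> log2(2) * (10 - (4 + 2) / 2) = 7.0
for term, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(" ".join(term), round(score, 3))
```

The subtraction discounts a short phrase that mostly appears inside longer phrases, so a nested term ranks highly only if it also occurs often on its own.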

Cleaned Abstract

semant lexicon link word phrase specif semant type diseas valuabl asset clinic natur languag process nlp system although terminolog term predefin semant type can generat easili exist knowledg base unifi medic languag system uml often limit good coverag narrat clinic text studi develop method build semant lexicon clinic corpus extract candid semant term use condit random field crf classifi select term use cvalu algorithm appli method corpus contain year discharg summari vanderbilt univers hospit vuh extract new term three semant group problem treatment test manual analysi random select term found uml demonstr meaning new clinic concept lexic variant exit concept uml furthermor compar effect corpusderiv umlsderiv semant lexicon concept extract task ib clinic nlp challeng result show classifi corpusderiv semant lexicon featur achiev better perform fscore umlsderiv semant lexicon featur fscore conclud corpusbas method effect generat semant lexicon may improv name entiti recognit task may aid augment synonymi within exist terminolog
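The cleaned text above looks like the output of a conventional normalization pipeline: lowercasing, deleting punctuation and digits without inserting spaces (so "C-Value" becomes one token and "i2b2" becomes "ib"), removing stop words, and stemming. The page does not document the pipeline, so the following is only a minimal sketch under those assumptions. The stop-word list is a small hypothetical sample and the stemming step is left out, so its tokens are unstemmed ("cvalue" rather than "cvalu").

```python
import re
from typing import List, Set

# Hypothetical, deliberately tiny stop-word list; the list behind the page is unknown.
STOPWORDS: Set[str] = {
    "a", "and", "are", "as", "do", "for", "from", "have", "in", "it",
    "not", "of", "such", "than", "that", "the", "then", "they", "to",
    "we", "were", "which", "with",
}


def clean(text: str, stopwords: Set[str] = STOPWORDS) -> List[str]:
    """Lowercase, delete every non-letter character, and drop stop words.

    Characters are deleted rather than replaced by spaces, which is why
    hyphenated words collapse into a single token. Stemming is omitted.
    """
    lowered = text.lower()
    letters_only = re.sub(r"[^a-z\s]", "", lowered)
    return [tok for tok in letters_only.split() if tok not in stopwords]


print(clean("the C-Value algorithm"))            # ['cvalue', 'algorithm']
print(clean("2010 i2b2 clinical NLP challenge")) # ['ib', 'clinical', 'nlp', 'challenge']
```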

Similar Abstracts

J Biomed Inform - Ontology-guided feature engineering for clinical text classification. ( 0,878277009040217 )
J Am Med Inform Assoc - Anaphoric relations in the clinical narrative: corpus creation. ( 0,849052750852734 )
Appl Clin Inform - Representation of information about family relatives as structured data in electronic health records. ( 0,8467847149297 )
J Am Med Inform Assoc - Knowledge-based biomedical word sense disambiguation: an evaluation and application to clinical document classification. ( 0,838292139030052 )
AMIA Annu Symp Proc - Throw the bath water out, keep the baby: keeping medically-relevant terms for text mining. ( 0,834076618117453 )
J Biomed Inform - Evaluating measures of semantic similarity and relatedness to disambiguate terms in biomedical text. ( 0,827829518392133 )
J Biomed Inform - UMLS content views appropriate for NLP processing of the biomedical literature vs. clinical text. ( 0,827634616561963 )
J Biomed Inform - Ontology modularization to improve semantic medical image annotation. ( 0,824162412698095 )
J Biomed Inform - Desiderata for ontologies to be used in semantic annotation of biomedical documents. ( 0,821966454265281 )
J Am Med Inform Assoc - Assessing the role of a medication-indication resource in the treatment relation extraction from clinical text. ( 0,820968939785263 )
J Am Med Inform Assoc - Eventual situations for timeline extraction from clinical reports. ( 0,820905681923241 )
J Biomed Inform - Anaphoric reference in clinical reports: characteristics of an annotated corpus. ( 0,818515968119788 )
AMIA Annu Symp Proc - A cloud-based approach to medical NLP. ( 0,81501980469293 )
J Am Med Inform Assoc - Evaluating temporal relations in clinical text: 2012 i2b2 Challenge. ( 0,81464725148301 )
J Biomed Inform - Unsupervised biomedical named entity recognition: experiments with clinical and biological texts. ( 0,813768369530705 )
J Am Med Inform Assoc - Combining rules and machine learning for extraction of temporal expressions and events from clinical narratives. ( 0,811429353288973 )
J Am Med Inform Assoc - An end-to-end system to identify temporal relation in discharge summaries: 2012 i2b2 challenge. ( 0,809718958406402 )
J Biomed Inform - Identifying non-elliptical entity mentions in a coordinated NP with ellipses. ( 0,805141337655527 )
J Biomed Inform - Lexical patterns, features and knowledge resources for coreference resolution in clinical notes. ( 0,80090404234404 )
AMIA Annu Symp Proc - Inter-annotator reliability of medical events, coreferences and temporal relations in clinical narratives by annotators with varying levels of clinical expertise. ( 0,799933123725635 )
J. Med. Internet Res. - Web 2.0-based crowdsourcing for high-quality gold standard development in clinical natural language processing. ( 0,79702731166799 )
AMIA Annu Symp Proc - Automatic acquisition of sublanguage semantic schema: towards the word sense disambiguation of clinical narratives. ( 0,793510065461611 )
Artif Intell Med - Biomedical events extraction using the hidden vector state model. ( 0,79176606661639 )
J Biomed Inform - A concept-driven biomedical knowledge extraction and visualization framework for conceptualization of text corpora. ( 0,791600636937989 )
J Am Med Inform Assoc - MITRE system for clinical assertion status classification. ( 0,790510518011238 )
J Biomed Inform - MedTime: a temporal information extraction system for clinical narratives. ( 0,789835041921217 )
AMIA Annu Symp Proc - Semantic processing to identify adverse drug event information from black box warnings. ( 0,789434356615307 )
J Am Med Inform Assoc - Automated clinical trial eligibility prescreening: increasing the efficiency of patient identification for clinical trials in the emergency department. ( 0,787722113490729 )
AMIA Annu Symp Proc - Extracting Concepts Related to Homelessness from the Free Text of VA Electronic Medical Records. ( 0,787446172511545 )
J Am Med Inform Assoc - Towards comprehensive syntactic and semantic annotations of the clinical narrative. ( 0,786140825095217 )
AMIA Annu Symp Proc - On-time clinical phenotype prediction based on narrative reports. ( 0,783559111240906 )
J Am Med Inform Assoc - A flexible framework for recognizing events, temporal expressions, and temporal relations in clinical text. ( 0,782964534180671 )
AMIA Annu Symp Proc - EpiDEA: extracting structured epilepsy and seizure information from patient discharge summaries for cohort identification. ( 0,782474763504681 )
J Biomed Inform - Semi-automatic semantic annotation of PubMed queries: a study on quality, efficiency, satisfaction. ( 0,781545801858294 )
J Am Med Inform Assoc - A study of machine-learning-based approaches to extract clinical entities and their assertions from discharge summaries. ( 0,781425285998915 )
J Biomed Inform - Towards generating a patient's timeline: extracting temporal relationships from clinical notes. ( 0,779946747800368 )
AMIA Annu Symp Proc - Automatically pairing measured findings across narrative abdomen CT reports. ( 0,777693242489256 )
AMIA Annu Symp Proc - An evaluation of the UMLS in representing corpus derived clinical concepts. ( 0,774092513150908 )
J Am Med Inform Assoc - A sense inventory for clinical abbreviations and acronyms created using clinical notes and medical dictionary resources. ( 0,774041788139163 )
AMIA Annu Symp Proc - Towards a semantic lexicon for clinical natural language processing. ( 0,772440664303209 )
J Biomed Inform - Evaluating the effects of machine pre-annotation and an interactive annotation interface on manual de-identification of clinical text. ( 0,771902370460832 )
J Biomed Inform - Extraction of events and temporal expressions from clinical narratives. ( 0,771857552215819 )
AMIA Annu Symp Proc - Using Medical Text Extraction, Reasoning and Mapping System (MTERMS) to process medication information in outpatient clinical notes. ( 0,771374917198157 )
J Biomed Inform - Lessons learnt from the DDIExtraction-2013 Shared Task. ( 0,771094691819235 )
J Am Med Inform Assoc - A hybrid system for temporal information extraction from clinical text. ( 0,76867740132034 )
J Am Med Inform Assoc - Text mining for the Vaccine Adverse Event Reporting System: medical text classification using informative feature selection. ( 0,768164947853369 )
J Am Med Inform Assoc - Automatic discourse connective detection in biomedical text. ( 0,767039453490777 )
J Biomed Inform - Enhancing clinical concept extraction with distributional semantics. ( 0,764861927179641 )
J Med Syst - Redactable signatures for signed CDA Documents. ( 0,76469736081165 )
J Am Med Inform Assoc - Feature engineering combined with machine learning and rule-based methods for structured information extraction from narrative clinical discharge summaries. ( 0,759985802326739 )
J Biomed Inform - A new clustering method for detecting rare senses of abbreviations in clinical notes. ( 0,758029822911031 )
Int J Med Inform - Detecting temporal expressions in medical narratives. ( 0,75794889389279 )
AMIA Annu Symp Proc - Combining corpus-derived sense profiles with estimated frequency information to disambiguate clinical abbreviations. ( 0,756346804644545 )
AMIA Annu Symp Proc - Semantic characteristics of NLP-extracted concepts in clinical notes vs. biomedical literature. ( 0,754182816564419 )
AMIA Annu Symp Proc - Natural language processing for lines and devices in portable chest x-rays. ( 0,754045919939448 )
J Biomed Inform - Development and evaluation of RapTAT: a machine learning system for concept mapping of phrases from medical narratives. ( 0,752190840949408 )
AMIA Annu Symp Proc - Mapping annotations with textual evidence using an scLDA model. ( 0,751743340686544 )
J Am Med Inform Assoc - Hybrid methods for improving information access in clinical documents: concept, assertion, and relation identification. ( 0,75090922528103 )
Brief. Bioinformatics - A survey on annotation tools for the biomedical literature. ( 0,7502295473836 )
AMIA Annu Symp Proc - Natural language processing to extract follow-up provider information from hospital discharge summaries. ( 0,749898498453937 )
J Biomed Inform - NCBI disease corpus: a resource for disease name recognition and concept normalization. ( 0,749587453129468 )
AMIA Annu Symp Proc - Building gold standard corpora for medical natural language processing tasks. ( 0,749234612025098 )
J Am Med Inform Assoc - MedXN: an open source medication extraction and normalization tool for clinical text. ( 0,748713950715146 )
AMIA Annu Symp Proc - Voice-dictated versus typed-in clinician notes: linguistic properties and the potential implications on natural language processing. ( 0,747938938069586 )
Int J Med Inform - Detection of infectious symptoms from VA emergency department and primary care clinical documentation. ( 0,747905979627863 )
J Biomed Inform - Common data model for natural language processing based on two existing standard information models: CDA+GrAF. ( 0,744777608170911 )
J Biomed Inform - Semantator: semantic annotator for converting biomedical text to linked data. ( 0,744776120876967 )
J Integr Bioinform - Automatic extraction of microorganisms and their habitats from free text using text mining workflows. ( 0,744490398132226 )
Int J Med Inform - Bootstrapping a de-identification system for narrative patient records: cost-performance tradeoffs. ( 0,741292473018039 )
J Am Med Inform Assoc - Evaluating the impact of pre-annotation on annotation speed and potential bias: natural language processing gold standard development for clinical named entity recognition in clinical trial announcements. ( 0,741181726768935 )
J Am Med Inform Assoc - Comprehensive temporal information detection from clinical text: medical events, time, and TLINK identification. ( 0,740588502914776 )
J Am Med Inform Assoc - 2010 i2b2/VA challenge on concepts, assertions, and relations in clinical text. ( 0,738605441022716 )
AMIA Annu Symp Proc - TagLine: Information Extraction for Semi-Structured Text in Medical Progress Notes. ( 0,737183281773458 )
Comput Math Methods Med - Ranking biomedical annotations with annotator's semantic relevancy. ( 0,736373990639125 )
J Am Med Inform Assoc - Comparison of a semi-automatic annotation tool and a natural language processing application for the generation of clinical statement entries. ( 0,735872999343749 )
AMIA Annu Symp Proc - Using ontology network structure in text mining. ( 0,733724458745228 )
Artif Intell Med - Terminological resources for text mining over biomedical scientific literature. ( 0,731506622531928 )
AMIA Annu Symp Proc - Developing a section labeler for clinical documents. ( 0,731453079523133 )
AMIA Annu Symp Proc - Word Sense Disambiguation of clinical abbreviations with hyperdimensional computing. ( 0,730996910762183 )
AMIA Annu Symp Proc - Pharmacovigilance on twitter? Mining tweets for adverse drug reactions. ( 0,728403427549567 )
J Biomed Inform - The DDI corpus: an annotated corpus with pharmacological substances and drug-drug interactions. ( 0,725142483269773 )
BMC Med Inform Decis Mak - Text summarization as a decision support aid. ( 0,724724190763191 )
AMIA Annu Symp Proc - A Knowledge Intensive Approach to Mapping Clinical Narrative to LOINC. ( 0,724602327595872 )
J Biomed Inform - Text de-identification for privacy protection: a study of its impact on clinical text information content. ( 0,724183875011316 )
AMIA Annu Symp Proc - Extracting patient demographics and personal medical information from online health forums. ( 0,720975100253991 )
J Am Med Inform Assoc - Assisted annotation of medical free text using RapTAT. ( 0,719806915031452 )
J Am Med Inform Assoc - The effect of word familiarity on actual and perceived text difficulty. ( 0,719712415083122 )
Comput Methods Programs Biomed - BioAnnote: a software platform for annotating biomedical documents with application in medical learning environments. ( 0,71784698243966 )
J Am Med Inform Assoc - Developing and evaluating an automated appendicitis risk stratification algorithm for pediatric patients in the emergency department. ( 0,713882522979943 )
J Am Med Inform Assoc - Automatic abstraction of imaging observations with their characteristics from mammography reports. ( 0,713190919116137 )
AMIA Annu Symp Proc - Detecting abbreviations in discharge summaries using machine learning methods. ( 0,709794387914559 )
AMIA Annu Symp Proc - Risk stratification of ICU patients using topic models inferred from unstructured progress notes. ( 0,708963998024617 )
Sci Data - Building the graph of medicine from millions of clinical narratives. ( 0,707763771937682 )
J Am Med Inform Assoc - A classification approach to coreference in discharge summaries: 2011 i2b2 challenge. ( 0,705928885937465 )
J Biomed Inform - Automatically extracting information needs from complex clinical questions. ( 0,704911086073723 )
AMIA Annu Symp Proc - It's about this and that: a description of anaphoric expressions in clinical text. ( 0,70435069876353 )
J Am Med Inform Assoc - BT-Nurse: computer generation of natural language shift summaries from complex heterogeneous medical data. ( 0,704278905478395 )
AMIA Annu Symp Proc - Semantic processing to identify adverse drug event information from black box warnings. ( 0,703973791281079 )
J Am Med Inform Assoc - Machine-learned solutions for three stages of clinical information extraction: the state of the art at i2b2 2010. ( 0,703900588950033 )
J Am Med Inform Assoc - Extracting drug indication information from structured product labels using natural language processing. ( 0,701812034754999 )