J Biomed Inform - Ontology-guided feature engineering for clinical text classification.

Topics

{ extract(1171) text(1153) clinic(932) }
{ featur(3375) classif(2383) classifi(1994) }
{ method(2212) result(1239) propos(1039) }
{ patient(2837) hospit(1953) medic(668) }
{ surgeri(1148) surgic(1085) robot(1054) }
{ concept(1167) ontolog(924) domain(897) }
{ process(1125) use(805) approach(778) }
{ research(1218) medic(880) student(794) }
{ structur(1116) can(940) graph(676) }
{ perform(999) metric(946) measur(919) }
{ state(1844) use(1261) util(961) }
{ first(2504) two(1366) second(1323) }
{ activ(1452) weight(1219) physic(1104) }
{ system(1976) rule(880) can(841) }
{ measur(2081) correl(1212) valu(896) }
{ bind(1733) structur(1185) ligand(1036) }
{ assess(1506) score(1403) qualiti(1306) }
{ framework(1458) process(801) describ(734) }
{ data(1714) softwar(1251) tool(1186) }
{ general(901) number(790) one(736) }
{ system(1050) medic(1026) inform(1018) }
{ visual(1396) interact(850) tool(830) }
{ data(2317) use(1299) case(1017) }
{ gene(2352) biolog(1181) express(1162) }
{ analysi(2126) use(1163) compon(1037) }
{ drug(1928) target(777) effect(648) }
{ model(3404) distribut(989) bayesian(671) }
{ can(774) often(719) complex(702) }
{ imag(1947) propos(1133) code(1026) }
{ data(1737) use(1416) pattern(1282) }
{ inform(2794) health(2639) internet(1427) }
{ imag(1057) registr(996) error(939) }
{ sequenc(1873) structur(1644) protein(1328) }
{ method(1219) similar(1157) match(930) }
{ imag(2830) propos(1344) filter(1198) }
{ network(2748) neural(1063) input(814) }
{ imag(2675) segment(2577) method(1081) }
{ patient(2315) diseas(1263) diabet(1191) }
{ take(945) account(800) differ(722) }
{ studi(2440) review(1878) systemat(933) }
{ motion(1329) object(1292) video(1091) }
{ treatment(1704) effect(941) patient(846) }
{ problem(2511) optim(1539) algorithm(950) }
{ error(1145) method(1030) estim(1020) }
{ chang(1828) time(1643) increas(1301) }
{ learn(2355) train(1041) set(1003) }
{ clinic(1479) use(1117) guidelin(835) }
{ algorithm(1844) comput(1787) effici(935) }
{ method(1557) propos(1049) approach(1037) }
{ design(1359) user(1324) use(1319) }
{ control(1307) perform(991) simul(935) }
{ model(2220) cell(1177) simul(1124) }
{ care(1570) inform(1187) nurs(1089) }
{ method(984) reconstruct(947) comput(926) }
{ search(2224) databas(1162) retriev(909) }
{ featur(1941) imag(1645) propos(1176) }
{ case(1353) use(1143) diagnosi(1136) }
{ howev(809) still(633) remain(590) }
{ data(3963) clinic(1234) research(1004) }
{ studi(1410) differ(1259) use(1210) }
{ risk(3053) factor(974) diseas(938) }
{ research(1085) discuss(1038) issu(1018) }
{ import(1318) role(1303) understand(862) }
{ model(2341) predict(2261) use(1141) }
{ compound(1573) activ(1297) structur(1058) }
{ perform(1367) use(1326) method(1137) }
{ studi(1119) effect(1106) posit(819) }
{ blood(1257) pressur(1144) flow(957) }
{ spatial(1525) area(1432) region(1030) }
{ record(1888) medic(1808) patient(1693) }
{ health(3367) inform(1360) care(1135) }
{ model(3480) simul(1196) paramet(876) }
{ monitor(1329) mobil(1314) devic(1160) }
{ ehr(2073) health(1662) electron(1139) }
{ model(2656) set(1616) predict(1553) }
{ age(1611) year(1155) adult(843) }
{ medic(1828) order(1363) alert(1069) }
{ signal(2180) analysi(812) frequenc(800) }
{ cost(1906) reduc(1198) effect(832) }
{ group(2977) signific(1463) compar(1072) }
{ sampl(1606) size(1419) use(1276) }
{ data(3008) multipl(1320) sourc(1022) }
{ intervent(3218) particip(2042) group(1664) }
{ activ(1138) subject(705) human(624) }
{ time(1939) patient(1703) rate(768) }
{ patient(1821) servic(1111) care(1106) }
{ use(2086) technolog(871) perceiv(783) }
{ can(981) present(881) function(850) }
{ health(1844) social(1437) communiti(874) }
{ high(1669) rate(1365) level(1280) }
{ cancer(2502) breast(956) screen(824) }
{ use(976) code(926) identifi(902) }
{ use(1733) differ(960) four(931) }
{ result(1111) use(1088) new(759) }
{ implement(1333) system(1263) develop(1122) }
{ survey(1388) particip(1329) question(1065) }
{ estim(2440) model(1874) function(577) }
{ decis(3086) make(1611) patient(1517) }
{ method(1969) cluster(1462) data(1082) }
{ detect(2391) sensit(1101) algorithm(908) }

Abstract

In this study we present novel feature engineering techniques that leverage the biomedical domain knowledge encoded in the Unified Medical Language System (UMLS) to improve machine-learning-based clinical text classification. Critical steps in clinical text classification include identifying the features and passages relevant to the classification task, and representing clinical text so that documents of different classes can be discriminated. We developed novel information-theoretic techniques that use the taxonomical structure of the UMLS to improve feature ranking, and a semantic similarity measure that projects clinical text into a feature space that improves classification. We evaluated these methods on the 2008 Integrating Informatics with Biology and the Bedside (i2b2) obesity challenge. The methods improve upon the results of the challenge's top machine-learning-based system, and may improve the performance of other machine-learning-based clinical text classification systems. All tools developed as part of this study have been released as open source and are available at http://code.google.com/p/ytex.
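As a rough illustration of how a taxonomy can inform feature ranking, the sketch below scores each concept by information gain, treating a concept as present in a document when the document mentions it or any of its descendants. This is not the authors' implementation: the propagation rule, the toy taxonomy, the concept names, and the function names are all assumptions made for the example.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (bits) of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = Counter(labels)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def information_gain(present, labels):
    """Information gain of a binary feature with respect to the labels."""
    n = len(labels)
    with_f = [y for p, y in zip(present, labels) if p]
    without_f = [y for p, y in zip(present, labels) if not p]
    conditional = (len(with_f) / n) * entropy(with_f) \
        + (len(without_f) / n) * entropy(without_f)
    return entropy(labels) - conditional


def descendants(concept, children):
    """The concept itself plus everything below it in the taxonomy."""
    seen = {concept}
    stack = [concept]
    while stack:
        node = stack.pop()
        for child in children.get(node, ()):
            if child not in seen:
                seen.add(child)
                stack.append(child)
    return seen


def rank_concepts(docs, labels, children):
    """Rank concepts by information gain after propagating mentions upward."""
    concepts = set(children)
    concepts |= {c for kids in children.values() for c in kids}
    concepts |= {c for doc in docs for c in doc}
    scores = {}
    for concept in concepts:
        family = descendants(concept, children)
        # A document "has" the concept if it mentions any family member.
        present = [bool(family & doc) for doc in docs]
        scores[concept] = information_gain(present, labels)
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


# Hypothetical toy taxonomy and documents (sets of concept identifiers).
children = {"cardiovascular_disease": ["hypertension", "heart_failure"]}
docs = [{"hypertension"}, {"heart_failure"}, {"asthma"}, set()]
labels = ["Y", "Y", "N", "N"]

ranking = rank_concepts(docs, labels, children)
for concept, score in ranking:
    print(f"{concept}: {score:.3f}")
```

In this toy data neither child concept separates the classes on its own, but their shared parent does, so the parent ranks first. That is the kind of effect a taxonomy-aware ranking is meant to capture.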

Cleaned Abstract

studi present novel featur engin techniqu leverag biomed domain knowledg encod unifi medic languag system uml improv machinelearn base clinic text classif critic step clinic text classif includ identif featur passag relev classif task represent clinic text enabl discrimin document differ class develop novel informationtheoret techniqu util taxonom structur unifi medic languag system uml improv featur rank develop semant similar measur project clinic text featur space improv classif evalu method integr informat biolog bedsid ib obes challeng method develop improv upon result challeng top machinelearn base system may improv perform machinelearn base clinic text classif system releas tool develop part studi open sourc avail httpcodegooglecompytex
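The cleaned text above looks like the result of lowercasing, deleting every non-letter character (so "I2B2" becomes "ib" and the URL collapses into one token), dropping stopwords, and stemming. The sketch below reproduces the first three steps only. The stopword list and function name are hypothetical, and the stemming step is left out rather than guessed at.

```python
import re

# Hypothetical, deliberately small stopword list; the list actually used
# to produce the cleaned abstract is not given in the source.
STOPWORDS = {
    "a", "all", "and", "as", "at", "have", "in", "of", "on",
    "other", "that", "the", "this", "to", "we", "with",
}


def clean_text(text):
    """Lowercase, strip non-letters, and drop stopwords (no stemming)."""
    lowered = text.lower()
    # Remove digits and punctuation outright, so hyphenated words and
    # URLs fuse into single tokens, as seen in the cleaned abstract.
    letters_only = re.sub(r"[^a-z\s]", "", lowered)
    return [tok for tok in letters_only.split() if tok not in STOPWORDS]


tokens = clean_text("We evaluated on the 2008 (I2B2) challenge.")
print(tokens)

url_tokens = clean_text("available at http://code.google.com/p/ytex.")
print(url_tokens)
```

A stemmer applied to these tokens would then yield forms such as "evalu" and "challeng", as in the cleaned abstract.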

Similar Abstracts

AMIA Annu Symp Proc - Extracting semantic lexicons from discharge summaries using machine learning and the C-Value method. ( 0,878277009040217 )
J Am Med Inform Assoc - Feature engineering combined with machine learning and rule-based methods for structured information extraction from narrative clinical discharge summaries. ( 0,818166962504772 )
AMIA Annu Symp Proc - Word Sense Disambiguation of clinical abbreviations with hyperdimensional computing. ( 0,817002205505386 )
J Am Med Inform Assoc - A classification approach to coreference in discharge summaries: 2011 i2b2 challenge. ( 0,814847397419854 )
J Am Med Inform Assoc - Assessing the role of a medication-indication resource in the treatment relation extraction from clinical text. ( 0,812922912506059 )
AMIA Annu Symp Proc - Automatically pairing measured findings across narrative abdomen CT reports. ( 0,803724905301986 )
AMIA Annu Symp Proc - Throw the bath water out, keep the baby: keeping medically-relevant terms for text mining. ( 0,79618535096122 )
Artif Intell Med - Biomedical events extraction using the hidden vector state model. ( 0,787326292607333 )
J Am Med Inform Assoc - Text mining for the Vaccine Adverse Event Reporting System: medical text classification using informative feature selection. ( 0,783636189104313 )
J Am Med Inform Assoc - Anaphoric relations in the clinical narrative: corpus creation. ( 0,779983507904448 )
J Am Med Inform Assoc - An end-to-end system to identify temporal relation in discharge summaries: 2012 i2b2 challenge. ( 0,778789865296548 )
J Am Med Inform Assoc - MITRE system for clinical assertion status classification. ( 0,776872596424401 )
J. Med. Internet Res. - Web 2.0-based crowdsourcing for high-quality gold standard development in clinical natural language processing. ( 0,773784833967199 )
J Am Med Inform Assoc - Eventual situations for timeline extraction from clinical reports. ( 0,773462731596145 )
Appl Clin Inform - Representation of information about family relatives as structured data in electronic health records. ( 0,770054969056038 )
J Am Med Inform Assoc - Evaluating temporal relations in clinical text: 2012 i2b2 Challenge. ( 0,768203625074334 )
J Am Med Inform Assoc - Functional evaluation of out-of-the-box text-mining tools for data-mining tasks. ( 0,759503485260065 )
AMIA Annu Symp Proc - Inter-annotator reliability of medical events, coreferences and temporal relations in clinical narratives by annotators with varying levels of clinical expertise. ( 0,75833644801966 )
J Biomed Inform - Lessons learnt from the DDIExtraction-2013 Shared Task. ( 0,756569900127471 )
AMIA Annu Symp Proc - TagLine: Information Extraction for Semi-Structured Text in Medical Progress Notes. ( 0,756233700365902 )
J Am Med Inform Assoc - Combining rules and machine learning for extraction of temporal expressions and events from clinical narratives. ( 0,755402867279715 )
Int J Med Inform - Detecting temporal expressions in medical narratives. ( 0,753661085375346 )
J Am Med Inform Assoc - A flexible framework for recognizing events, temporal expressions, and temporal relations in clinical text. ( 0,752697848549443 )
J Biomed Inform - UMLS content views appropriate for NLP processing of the biomedical literature vs. clinical text. ( 0,752112214367924 )
J Biomed Inform - A concept-driven biomedical knowledge extraction and visualization framework for conceptualization of text corpora. ( 0,747470470542131 )
J Biomed Inform - Lexical patterns, features and knowledge resources for coreference resolution in clinical notes. ( 0,747090978599821 )
J Biomed Inform - Towards generating a patient's timeline: extracting temporal relationships from clinical notes. ( 0,746990304586587 )
J Am Med Inform Assoc - A study of machine-learning-based approaches to extract clinical entities and their assertions from discharge summaries. ( 0,746008714098776 )
J Am Med Inform Assoc - Knowledge-based biomedical word sense disambiguation: an evaluation and application to clinical document classification. ( 0,744212600979999 )
J Biomed Inform - A new clustering method for detecting rare senses of abbreviations in clinical notes. ( 0,743147157707282 )
AMIA Annu Symp Proc - Semantic processing to identify adverse drug event information from black box warnings. ( 0,741641424046639 )
J Biomed Inform - Anaphoric reference in clinical reports: characteristics of an annotated corpus. ( 0,739423124969762 )
J Am Med Inform Assoc - Pneumonia identification using statistical feature selection. ( 0,739162310879572 )
J Biomed Inform - Enhancing clinical concept extraction with distributional semantics. ( 0,733195668325767 )
J Biomed Inform - Desiderata for ontologies to be used in semantic annotation of biomedical documents. ( 0,730519982739177 )
J Biomed Inform - MedTime: a temporal information extraction system for clinical narratives. ( 0,728006955572765 )
J Biomed Inform - Evaluating measures of semantic similarity and relatedness to disambiguate terms in biomedical text. ( 0,727359738492132 )
AMIA Annu Symp Proc - A cloud-based approach to medical NLP. ( 0,727073498554438 )
J Biomed Inform - Development and evaluation of RapTAT: a machine learning system for concept mapping of phrases from medical narratives. ( 0,726339293111395 )
AMIA Annu Symp Proc - Natural language processing for lines and devices in portable chest x-rays. ( 0,725999532194019 )
J Am Med Inform Assoc - A hybrid system for temporal information extraction from clinical text. ( 0,723154278616155 )
J Biomed Inform - Identifying non-elliptical entity mentions in a coordinated NP with ellipses. ( 0,723081825028584 )
J Am Med Inform Assoc - Comprehensive temporal information detection from clinical text: medical events, time, and TLINK identification. ( 0,722335655367945 )
J Am Med Inform Assoc - Automatic discourse connective detection in biomedical text. ( 0,722060352764976 )
J Am Med Inform Assoc - Automated clinical trial eligibility prescreening: increasing the efficiency of patient identification for clinical trials in the emergency department. ( 0,721714777194941 )
AMIA Annu Symp Proc - Voice-dictated versus typed-in clinician notes: linguistic properties and the potential implications on natural language processing. ( 0,719578744949615 )
J Med Syst - Redactable signatures for signed CDA Documents. ( 0,719005556628446 )
J Am Med Inform Assoc - A comprehensive study of named entity recognition in Chinese clinical text. ( 0,718291216067609 )
J Am Med Inform Assoc - Machine-learned solutions for three stages of clinical information extraction: the state of the art at i2b2 2010. ( 0,715368962504799 )
J Biomed Inform - Extraction of events and temporal expressions from clinical narratives. ( 0,715201836210624 )
J Am Med Inform Assoc - A sense inventory for clinical abbreviations and acronyms created using clinical notes and medical dictionary resources. ( 0,710190643764876 )
AMIA Annu Symp Proc - Pharmacovigilance on twitter? Mining tweets for adverse drug reactions. ( 0,707720137543269 )
Int J Med Inform - Bootstrapping a de-identification system for narrative patient records: cost-performance tradeoffs. ( 0,707416079639021 )
J Biomed Inform - Evaluating the effects of machine pre-annotation and an interactive annotation interface on manual de-identification of clinical text. ( 0,70623590783134 )
AMIA Annu Symp Proc - Extracting Concepts Related to Homelessness from the Free Text of VA Electronic Medical Records. ( 0,704588991261651 )
J Biomed Inform - NCBI disease corpus: a resource for disease name recognition and concept normalization. ( 0,703968893071303 )
J Biomed Inform - Semi-automatic semantic annotation of PubMed queries: a study on quality, efficiency, satisfaction. ( 0,703863785093198 )
J Am Med Inform Assoc - Vaccine adverse event text mining system for extracting features from vaccine safety reports. ( 0,703766644343189 )
J Am Med Inform Assoc - MedXN: an open source medication extraction and normalization tool for clinical text. ( 0,702889117579965 )
J Am Med Inform Assoc - Towards comprehensive syntactic and semantic annotations of the clinical narrative. ( 0,702545208095652 )
J Biomed Inform - Unsupervised biomedical named entity recognition: experiments with clinical and biological texts. ( 0,700669433152644 )
AMIA Annu Symp Proc - Building gold standard corpora for medical natural language processing tasks. ( 0,699082068044928 )
J Biomed Inform - The DDI corpus: an annotated corpus with pharmacological substances and drug-drug interactions. ( 0,696536110603145 )
AMIA Annu Symp Proc - Automatic acquisition of sublanguage semantic schema: towards the word sense disambiguation of clinical narratives. ( 0,696319180015016 )
AMIA Annu Symp Proc - Natural language processing to extract follow-up provider information from hospital discharge summaries. ( 0,695181469702855 )
J Biomed Inform - Common data model for natural language processing based on two existing standard information models: CDA+GrAF. ( 0,693876742577975 )
Comput Math Methods Med - Ranking biomedical annotations with annotator's semantic relevancy. ( 0,693501503164236 )
AMIA Annu Symp Proc - Identifying discourse connectives in biomedical text. ( 0,692213722537337 )
AMIA Annu Symp Proc - Mapping annotations with textual evidence using an scLDA model. ( 0,691356551593048 )
J Am Med Inform Assoc - Improving performance of natural language processing part-of-speech tagging on clinical narratives through domain adaptation. ( 0,690498882737817 )
J Integr Bioinform - Automatic extraction of microorganisms and their habitats from free text using text mining workflows. ( 0,689910228151683 )
J Biomed Inform - Semantator: semantic annotator for converting biomedical text to linked data. ( 0,688484995538296 )
J Biomed Inform - Text de-identification for privacy protection: a study of its impact on clinical text information content. ( 0,685420071552048 )
AMIA Annu Symp Proc - Combining corpus-derived sense profiles with estimated frequency information to disambiguate clinical abbreviations. ( 0,684601132230524 )
AMIA Annu Symp Proc - Detecting abbreviations in discharge summaries using machine learning methods. ( 0,684592031995203 )
AMIA Annu Symp Proc - EpiDEA: extracting structured epilepsy and seizure information from patient discharge summaries for cohort identification. ( 0,683441373735327 )
J Am Med Inform Assoc - Evaluating the impact of pre-annotation on annotation speed and potential bias: natural language processing gold standard development for clinical named entity recognition in clinical trial announcements. ( 0,682744364539266 )
AMIA Annu Symp Proc - Automatic identification of critical follow-up recommendation sentences in radiology reports. ( 0,682356238886233 )
J Am Med Inform Assoc - 2010 i2b2/VA challenge on concepts, assertions, and relations in clinical text. ( 0,682205207393883 )
J Am Med Inform Assoc - Hybrid methods for improving information access in clinical documents: concept, assertion, and relation identification. ( 0,681252425467328 )
J Am Med Inform Assoc - Assisted annotation of medical free text using RapTAT. ( 0,680683140879012 )
BMC Med Inform Decis Mak - Text summarization as a decision support aid. ( 0,680530513966493 )
J Am Med Inform Assoc - Learning regular expressions for clinical text classification. ( 0,679724176156703 )
J Biomed Inform - Automatically extracting information needs from complex clinical questions. ( 0,678591116309353 )
AMIA Annu Symp Proc - Towards a semantic lexicon for clinical natural language processing. ( 0,67845341206564 )
Comput Methods Programs Biomed - BioAnnote: a software platform for annotating biomedical documents with application in medical learning environments. ( 0,678028234609499 )
AMIA Annu Symp Proc - Automated illustration of patients instructions. ( 0,675803105932166 )
Brief. Bioinformatics - A survey on annotation tools for the biomedical literature. ( 0,67514732320221 )
J Am Med Inform Assoc - Exploiting domain information for Word Sense Disambiguation of medical documents. ( 0,674558660598172 )
AMIA Annu Symp Proc - A machine learning approach for identifying anatomical locations of actionable findings in radiology reports. ( 0,672409428826717 )
J Biomed Inform - Ontology modularization to improve semantic medical image annotation. ( 0,672393360729484 )
Int J Med Inform - Detection of infectious symptoms from VA emergency department and primary care clinical documentation. ( 0,670592987142378 )
AMIA Annu Symp Proc - A comparative study of current Clinical Natural Language Processing systems on handling abbreviations in discharge summaries. ( 0,670291699709529 )
AMIA Annu Symp Proc - Qualitative analysis of workflow modifications used to generate the reference standard for the 2010 i2b2/VA challenge. ( 0,669752279944319 )
J Am Med Inform Assoc - A la Recherche du Temps Perdu: extracting temporal relations from medical text in the 2012 i2b2 NLP challenge. ( 0,668946034384822 )
J Am Med Inform Assoc - Developing and evaluating an automated appendicitis risk stratification algorithm for pediatric patients in the emergency department. ( 0,667431398532935 )
J Am Med Inform Assoc - Diagnosis code assignment: models and evaluation metrics. ( 0,667039065591173 )
AMIA Annu Symp Proc - Application of a temporal reasoning framework tool in analysis of medical device adverse events. ( 0,666577560551121 )
AMIA Annu Symp Proc - On-time clinical phenotype prediction based on narrative reports. ( 0,664271033292081 )
AMIA Annu Symp Proc - Generalizability and comparison of automatic clinical text de-identification methods and resources. ( 0,664078839630352 )