BMC Med Inform Decis Mak - Recognizing clinical entities in hospital discharge summaries using Structural Support Vector Machines with word representation features.

Topics

{ learn(2355) train(1041) set(1003) }
{ case(1353) use(1143) diagnosi(1136) }
{ extract(1171) text(1153) clinic(932) }
{ featur(3375) classif(2383) classifi(1994) }
{ perform(1367) use(1326) method(1137) }
{ group(2977) signific(1463) compar(1072) }
{ take(945) account(800) differ(722) }
{ use(2086) technolog(871) perceiv(783) }
{ method(2212) result(1239) propos(1039) }
{ system(1976) rule(880) can(841) }
{ patient(2837) hospit(1953) medic(668) }
{ model(3404) distribut(989) bayesian(671) }
{ patient(2315) diseas(1263) diabet(1191) }
{ howev(809) still(633) remain(590) }
{ medic(1828) order(1363) alert(1069) }
{ can(774) often(719) complex(702) }
{ framework(1458) process(801) describ(734) }
{ design(1359) user(1324) use(1319) }
{ model(2220) cell(1177) simul(1124) }
{ general(901) number(790) one(736) }
{ data(3963) clinic(1234) research(1004) }
{ studi(1410) differ(1259) use(1210) }
{ research(1085) discuss(1038) issu(1018) }
{ model(2656) set(1616) predict(1553) }
{ activ(1138) subject(705) human(624) }
{ analysi(2126) use(1163) compon(1037) }
{ structur(1116) can(940) graph(676) }
{ drug(1928) target(777) effect(648) }
{ implement(1333) system(1263) develop(1122) }
{ activ(1452) weight(1219) physic(1104) }
{ imag(1947) propos(1133) code(1026) }
{ data(1737) use(1416) pattern(1282) }
{ inform(2794) health(2639) internet(1427) }
{ measur(2081) correl(1212) valu(896) }
{ imag(1057) registr(996) error(939) }
{ bind(1733) structur(1185) ligand(1036) }
{ sequenc(1873) structur(1644) protein(1328) }
{ method(1219) similar(1157) match(930) }
{ imag(2830) propos(1344) filter(1198) }
{ network(2748) neural(1063) input(814) }
{ imag(2675) segment(2577) method(1081) }
{ studi(2440) review(1878) systemat(933) }
{ motion(1329) object(1292) video(1091) }
{ assess(1506) score(1403) qualiti(1306) }
{ treatment(1704) effect(941) patient(846) }
{ surgeri(1148) surgic(1085) robot(1054) }
{ problem(2511) optim(1539) algorithm(950) }
{ error(1145) method(1030) estim(1020) }
{ chang(1828) time(1643) increas(1301) }
{ concept(1167) ontolog(924) domain(897) }
{ clinic(1479) use(1117) guidelin(835) }
{ algorithm(1844) comput(1787) effici(935) }
{ method(1557) propos(1049) approach(1037) }
{ data(1714) softwar(1251) tool(1186) }
{ control(1307) perform(991) simul(935) }
{ care(1570) inform(1187) nurs(1089) }
{ method(984) reconstruct(947) comput(926) }
{ search(2224) databas(1162) retriev(909) }
{ featur(1941) imag(1645) propos(1176) }
{ risk(3053) factor(974) diseas(938) }
{ perform(999) metric(946) measur(919) }
{ system(1050) medic(1026) inform(1018) }
{ import(1318) role(1303) understand(862) }
{ model(2341) predict(2261) use(1141) }
{ visual(1396) interact(850) tool(830) }
{ compound(1573) activ(1297) structur(1058) }
{ studi(1119) effect(1106) posit(819) }
{ blood(1257) pressur(1144) flow(957) }
{ spatial(1525) area(1432) region(1030) }
{ record(1888) medic(1808) patient(1693) }
{ health(3367) inform(1360) care(1135) }
{ model(3480) simul(1196) paramet(876) }
{ monitor(1329) mobil(1314) devic(1160) }
{ ehr(2073) health(1662) electron(1139) }
{ state(1844) use(1261) util(961) }
{ research(1218) medic(880) student(794) }
{ data(2317) use(1299) case(1017) }
{ age(1611) year(1155) adult(843) }
{ signal(2180) analysi(812) frequenc(800) }
{ cost(1906) reduc(1198) effect(832) }
{ sampl(1606) size(1419) use(1276) }
{ gene(2352) biolog(1181) express(1162) }
{ data(3008) multipl(1320) sourc(1022) }
{ first(2504) two(1366) second(1323) }
{ intervent(3218) particip(2042) group(1664) }
{ time(1939) patient(1703) rate(768) }
{ patient(1821) servic(1111) care(1106) }
{ can(981) present(881) function(850) }
{ health(1844) social(1437) communiti(874) }
{ high(1669) rate(1365) level(1280) }
{ cancer(2502) breast(956) screen(824) }
{ use(976) code(926) identifi(902) }
{ use(1733) differ(960) four(931) }
{ result(1111) use(1088) new(759) }
{ survey(1388) particip(1329) question(1065) }
{ estim(2440) model(1874) function(577) }
{ decis(3086) make(1611) patient(1517) }
{ process(1125) use(805) approach(778) }
{ method(1969) cluster(1462) data(1082) }
{ detect(2391) sensit(1101) algorithm(908) }

Abstract

BACKGROUND: Named entity recognition (NER) is an important task in clinical natural language processing (NLP) research. Machine learning (ML) based NER methods have shown good performance in recognizing entities in clinical text. Algorithms and features are two important factors that largely affect the performance of ML-based NER systems. Conditional Random Fields (CRFs), a sequential labelling algorithm, and Support Vector Machines (SVMs), which are based on large-margin theory, are two typical machine learning algorithms that have been widely applied to clinical NER tasks. For features, syntactic and semantic information of context words has often been used in clinical NER systems. However, two options have not been extensively investigated for clinical text processing: Structural Support Vector Machines (SSVMs), an algorithm that combines the advantages of both CRFs and SVMs, and word representation features, which contain word-level back-off information learned from a large unlabelled corpus by unsupervised algorithms. Therefore, the primary goal of this study is to evaluate the use of SSVMs and word representation features in clinical NER tasks.

METHODS: In this study, we developed SSVM-based NER systems to recognize clinical entities in hospital discharge summaries, using the data set from the concept extraction task in the 2010 i2b2 NLP challenge. We compared the performance of CRF-based and SSVM-based NER classifiers with the same feature sets. Furthermore, we extracted two different types of word representation features (clustering-based representation features and distributional representation features) and integrated them with the SSVM-based clinical NER system. We then reported the performance of SSVM-based NER systems with different types of word representation features.

RESULTS AND DISCUSSION: Using the same training (N = 27,837) and test (N = 45,009) sets as in the challenge, our evaluation showed that the SSVM-based NER systems achieved better performance than the CRF-based systems for clinical entity recognition when the same features were used. Both types of word representation features (clustering-based and distributional representations) improved the performance of ML-based NER systems. By combining the two types of word representation features with SSVMs, our system achieved its highest F-measure of 85.82%, outperforming the best system reported in the challenge by 0.6%. Our results show that SSVMs are an algorithm with great potential for clinical NLP research, and that both types of unsupervised word representation features are beneficial to clinical NER tasks.
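
The idea of word-level back-off through clustering-based representation features can be illustrated with a minimal sketch. The abstract does not say which clustering algorithm or feature templates were used, so everything below is an assumption made for illustration: the hierarchical bit-string clusters, the toy table `TOY_CLUSTERS`, the prefix lengths, and the helper names `cluster_features` and `token_features` are all hypothetical.

```python
from typing import Dict, List

# Hypothetical toy table mapping a word to a hierarchical cluster bit string.
# A real table would be induced from a large unlabelled corpus; these
# entries are invented for illustration only.
TOY_CLUSTERS: Dict[str, str] = {
    "aspirin": "110100",
    "ibuprofen": "110101",
    "pneumonia": "011010",
    "bronchitis": "011011",
}

# Assumed back-off granularities: shorter prefixes give coarser word classes.
PREFIX_LENGTHS = (2, 4, 6)


def cluster_features(word: str, clusters: Dict[str, str]) -> List[str]:
    """Return cluster-prefix features for one word (empty list if unseen)."""
    path = clusters.get(word.lower())
    if path is None:
        return []
    return [f"cluster_p{n}={path[:n]}" for n in PREFIX_LENGTHS if n <= len(path)]


def token_features(tokens: List[str], i: int, clusters: Dict[str, str]) -> List[str]:
    """Combine a lexical feature with cluster features of token i and its neighbours."""
    feats = [f"word={tokens[i].lower()}"]
    for offset in (-1, 0, 1):
        j = i + offset
        if 0 <= j < len(tokens):
            # Prefix each feature with its relative position, e.g. "+0:" or "-1:".
            feats += [f"{offset:+d}:{f}" for f in cluster_features(tokens[j], clusters)]
    return feats
```

In this toy table "aspirin" and "ibuprofen" share the 4-bit prefix `1101`, so a sequence labeller trained on one word receives a signal that also fires on the other. Feature lists of this kind could be passed to any sequence labeller, whether CRF-based or SSVM-based.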

Cleaned Abstract

ckground name entiti recognit ner import task clinic natur languag process nlp research machin learn ml base ner method shown good perform recogn entiti clinic text algorithm featur two import factor larg affect perform mlbase ner system condit random field crfs sequenti label algorithm support vector machin svms base larg margin theori two typic machin learn algorithm wide appli clinic ner task featur syntact semant inform context word often use clinic ner system howev structur support vector machin ssvms algorithm combin advantag crfs svms word represent featur contain wordlevel backoff inform larg unlabel corpus unsupervis algorithm extens investig clinic text process therefor primari goal studi evalu use ssvms word represent featur clinic ner tasksmethod studi develop ssvmsbase ner system recogn clinic entiti hospit discharg summari use data set concept extrat task ib nlp challeng compar perform crfs ssvmsbase ner classifi featur set furthermor extract two differ type word represent featur clusteringbas represent featur distribut represent featur integr ssvmsbase clinic ner system report perform ssvmbase ner system differ type word represent featuresresult discuss use train n test n set challeng evalu show ssvmsbase ner system achiev better perform crfsbase system clinic entiti recognit featur use type word represent featur clusteringbas distribut represent improv perform mlbase ner system combin two differ type word represent featur togeth ssvms system achiev highest fmeasur outperform best system report challeng result show ssvms great potenti algorithm clinic nlp research type unsupervis word represent featur benefici clinic ner task

Similar Abstracts

J Am Med Inform Assoc - Evaluating the state of the art in disorder recognition and normalization of the clinical narrative. ( 0,816632143010516 )
J Am Med Inform Assoc - A rule based solution to co-reference resolution in clinical text. ( 0,766931778270391 )
AMIA Annu Symp Proc - Detecting abbreviations in discharge summaries using machine learning methods. ( 0,719865593002829 )
Artif Intell Med - A preclustering-based ensemble learning technique for acute appendicitis diagnoses. ( 0,712802148896535 )
J Am Med Inform Assoc - A sequence labeling approach to link medications and their attributes in clinical notes and clinical trial announcements for information extraction. ( 0,711571185818677 )
J Am Med Inform Assoc - Machine-learned solutions for three stages of clinical information extraction: the state of the art at i2b2 2010. ( 0,702694961686767 )
AMIA Annu Symp Proc - Part-of-speech tagging for clinical text: wall or bridge between institutions? ( 0,675312442566322 )
J Biomed Inform - Enhancing clinical concept extraction with distributional semantics. ( 0,672562810447871 )
J Am Med Inform Assoc - 2010 i2b2/VA challenge on concepts, assertions, and relations in clinical text. ( 0,671262206313631 )
AMIA Annu Symp Proc - Automated non-alphanumeric symbol resolution in clinical texts. ( 0,664698036919787 )
J Biomed Inform - Portable automatic text classification for adverse drug reaction detection via multi-corpus training. ( 0,663732695295231 )
Artif Intell Med - Fuzzy logic-based diagnostic algorithm for implantable cardioverter defibrillators. ( 0,663682858485477 )
J Telemed Telecare - Telecytology for rapid assessment of cytological specimens. ( 0,66116681455453 )
J Am Med Inform Assoc - A comprehensive study of named entity recognition in Chinese clinical text. ( 0,659767062707805 )
J Biomed Inform - Disambiguation of ambiguous biomedical terms using examples generated from the UMLS Metathesaurus. ( 0,659416717324984 )
J Biomed Inform - Automatic detection of patients with invasive fungal disease from free-text computed tomography (CT) scans. ( 0,658730148527168 )
J Biomed Inform - Temporal relation discovery between events and temporal expressions identified in clinical narrative. ( 0,652526525025935 )
AMIA Annu Symp Proc - Word Sense Disambiguation of clinical abbreviations with hyperdimensional computing. ( 0,65216917327775 )
AMIA Annu Symp Proc - Naïve Electronic Health Record phenotype identification for Rheumatoid arthritis. ( 0,651451698889617 )
AMIA Annu Symp Proc - Text Classification towards Detecting Misdiagnosis of an Epilepsy Syndrome in a Pediatric Population. ( 0,650954599453031 )
J Biomed Inform - Applying active learning to assertion classification of concepts in clinical text. ( 0,650549829427712 )
AMIA Annu Symp Proc - TagLine: Information Extraction for Semi-Structured Text in Medical Progress Notes. ( 0,650033289700007 )
AMIA Annu Symp Proc - A Knowledge Intensive Approach to Mapping Clinical Narrative to LOINC. ( 0,649972312513187 )
Comput Methods Programs Biomed - Computer-aided diagnosis system: a Bayesian hybrid classification method. ( 0,64700187990256 )
AMIA Annu Symp Proc - A comparative study of current Clinical Natural Language Processing systems on handling abbreviations in discharge summaries. ( 0,645846175464827 )
Int J Comput Assist Radiol Surg - Content-based image-retrieval system in chest computed tomography for a solitary pulmonary nodule: method and preliminary experiments. ( 0,64043641721456 )
AMIA Annu Symp Proc - Building gold standard corpora for medical natural language processing tasks. ( 0,63635185672962 )
J Am Med Inform Assoc - Automatic discourse connective detection in biomedical text. ( 0,635997801112526 )
AMIA Annu Symp Proc - Automatically Detecting Acute Myocardial Infarction Events from EHR Text: A Preliminary Study. ( 0,62486387251063 )
J Am Med Inform Assoc - Practical implementation of an existing smoking detection pipeline and reduced support vector machine training corpus requirements. ( 0,623065603253161 )
J Am Med Inform Assoc - Applying active learning to supervised word sense disambiguation in MEDLINE. ( 0,622249866870776 )
IEEE Trans Image Process - Geodesic propagation for semantic labeling. ( 0,620880731529272 )
Artif Intell Med - Modern parameterization and explanation techniques in diagnostic decision support system: a case study in diagnostics of coronary artery disease. ( 0,620255657158673 )
Neural Comput - Online learning with (multiple) kernels: a review. ( 0,619584762591605 )
Int J Med Inform - De-identification of clinical narratives through writing complexity measures. ( 0,619510320908442 )
J Biomed Inform - Dynamic categorization of clinical research eligibility criteria by hierarchical clustering. ( 0,617205740932312 )
Comput Methods Programs Biomed - Multistage approach for clustering and classification of ECG data. ( 0,615904156852859 )
AMIA Annu Symp Proc - Hyperdimensional computing approach to word sense disambiguation. ( 0,613234957127837 )
AMIA Annu Symp Proc - Developing a section labeler for clinical documents. ( 0,612887257190862 )
Artif Intell Med - Screening nonrandomized studies for medical systematic reviews: a comparative study of classifiers. ( 0,611348488977008 )
Comput. Biol. Med. - Automatic differential diagnosis of pancreatic serous and mucinous cystadenomas based on morphological features. ( 0,610669401670236 )
J Am Med Inform Assoc - Text mining for the Vaccine Adverse Event Reporting System: medical text classification using informative feature selection. ( 0,61024286305434 )
Int J Comput Assist Radiol Surg - Computer-aided diagnostics in digital pathology: automated evaluation of early-phase pancreatic cancer in mice. ( 0,609896220341471 )
Int J Med Inform - Artificial intelligence techniques applied to the development of a decision-support system for diagnosing celiac disease. ( 0,606795865280728 )
Brief. Bioinformatics - Class-imbalanced classifiers for high-dimensional data. ( 0,604516040875369 )
J Biomed Inform - The DDI corpus: an annotated corpus with pharmacological substances and drug-drug interactions. ( 0,603205322691656 )
Int J Comput Assist Radiol Surg - Disc herniation diagnosis in MRI using a CAD framework and a two-level classifier. ( 0,601099220014776 )
Comput Methods Programs Biomed - An attribute weight assignment and particle swarm optimization algorithm for medical database classifications. ( 0,600323265507773 )
Appl Clin Inform - The contribution of the vaccine adverse event text mining system to the classification of possible Guillain-Barré syndrome reports. ( 0,599747223340941 )
J Am Med Inform Assoc - Knowledge-based biomedical word sense disambiguation: an evaluation and application to clinical document classification. ( 0,597138556307322 )
J Med Syst - 3D similarity-dissimilarity plot for high dimensional data visualization in the context of biomedical pattern classification. ( 0,59707193656864 )
AMIA Annu Symp Proc - Combining corpus-derived sense profiles with estimated frequency information to disambiguate clinical abbreviations. ( 0,59613409017879 )
J Am Med Inform Assoc - Automated concept-level information extraction to reduce the need for custom software and rules development. ( 0,595243174638282 )
J Am Med Inform Assoc - A flexible framework for recognizing events, temporal expressions, and temporal relations in clinical text. ( 0,593581297756247 )
IEEE J Biomed Health Inform - Systematic Poisoning Attacks on and Defenses for Machine Learning in Healthcare. ( 0,592856041841773 )
AMIA Annu Symp Proc - Active Learning-based corpus annotation--the PathoJen experience. ( 0,589049655252745 )
J Am Med Inform Assoc - Using machine learning for concept extraction on clinical documents from multiple data sources. ( 0,585878276864357 )
Artif Intell Med - A fuzzy-based data transformation for feature extraction to increase classification performance with small medical data sets. ( 0,585691073948629 )
J Biomed Inform - Classification of CT pulmonary angiography reports by presence, chronicity, and location of pulmonary embolism with natural language processing. ( 0,585375626581152 )
Artif Intell Med - Exploring a corpus-based approach for detecting language impairment in monolingual English-speaking children. ( 0,584299418053175 )
AMIA Annu Symp Proc - Application of a temporal reasoning framework tool in analysis of medical device adverse events. ( 0,583953038918343 )
J Biomed Inform - Classifying temporal relations in clinical data: a hybrid, knowledge-rich approach. ( 0,582870301702331 )
J Biomed Inform - An enhanced CRFs-based system for information extraction from radiology reports. ( 0,582176556223615 )
AMIA Annu Symp Proc - Learning medical diagnosis models from multiple experts. ( 0,581316831530767 )
J Med Syst - Application of Bayesian classifier for the diagnosis of dental pain. ( 0,579856191433818 )
J Am Med Inform Assoc - Functional evaluation of out-of-the-box text-mining tools for data-mining tasks. ( 0,579099819068663 )
IEEE Trans Image Process - A Probabilistic Associative Model for Segmenting Weakly-Supervised Images. ( 0,578754578754579 )
Med Decis Making - Automatically annotating topics in transcripts of patient-provider interactions via machine learning. ( 0,578400359066543 )
Artif Intell Med - Biomedical visual data analysis to build an intelligent diagnostic decision support system in medical genetics. ( 0,577231422155047 )
J Biomed Inform - Development and evaluation of RapTAT: a machine learning system for concept mapping of phrases from medical narratives. ( 0,576417462944063 )
J Med Syst - Decision support algorithm for diagnosis of ADHD using electroencephalograms. ( 0,574789559893995 )
Lifetime Data Anal - Regression analysis of multivariate recurrent event data with a dependent terminal event. ( 0,574268348682005 )
AMIA Annu Symp Proc - Identifying discourse connectives in biomedical text. ( 0,574148094042531 )
J Am Med Inform Assoc - Vaccine adverse event text mining system for extracting features from vaccine safety reports. ( 0,573679787284917 )
IEEE Trans Neural Netw Learn Syst - ML-Tree: a tree-structure-based approach to multilabel learning. ( 0,571661393185681 )
AMIA Annu Symp Proc - Natural language processing to extract follow-up provider information from hospital discharge summaries. ( 0,571636268854408 )
J Am Med Inform Assoc - Named entity recognition of follow-up and time information in 20,000 radiology reports. ( 0,571557457124178 )
J Am Med Inform Assoc - MITRE system for clinical assertion status classification. ( 0,571071628883578 )
J Am Med Inform Assoc - Capturing patient information at nursing shift changes: methodological evaluation of speech recognition and information extraction. ( 0,570638511814982 )
J Telemed Telecare - Agreement on diagnoses of mental health problems between an online clinical assignment and a routine clinical assignment. ( 0,569891993542208 )
Comput Math Methods Med - Deep learning based syndrome diagnosis of chronic gastritis. ( 0,569761081729611 )
Comput Methods Programs Biomed - BioAnnote: a software platform for annotating biomedical documents with application in medical learning environments. ( 0,568772232379626 )
IEEE Trans Image Process - A linear support higher-order tensor machine for classification. ( 0,567155613437119 )
Comput. Biol. Med. - Robust prediction of protein subcellular localization combining PCA and WSVMs. ( 0,566806486434713 )
J Am Med Inform Assoc - Combining rules and machine learning for extraction of temporal expressions and events from clinical narratives. ( 0,565746077990361 )
Comput. Biol. Med. - Relabeling algorithm for retrieval of noisy instances and improving prediction quality. ( 0,563685665566647 )
J Biomed Inform - Neighborhood hash graph kernel for protein-protein interaction extraction. ( 0,563628486099916 )
AMIA Annu Symp Proc - Generalizability and comparison of automatic clinical text de-identification methods and resources. ( 0,562629996914877 )
Int J Neural Syst - Structurally enhanced incremental neural learning for image classification with subgraph extraction. ( 0,562446476173241 )
Neural Comput - Divergence-based vector quantization. ( 0,561551567407565 )
J. Comput. Biol. - Imbalanced class learning in epigenetics. ( 0,56149666716501 )
AMIA Annu Symp Proc - Improving predictions in imbalanced data using Pairwise Expanded Logistic Regression. ( 0,560846975137546 )
BMC Med Inform Decis Mak - Recognition of medication information from discharge summaries using ensembles of classifiers. ( 0,560137355979158 )
J Am Med Inform Assoc - Feature engineering combined with machine learning and rule-based methods for structured information extraction from narrative clinical discharge summaries. ( 0,559769088378293 )
Comput Methods Programs Biomed - Supervised hybrid feature selection based on PSO and rough sets for medical diagnosis. ( 0,55939833181409 )
J Am Med Inform Assoc - An end-to-end system to identify temporal relation in discharge summaries: 2012 i2b2 challenge. ( 0,559005509590851 )
J Am Med Inform Assoc - Pneumonia identification using statistical feature selection. ( 0,558767554703235 )
Int J Comput Assist Radiol Surg - Structured reporting using a shared indexed multilingual radiology lexicon. ( 0,55853930715276 )
J Med Syst - Prediction of similarities among rheumatic diseases. ( 0,558509001663624 )
J Am Med Inform Assoc - Supervised machine learning and active learning in classification of radiology reports. ( 0,557998061479788 )