J Biomed Inform - Determining the difficulty of Word Sense Disambiguation.

Topics

{ learn(2355) train(1041) set(1003) }
{ extract(1171) text(1153) clinic(932) }
{ search(2224) databas(1162) retriev(909) }
{ can(774) often(719) complex(702) }
{ data(1737) use(1416) pattern(1282) }
{ method(2212) result(1239) propos(1039) }
{ implement(1333) system(1263) develop(1122) }
{ algorithm(1844) comput(1787) effici(935) }
{ model(2341) predict(2261) use(1141) }
{ can(981) present(881) function(850) }
{ decis(3086) make(1611) patient(1517) }
{ model(2656) set(1616) predict(1553) }
{ use(1733) differ(960) four(931) }
{ method(1557) propos(1049) approach(1037) }
{ case(1353) use(1143) diagnosi(1136) }
{ cost(1906) reduc(1198) effect(832) }
{ model(3404) distribut(989) bayesian(671) }
{ imag(1057) registr(996) error(939) }
{ method(1219) similar(1157) match(930) }
{ featur(3375) classif(2383) classifi(1994) }
{ network(2748) neural(1063) input(814) }
{ imag(2675) segment(2577) method(1081) }
{ take(945) account(800) differ(722) }
{ studi(2440) review(1878) systemat(933) }
{ assess(1506) score(1403) qualiti(1306) }
{ clinic(1479) use(1117) guidelin(835) }
{ data(1714) softwar(1251) tool(1186) }
{ howev(809) still(633) remain(590) }
{ import(1318) role(1303) understand(862) }
{ model(3480) simul(1196) paramet(876) }
{ state(1844) use(1261) util(961) }
{ medic(1828) order(1363) alert(1069) }
{ signal(2180) analysi(812) frequenc(800) }
{ first(2504) two(1366) second(1323) }
{ use(976) code(926) identifi(902) }
{ imag(1947) propos(1133) code(1026) }
{ inform(2794) health(2639) internet(1427) }
{ system(1976) rule(880) can(841) }
{ measur(2081) correl(1212) valu(896) }
{ bind(1733) structur(1185) ligand(1036) }
{ sequenc(1873) structur(1644) protein(1328) }
{ imag(2830) propos(1344) filter(1198) }
{ patient(2315) diseas(1263) diabet(1191) }
{ motion(1329) object(1292) video(1091) }
{ treatment(1704) effect(941) patient(846) }
{ surgeri(1148) surgic(1085) robot(1054) }
{ framework(1458) process(801) describ(734) }
{ problem(2511) optim(1539) algorithm(950) }
{ error(1145) method(1030) estim(1020) }
{ chang(1828) time(1643) increas(1301) }
{ concept(1167) ontolog(924) domain(897) }
{ design(1359) user(1324) use(1319) }
{ control(1307) perform(991) simul(935) }
{ model(2220) cell(1177) simul(1124) }
{ care(1570) inform(1187) nurs(1089) }
{ general(901) number(790) one(736) }
{ method(984) reconstruct(947) comput(926) }
{ featur(1941) imag(1645) propos(1176) }
{ data(3963) clinic(1234) research(1004) }
{ studi(1410) differ(1259) use(1210) }
{ risk(3053) factor(974) diseas(938) }
{ perform(999) metric(946) measur(919) }
{ research(1085) discuss(1038) issu(1018) }
{ system(1050) medic(1026) inform(1018) }
{ visual(1396) interact(850) tool(830) }
{ compound(1573) activ(1297) structur(1058) }
{ perform(1367) use(1326) method(1137) }
{ studi(1119) effect(1106) posit(819) }
{ blood(1257) pressur(1144) flow(957) }
{ spatial(1525) area(1432) region(1030) }
{ record(1888) medic(1808) patient(1693) }
{ health(3367) inform(1360) care(1135) }
{ monitor(1329) mobil(1314) devic(1160) }
{ ehr(2073) health(1662) electron(1139) }
{ research(1218) medic(880) student(794) }
{ patient(2837) hospit(1953) medic(668) }
{ data(2317) use(1299) case(1017) }
{ age(1611) year(1155) adult(843) }
{ group(2977) signific(1463) compar(1072) }
{ sampl(1606) size(1419) use(1276) }
{ gene(2352) biolog(1181) express(1162) }
{ data(3008) multipl(1320) sourc(1022) }
{ intervent(3218) particip(2042) group(1664) }
{ activ(1138) subject(705) human(624) }
{ time(1939) patient(1703) rate(768) }
{ patient(1821) servic(1111) care(1106) }
{ use(2086) technolog(871) perceiv(783) }
{ analysi(2126) use(1163) compon(1037) }
{ health(1844) social(1437) communiti(874) }
{ structur(1116) can(940) graph(676) }
{ high(1669) rate(1365) level(1280) }
{ cancer(2502) breast(956) screen(824) }
{ drug(1928) target(777) effect(648) }
{ result(1111) use(1088) new(759) }
{ survey(1388) particip(1329) question(1065) }
{ estim(2440) model(1874) function(577) }
{ process(1125) use(805) approach(778) }
{ activ(1452) weight(1219) physic(1104) }
{ method(1969) cluster(1462) data(1082) }
{ detect(2391) sensit(1101) algorithm(908) }

Abstract

Automatic processing of biomedical documents is made difficult by the fact that many of the terms they contain are ambiguous. Word Sense Disambiguation (WSD) systems attempt to resolve these ambiguities and identify the correct meaning. However, the published literature on WSD systems for biomedical documents reports considerable differences in performance across terms. Developing WSD systems is often expensive because of the cost of acquiring the necessary training data. It would therefore be useful to predict in advance the terms on which WSD systems are likely to perform well or badly. This paper explores various methods for estimating the performance of WSD systems on a wide range of ambiguous biomedical terms (including ambiguous words/phrases and abbreviations). The methods include both supervised and unsupervised approaches. The supervised approaches make use of information from labeled training data, while the unsupervised ones rely on the UMLS Metathesaurus. The approaches are evaluated by comparing their predictions of how difficult disambiguation will be for each ambiguous term against the output of two WSD systems. We find that the supervised methods are the best predictors of WSD difficulty, but they are limited by their dependence on labeled training data. The unsupervised methods all perform well in some situations and can be applied more widely.
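
The evaluation described in the abstract (comparing a predicted difficulty for each term against the observed output of a WSD system) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the abstract does not name the difficulty measures or the comparison statistic, so the sketch uses the entropy of the sense distribution in labeled examples as a hypothetical supervised predictor and Spearman rank correlation as a hypothetical comparison. The terms, sense labels, and accuracies are invented toy values, not results from the paper.

```python
import math
from collections import Counter


def sense_entropy(labels):
    """Shannon entropy (in bits) of the sense distribution of one term.

    Assumption: higher entropy (more evenly spread senses) means the term
    is predicted to be harder to disambiguate.
    """
    counts = Counter(labels)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def ranks(values):
    """Return 1-based ranks; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def spearman(xs, ys):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    rx, ry = ranks(xs), ranks(ys)
    n = len(xs)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)


# Toy data (invented): sense labels per ambiguous term, plus a made-up
# accuracy that some WSD system is imagined to achieve on each term.
labeled = {
    "cold": ["temperature"] * 5 + ["illness"] * 5,
    "culture": ["laboratory"] * 9 + ["anthropology"] * 1,
    "BSA": ["bovine serum albumin"] * 10,
}
wsd_accuracy = {"cold": 0.70, "culture": 0.90, "BSA": 0.99}

terms = sorted(labeled)
predicted_difficulty = [sense_entropy(labeled[t]) for t in terms]
observed_accuracy = [wsd_accuracy[t] for t in terms]

# A strongly negative correlation would mean the predictor ranks terms
# in the same order as the system's actual difficulty.
rho = spearman(predicted_difficulty, observed_accuracy)
print(f"Spearman correlation: {rho:.3f}")
```

An unsupervised predictor could be swapped in for `sense_entropy` without changing the comparison step, for example a count of candidate senses per term taken from a lexical resource; that substitution is likewise an assumption rather than the paper's stated method.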

Cleaned Abstract

automat process biomed document made difficult fact mani term contain ambigu word sens disambigu wsd system attempt resolv ambigu identifi correct mean howev publish literatur wsd system biomed document report consider differ perform differ term develop wsd system often expens respect acquir necessari train data therefor use abl predict advanc term wsd system like perform well bad paper explor various method estim perform wsd system wide rang ambigu biomed term includ ambigu wordsphras abbrevi method includ supervis unsupervis approach supervis approach make use inform label train data unsupervis one reli uml metathesaurus approach evalu compar predict difficult disambigu will ambigu term output two wsd system find supervis method best predictor wsd difficulti limit depend label train data unsupervis method perform well situat can appli wide

Similar Abstracts

BMC Med Inform Decis Mak - Dynamic summarization of bibliographic-based data. ( 0,646149689250406 )
J Am Med Inform Assoc - Recommending MeSH terms for annotating biomedical articles. ( 0,633511532832329 )
J Am Med Inform Assoc - Applying active learning to supervised word sense disambiguation in MEDLINE. ( 0,632612694960633 )
J Biomed Inform - Knowledge based word-concept model estimation and refinement for biomedical text mining. ( 0,632531137408255 )
AMIA Annu Symp Proc - Hyperdimensional computing approach to word sense disambiguation. ( 0,631042988840296 )
BMC Med Inform Decis Mak - Mining biomarker information in biomedical literature. ( 0,630961520787863 )
J Biomed Inform - The DDI corpus: an annotated corpus with pharmacological substances and drug-drug interactions. ( 0,628715445020968 )
J Biomed Inform - Classifying temporal relations in clinical data: a hybrid, knowledge-rich approach. ( 0,615287703814327 )
AMIA Annu Symp Proc - A Comprehensive Analysis of Five Million UMLS Metathesaurus Terms Using Eighteen Million MEDLINE Citations. ( 0,614073617769009 )
AMIA Annu Symp Proc - Synonym, topic model and predicate-based query expansion for retrieving clinical documents. ( 0,608142814107446 )
Comput Methods Programs Biomed - Multistage approach for clustering and classification of ECG data. ( 0,607934609335746 )
IEEE Trans Pattern Anal Mach Intell - On the Role of Correlation and Abstraction in Cross-Modal Multimedia Retrieval. ( 0,60188187034267 )
J Am Med Inform Assoc - Induced lexico-syntactic patterns improve information extraction from online medical forums. ( 0,601173682801957 )
J Am Med Inform Assoc - A sequence labeling approach to link medications and their attributes in clinical notes and clinical trial announcements for information extraction. ( 0,589027507861833 )
Int J Med Inform - An exploratory study of a text classification framework for Internet-based surveillance of emerging epidemics. ( 0,587099421835577 )
Methods Inf Med - Developing topic-specific search filters for PubMed with click-through data. ( 0,584599924685871 )
BMC Med Inform Decis Mak - Combining classifiers for robust PICO element detection. ( 0,580840355156148 )
IEEE Trans Image Process - Coaching the exploration and exploitation in active learning for interactive video retrieval. ( 0,575809128902209 )
Methods Inf Med - Chi-square-based scoring function for categorization of MEDLINE citations. ( 0,56863359617198 )
J Integr Bioinform - Evaluating the effect of unbalanced data in biomedical document classification. ( 0,567925450714031 )
J Am Med Inform Assoc - Machine-learned solutions for three stages of clinical information extraction: the state of the art at i2b2 2010. ( 0,567105581386485 )
J Biomed Inform - Using a shallow linguistic kernel for drug-drug interaction extraction. ( 0,564894131721607 )
BMC Med Inform Decis Mak - BOSS: context-enhanced search for biomedical objects. ( 0,564034679136242 )
J Biomed Inform - A knowledge-driven conditional approach to extract pharmacogenomics specific drug-gene relationships from free text. ( 0,563885932473297 )
Brief. Bioinformatics - Conceptual framework and pilot study to benchmark phylogenomic databases based on reference gene trees. ( 0,56162052760785 )
J Biomed Inform - Applying active learning to assertion classification of concepts in clinical text. ( 0,558001797376788 )
J Biomed Inform - Automatic construction of a large-scale and accurate drug-side-effect association knowledge base from biomedical literature. ( 0,557174528419736 )
Comput. Biol. Med. - Parsing citations in biomedical articles using conditional random fields. ( 0,557062080680613 )
AMIA Annu Symp Proc - Part-of-speech tagging for clinical text: wall or bridge between institutions? ( 0,552783207417647 )
BMC Med Inform Decis Mak - Detecting causality from online psychiatric texts using inter-sentential language patterns. ( 0,552531241481436 )
BMC Med Inform Decis Mak - Information discovery on electronic health records using authority flow techniques. ( 0,552185920258484 )
AMIA Annu Symp Proc - Deterministic binary vectors for efficient automated indexing of MEDLINE/PubMed abstracts. ( 0,55159928426661 )
BMC Med Inform Decis Mak - Decision tree-based learning to predict patient controlled analgesia consumption and readjustment. ( 0,549863828974726 )
Comput Methods Programs Biomed - Alternatives to relational database: comparison of NoSQL and XML approaches for clinical data storage. ( 0,5476632987147 )
J Chem Inf Model - Improved chemical text mining of patents with infinite dictionaries and automatic spelling correction. ( 0,547296870507994 )
AMIA Annu Symp Proc - Active Learning-based corpus annotation--the PathoJen experience. ( 0,545508573016174 )
Comput. Biol. Med. - Robust prediction of protein subcellular localization combining PCA and WSVMs. ( 0,544476371420436 )
J. Med. Internet Res. - Automatic evidence retrieval for systematic reviews. ( 0,544241074792585 )
J Biomed Inform - MeSHy: Mining unanticipated PubMed information using frequencies of occurrences and concurrences of MeSH terms. ( 0,541291254343242 )
AMIA Annu Symp Proc - Finding and accessing diagrams in biomedical publications. ( 0,54109408449471 )
J Am Med Inform Assoc - 2010 i2b2/VA challenge on concepts, assertions, and relations in clinical text. ( 0,538172343571747 )
Artif Intell Med - Development of traditional Chinese medicine clinical data warehouse for medical knowledge discovery and decision support. ( 0,535109502732023 )
AMIA Annu Symp Proc - Detecting abbreviations in discharge summaries using machine learning methods. ( 0,533838094694552 )
J Am Med Inform Assoc - The BioIntelligence Framework: a new computational platform for biomedical knowledge computing. ( 0,53223091057052 )
AMIA Annu Symp Proc - Evaluating the Importance of Image-related Text for Ad-hoc and Case-based Biomedical Article Retrieval. ( 0,531813797124554 )
AMIA Annu Symp Proc - An automated approach for ranking journals to help in clinician decision support. ( 0,530919030644345 )
AMIA Annu Symp Proc - TagLine: Information Extraction for Semi-Structured Text in Medical Progress Notes. ( 0,529173233840582 )
Neural Comput - Adaptive metric learning vector quantization for ordinal classification. ( 0,528480707136499 )
Int J Neural Syst - Span: spike pattern association neuron for learning spatio-temporal spike patterns. ( 0,527677092332475 )
J Biomed Inform - Portable automatic text classification for adverse drug reaction detection via multi-corpus training. ( 0,526379070007353 )
IEEE Trans Image Process - Active learning for solving the incomplete data problem in facial age classification by the furthest nearest-neighbor criterion. ( 0,526264722207598 )
J Biomed Inform - Disambiguation of ambiguous biomedical terms using examples generated from the UMLS Metathesaurus. ( 0,525077673542529 )
Int J Med Inform - A methodology to enhance spatial understanding of disease outbreak events reported in news articles. ( 0,52372767754977 )
J Biomed Inform - Reflective random indexing for semi-automatic indexing of the biomedical literature. ( 0,523471108817445 )
IEEE Trans Image Process - Artistic image analysis using graph-based learning approaches. ( 0,523320537016715 )
J Med Syst - 3D similarity-dissimilarity plot for high dimensional data visualization in the context of biomedical pattern classification. ( 0,521484381138248 )
J Am Med Inform Assoc - MITRE system for clinical assertion status classification. ( 0,520008233568115 )
Artif Intell Med - Biomedical events extraction using the hidden vector state model. ( 0,519253118416562 )
J Biomed Inform - Enhancing clinical concept extraction with distributional semantics. ( 0,518318419515285 )
BMC Med Inform Decis Mak - Discovering context-specific relationships from biological literature by using multi-level context terms. ( 0,518053119331323 )
BMC Med Inform Decis Mak - Recognizing clinical entities in hospital discharge summaries using Structural Support Vector Machines with word representation features. ( 0,517651782083922 )
Comput Math Methods Med - Biomarker identification using text mining. ( 0,517407924815591 )
AMIA Annu Symp Proc - Combining corpus-derived sense profiles with estimated frequency information to disambiguate clinical abbreviations. ( 0,516820384598715 )
J Am Med Inform Assoc - Functional evaluation of out-of-the-box text-mining tools for data-mining tasks. ( 0,516116186520846 )
Comput Methods Programs Biomed - BioAnnote: a software platform for annotating biomedical documents with application in medical learning environments. ( 0,515217754933676 )
J. Med. Internet Res. - Web 2.0-based crowdsourcing for high-quality gold standard development in clinical natural language processing. ( 0,514297521698592 )
Brief. Bioinformatics - Multi-stage learning aids applied to hands-on software training. ( 0,514121216845744 )
IEEE Trans Image Process - Comparison of texture analysis schemes under nonideal conditions. ( 0,513547927080709 )
J Biomed Inform - Automatic generation of investigator bibliographies for institutional research networking systems. ( 0,513375449368794 )
AMIA Annu Symp Proc - A bottom-up approach to MEDLINE indexing recommendations. ( 0,512016384786474 )
AMIA Annu Symp Proc - Semantic annotation of clinical events for generating a problem list. ( 0,511321839083821 )
IEEE Trans Neural Netw Learn Syst - A Kernel Classification Framework for Metric Learning. ( 0,51079836165984 )
AMIA Annu Symp Proc - Parenthetically speaking: classifying the contents of parentheses for text mining. ( 0,510559720711656 )
AMIA Annu Symp Proc - Developing a section labeler for clinical documents. ( 0,510322837964126 )
J Biomed Inform - Using PharmGKB to train text mining approaches for identifying potential gene targets for pharmacogenomic studies. ( 0,509508948515626 )
J Biomed Inform - A mutation-centric approach to identifying pharmacogenomic relations in text. ( 0,508824621736851 )
J Biomed Inform - Mining association language patterns using a distributional semantic model for negative life event classification. ( 0,508719760915236 )
IEEE Trans Image Process - Saliency and gist features for target detection in satellite images. ( 0,507323103722155 )
IEEE Trans Pattern Anal Mach Intell - Feature Selection with Conjunctions of Decision Stumps and Learning from Microarray Data. ( 0,50701042146932 )
J. Comput. Biol. - Imbalanced class learning in epigenetics. ( 0,505645132248465 )
IEEE Trans Image Process - Improving Web image search by bag-based reranking. ( 0,504869972376359 )
IEEE Trans Image Process - A Probabilistic Associative Model for Segmenting Weakly-Supervised Images. ( 0,504535907632989 )
Neural Comput - Divergence-based vector quantization. ( 0,504183167118603 )
J Integr Bioinform - The LAILAPS search engine: relevance ranking in life science databases. ( 0,503761218149061 )
J Biomed Inform - Temporal relation discovery between events and temporal expressions identified in clinical narrative. ( 0,503392122191984 )
AMIA Annu Symp Proc - Sophia: A Expedient UMLS Concept Extraction Annotator. ( 0,502493011934676 )
J Am Med Inform Assoc - Learning classification models with soft-label information. ( 0,50193925074982 )
IEEE J Biomed Health Inform - Multiple kernel learning in the primal for multimodal Alzheimer's disease classification. ( 0,500972358190295 )
Comput. Biol. Med. - Identification of voltage-gated potassium channel subfamilies from sequence information using support vector machine. ( 0,500902900727755 )
AMIA Annu Symp Proc - Automatically classifying the role of citations in biomedical articles. ( 0,500609549836906 )
AMIA Annu Symp Proc - Throw the bath water out, keep the baby: keeping medically-relevant terms for text mining. ( 0,500488305554966 )
J Biomed Inform - Reducing systematic review workload through certainty-based screening. ( 0,498545954378612 )
IEEE Trans Image Process - 3D object retrieval with multitopic model combining relevance feedback and LDA model. ( 0,497996519256058 )
Int J Neural Syst - Structurally enhanced incremental neural learning for image classification with subgraph extraction. ( 0,497832673975705 )
AMIA Annu Symp Proc - It's about this and that: a description of anaphoric expressions in clinical text. ( 0,497526276612093 )
J Chem Inf Model - Speeding up chemical searches using the inverted index: the convergence of chemoinformatics and text search methods. ( 0,497304562574072 )
J Am Med Inform Assoc - A la Recherche du Temps Perdu: extracting temporal relations from medical text in the 2012 i2b2 NLP challenge. ( 0,497244379997077 )
AMIA Annu Symp Proc - Automatically Detecting Acute Myocardial Infarction Events from EHR Text: A Preliminary Study. ( 0,496497118361391 )
IEEE Trans Neural Netw Learn Syst - Adaptive Batch Mode Active Learning. ( 0,496154898777512 )
J Chem Inf Model - Training based on ligand efficiency improves prediction of bioactivities of ligands and drug target proteins in a machine learning approach. ( 0,495273564687617 )