Artif Intell Med - Adaptation of machine translation for multilingual information retrieval in the medical domain.

Topics

{ concept(1167) ontolog(924) domain(897) }
{ perform(999) metric(946) measur(919) }
{ search(2224) databas(1162) retriev(909) }
{ method(2212) result(1239) propos(1039) }
{ method(1557) propos(1049) approach(1037) }
{ data(2317) use(1299) case(1017) }
{ implement(1333) system(1263) develop(1122) }
{ imag(2830) propos(1344) filter(1198) }
{ system(1976) rule(880) can(841) }
{ studi(2440) review(1878) systemat(933) }
{ general(901) number(790) one(736) }
{ perform(1367) use(1326) method(1137) }
{ take(945) account(800) differ(722) }
{ assess(1506) score(1403) qualiti(1306) }
{ research(1085) discuss(1038) issu(1018) }
{ visual(1396) interact(850) tool(830) }
{ research(1218) medic(880) student(794) }
{ cost(1906) reduc(1198) effect(832) }
{ time(1939) patient(1703) rate(768) }
{ inform(2794) health(2639) internet(1427) }
{ featur(3375) classif(2383) classifi(1994) }
{ featur(1941) imag(1645) propos(1176) }
{ system(1050) medic(1026) inform(1018) }
{ compound(1573) activ(1297) structur(1058) }
{ studi(1119) effect(1106) posit(819) }
{ gene(2352) biolog(1181) express(1162) }
{ activ(1138) subject(705) human(624) }
{ analysi(2126) use(1163) compon(1037) }
{ cancer(2502) breast(956) screen(824) }
{ imag(1947) propos(1133) code(1026) }
{ data(1737) use(1416) pattern(1282) }
{ measur(2081) correl(1212) valu(896) }
{ framework(1458) process(801) describ(734) }
{ chang(1828) time(1643) increas(1301) }
{ clinic(1479) use(1117) guidelin(835) }
{ data(1714) softwar(1251) tool(1186) }
{ studi(1410) differ(1259) use(1210) }
{ risk(3053) factor(974) diseas(938) }
{ model(2341) predict(2261) use(1141) }
{ health(3367) inform(1360) care(1135) }
{ model(3480) simul(1196) paramet(876) }
{ state(1844) use(1261) util(961) }
{ medic(1828) order(1363) alert(1069) }
{ sampl(1606) size(1419) use(1276) }
{ survey(1388) particip(1329) question(1065) }
{ model(3404) distribut(989) bayesian(671) }
{ can(774) often(719) complex(702) }
{ imag(1057) registr(996) error(939) }
{ bind(1733) structur(1185) ligand(1036) }
{ sequenc(1873) structur(1644) protein(1328) }
{ method(1219) similar(1157) match(930) }
{ network(2748) neural(1063) input(814) }
{ imag(2675) segment(2577) method(1081) }
{ patient(2315) diseas(1263) diabet(1191) }
{ motion(1329) object(1292) video(1091) }
{ treatment(1704) effect(941) patient(846) }
{ surgeri(1148) surgic(1085) robot(1054) }
{ problem(2511) optim(1539) algorithm(950) }
{ error(1145) method(1030) estim(1020) }
{ learn(2355) train(1041) set(1003) }
{ algorithm(1844) comput(1787) effici(935) }
{ extract(1171) text(1153) clinic(932) }
{ design(1359) user(1324) use(1319) }
{ control(1307) perform(991) simul(935) }
{ model(2220) cell(1177) simul(1124) }
{ care(1570) inform(1187) nurs(1089) }
{ method(984) reconstruct(947) comput(926) }
{ case(1353) use(1143) diagnosi(1136) }
{ howev(809) still(633) remain(590) }
{ data(3963) clinic(1234) research(1004) }
{ import(1318) role(1303) understand(862) }
{ blood(1257) pressur(1144) flow(957) }
{ spatial(1525) area(1432) region(1030) }
{ record(1888) medic(1808) patient(1693) }
{ monitor(1329) mobil(1314) devic(1160) }
{ ehr(2073) health(1662) electron(1139) }
{ patient(2837) hospit(1953) medic(668) }
{ model(2656) set(1616) predict(1553) }
{ age(1611) year(1155) adult(843) }
{ signal(2180) analysi(812) frequenc(800) }
{ group(2977) signific(1463) compar(1072) }
{ data(3008) multipl(1320) sourc(1022) }
{ first(2504) two(1366) second(1323) }
{ intervent(3218) particip(2042) group(1664) }
{ patient(1821) servic(1111) care(1106) }
{ use(2086) technolog(871) perceiv(783) }
{ can(981) present(881) function(850) }
{ health(1844) social(1437) communiti(874) }
{ structur(1116) can(940) graph(676) }
{ high(1669) rate(1365) level(1280) }
{ use(976) code(926) identifi(902) }
{ use(1733) differ(960) four(931) }
{ drug(1928) target(777) effect(648) }
{ result(1111) use(1088) new(759) }
{ estim(2440) model(1874) function(577) }
{ decis(3086) make(1611) patient(1517) }
{ process(1125) use(805) approach(778) }
{ activ(1452) weight(1219) physic(1104) }
{ method(1969) cluster(1462) data(1082) }
{ detect(2391) sensit(1101) algorithm(908) }

Abstract

OBJECTIVE: We investigate machine translation (MT) of user search queries in the context of cross-lingual information retrieval (IR) in the medical domain. The main focus is on techniques for adapting MT to increase translation quality; however, we also explore MT adaptation to improve the effectiveness of cross-lingual IR.

METHODS AND DATA: Our MT system is Moses, a state-of-the-art phrase-based statistical machine translation system. The IR system is based on the BM25 retrieval model implemented in the Lucene search engine. The MT techniques employed in this work include in-domain training and tuning, intelligent training data selection, optimization of phrase table configuration, compound splitting, and exploiting synonyms as translation variants. The IR methods include morphological normalization and using multiple translation variants for query expansion. The experiments are performed and thoroughly evaluated on three language pairs: Czech-English, German-English, and French-English. MT quality is evaluated on data sets created within the Khresmoi project, and IR effectiveness is tested on the CLEF eHealth 2013 data sets.

RESULTS: The search query translation results achieved in our experiments are outstanding: our systems outperform not only our strong baselines but also Google Translate and Microsoft Bing Translator in a direct comparison carried out on all the language pairs. BLEU scores increased over the baseline from 26.59 to 41.45 for Czech-English, from 23.03 to 40.82 for German-English, and from 32.67 to 40.82 for French-English. This is a 55% improvement on average. In terms of IR performance on this particular test collection, a significant improvement over the baseline is achieved only for French-English. For Czech-English and German-English, the increased MT quality does not lead to better IR results.

CONCLUSIONS: Most of the MT techniques employed in our experiments improve MT of medical search queries. Intelligent training data selection in particular proves to be very successful for domain adaptation of MT. Some improvement is also obtained from German compound splitting on the source-language side. Translation quality, however, does not appear to correlate with IR performance: better translation does not necessarily yield better retrieval. We discuss in detail the contribution of the individual techniques and state-of-the-art features, and provide future research directions.
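
The retrieval side of the abstract (BM25 scoring plus query expansion from multiple translation variants) can be illustrated with a minimal sketch. This is not the authors' Lucene configuration: the toy corpus, the variant weights, the max-weight merging rule, and all names (`BM25Index`, `expand_query`, `rank`) are assumptions made for illustration, and morphological normalization is omitted. The parameters `k1=1.2` and `b=0.75` are commonly used defaults, not values reported in the abstract.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase whitespace tokenizer (no stemming or lemmatization)."""
    return text.lower().split()


class BM25Index:
    """Tiny in-memory BM25 index over a list of document strings."""

    def __init__(self, docs, k1=1.2, b=0.75):
        self.k1, self.b = k1, b
        self.n = len(docs)
        self.tf = [Counter(tokenize(d)) for d in docs]
        self.doc_len = [sum(c.values()) for c in self.tf]
        self.avg_len = sum(self.doc_len) / self.n
        self.df = Counter()
        for counts in self.tf:
            self.df.update(counts.keys())  # document frequency per term

    def idf(self, term):
        # Non-negative IDF variant: log(1 + (N - df + 0.5) / (df + 0.5)).
        df = self.df.get(term, 0)
        return math.log(1 + (self.n - df + 0.5) / (df + 0.5))

    def score(self, weighted_terms, doc_id):
        """Sum of weighted BM25 term scores for one document."""
        counts = self.tf[doc_id]
        norm = self.k1 * (1 - self.b + self.b * self.doc_len[doc_id] / self.avg_len)
        total = 0.0
        for term, weight in weighted_terms.items():
            f = counts.get(term, 0)
            if f:
                total += weight * self.idf(term) * f * (self.k1 + 1) / (f + norm)
        return total


def expand_query(variants):
    """Merge (translation, weight) variants into one weighted bag of terms.

    Assumption: a term keeps the highest weight of any variant containing it.
    """
    terms = {}
    for text, weight in variants:
        for token in tokenize(text):
            terms[token] = max(terms.get(token, 0.0), weight)
    return terms


def rank(index, variants):
    """Return ids of matching documents, best first (ties broken by id)."""
    query = expand_query(variants)
    scored = [(index.score(query, i), i) for i in range(index.n)]
    return [i for s, i in sorted(scored, key=lambda x: (-x[0], x[1])) if s > 0]


# Hypothetical English collection and two hypothetical translations of one query.
docs = [
    "heart attack symptoms and treatment",
    "myocardial infarction risk factors",
    "seasonal flu vaccination schedule",
]
index = BM25Index(docs)
variants = [("heart attack", 1.0), ("myocardial infarction", 0.6)]

best_only = rank(index, variants[:1])  # single best translation
expanded = rank(index, variants)       # expansion with the second variant
print(best_only, expanded)
```

In this toy setup the second document matches only when the lower-weighted variant is added, which is the kind of recall gain query expansion aims for. Whether such a gain appears on a real collection is an empirical question, as the mixed IR results in the abstract indicate.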

Cleaned Abstract

object investig machin translat mt user search queri context crosslingu inform retriev ir medic domain main focus techniqu adapt mt increas translat qualiti howev also explor mt adapt improv effect crosslingu ir method data mt system mose stateoftheart phrasebas statist machin translat system ir system base bm retriev model implement lucen search engin mt techniqu employ work includ indomain train tune intellig train data select optim phrase tabl configur compound split exploit synonym translat variant ir method includ morpholog normal use multipl translat variant queri expans experi perform thorough evalu three languag pair czechenglish germanenglish frenchenglish mt qualiti evalu data set creat within khresmoi project ir effect test clef ehealth data set result search queri translat result achiev experi outstand system outperform strong baselin also googl translat microsoft bing translat direct comparison carri languag pair baselin bleu score increas czechenglish germanenglish frenchenglish improv averag term ir perform particular test collect signific improv baselin achiev frenchenglish czechenglish germanenglish increas mt qualiti lead better ir result conclus mt techniqu employ experi improv mt medic search queri especi intellig train data select prove success domain adapt mt certain improv also obtain german compound split sourc languag side translat qualiti howev appear correl ir perform better translat necessarili yield better retriev discuss detail contribut individu techniqu stateoftheart featur provid futur research direct

Similar Abstracts

J. Med. Internet Res. - A search engine to access PubMed monolingual subsets: proof of concept and evaluation in French. ( 0,714142134520839 )
AMIA Annu Symp Proc - Statistical machine translation for biomedical text: are we there yet? ( 0,692589276664384 )
AMIA Annu Symp Proc - Semantic Similarity and Relatedness between Clinical Terms: An Experimental Study. ( 0,689440662216347 )
AMIA Annu Symp Proc - Evaluation of semantic-based information retrieval methods in the autism phenotype domain. ( 0,673655782465309 )
Comput Methods Programs Biomed - BiOSS: A system for biomedical ontology selection. ( 0,655307004806729 )
J Chem Inf Model - Speeding up chemical searches using the inverted index: the convergence of chemoinformatics and text search methods. ( 0,636113971196302 )
AMIA Annu Symp Proc - Leveraging user query sessions to improve searching of medical literature. ( 0,634774804537305 )
J Med Syst - Automated mapping of clinical terms into SNOMED-CT. An application to codify procedures in pathology. ( 0,631804591997973 )
AMIA Annu Symp Proc - Deterministic binary vectors for efficient automated indexing of MEDLINE/PubMed abstracts. ( 0,631390510168007 )
Methods Inf Med - Toward a view-oriented approach for aligning RDF-based biomedical repositories. ( 0,625054110211781 )
Brief. Bioinformatics - Evaluation of research in biomedical ontologies. ( 0,625032923929888 )
J Biomed Inform - A hierarchical knowledge-based approach for retrieving similar medical images described with semantic annotations. ( 0,624943397296154 )
J Am Med Inform Assoc - Terminology challenges implementing the HL7 context-aware knowledge retrieval ('Infobutton') standard. ( 0,612726201364899 )
J Biomed Inform - A comparison of evaluation metrics for biomedical journals, articles, and websites in terms of sensitivity to topic. ( 0,61230892978703 )
J Am Med Inform Assoc - Leveraging medical thesauri and physician feedback for improving medical literature retrieval for case queries. ( 0,611147140326991 )
AMIA Annu Symp Proc - NeuroLOG: sharing neuroimaging data using an ontology-based federated approach. ( 0,609145070743877 )
J Biomed Inform - Improving MeSH classification of biomedical articles using citation contexts. ( 0,605265740271673 )
J Biomed Inform - Mammogram retrieval through machine learning within BI-RADS standards. ( 0,604411277532083 )
BMC Med Inform Decis Mak - e-MIR2: a public online inventory of medical informatics resources. ( 0,600265475465118 )
Methods Inf Med - Prioritising lexical patterns to increase axiomatisation in biomedical ontologies. The role of localisation and modularity. ( 0,599519709277303 )
J Biomed Inform - Validating the semantics of a medical iconic language using ontological reasoning. ( 0,599212508259003 )
AMIA Annu Symp Proc - Can SNOMED CT fulfill the vision of a compositional terminology? Analyzing the use case for problem list. ( 0,596103449524875 )
J Biomed Inform - vSPARQL: a view definition language for the semantic web. ( 0,594184975671592 )
J Biomed Inform - An ontology-based measure to compute semantic similarity in biomedicine. ( 0,593109624114487 )
J Biomed Inform - A hybrid knowledge-based and data-driven approach to identifying semantically similar concepts. ( 0,592888224302395 )
AMIA Annu Symp Proc - Does query expansion limit our learning? A comparison of social-based expansion to content-based expansion for medical queries on the internet. ( 0,592372583443373 )
J Biomed Inform - Quality evaluation of cancer study Common Data Elements using the UMLS Semantic Network. ( 0,591424285559115 )
J Biomed Inform - Supporting retrieval of diverse biomedical data using evidence-aware queries. ( 0,589830874880468 )
BMC Med Inform Decis Mak - Mapping turnaround times (TAT) to a generic timeline: a systematic review of TAT definitions in clinical domains. ( 0,587439474603341 )
Artif Intell Med - A semantic graph-based approach to biomedical summarisation. ( 0,586990161404246 )
Artif Intell Med - The Foundational Model of Anatomy in OWL 2 and its use. ( 0,586515758134851 )
BMC Med Inform Decis Mak - Translating the Foundational Model of Anatomy into French using knowledge-based and lexical methods. ( 0,582900734994047 )
Comput Methods Programs Biomed - Searching biosignal databases by content and context: Research Oriented Integration System for ECG Signals (ROISES). ( 0,580691184843359 )
Artif Intell Med - Factors affecting the effectiveness of biomedical document indexing and retrieval based on terminologies. ( 0,57899514175142 )
AMIA Annu Symp Proc - Scalable and High-Throughput Execution of Clinical Quality Measures from Electronic Health Records using MapReduce and the JBoss® Drools Engine. ( 0,578601323045058 )
J Biomed Inform - SNOMED CT module-driven clinical archetype management. ( 0,577514478469465 )
J Biomed Inform - Improving search over Electronic Health Records using UMLS-based query expansion through random walks. ( 0,577059776784919 )
J. Comput. Biol. - Evaluating the significance of protein functional similarity based on gene ontology. ( 0,575152990094906 )
J Biomed Inform - A comparison of two methods for retrieving ICD-9-CM data: the effect of using an ontology-based method for handling terminology changes. ( 0,574841651155128 )
J Am Med Inform Assoc - Benchmarking health IT among OECD countries: better data for better policy. ( 0,572118062395409 )
J Biomed Inform - Using LOINC to link 10 terminology standards to one unified standard in a specialized domain. ( 0,569094820659549 )
BMC Med Inform Decis Mak - Mining biomarker information in biomedical literature. ( 0,568853127281456 )
J Biomed Inform - A network-theoretic approach for decompositional translation across Open Biological Ontologies. ( 0,568000332206137 )
Int J Med Inform - Semantic similarity-based alignment between clinical archetypes and SNOMED CT: an application to observations. ( 0,567601153396889 )
J Biomed Inform - Development and evaluation of an ontology for guiding appropriate antibiotic prescribing. ( 0,566732992947469 )
J Biomed Inform - An approach for the semantic interoperability of ISO EN 13606 and OpenEHR archetypes. ( 0,566681899248884 )
Int J Med Inform - FindZebra: a search engine for rare diseases. ( 0,566572995085418 )
Int J Med Inform - A modified Delphi translation strategy and challenges of International Classification for Nursing Practice (ICNP®). ( 0,565469142424576 )
J Biomed Inform - Discovering beaten paths in collaborative ontology-engineering projects using Markov chains. ( 0,563579715625726 )
Methods Inf Med - Spatial-symbolic Query Engine in Anatomy. ( 0,559296712678354 )
J Biomed Inform - Leveraging concept-based approaches to identify potential phyto-therapies. ( 0,559111394188402 )
J Am Med Inform Assoc - Life sciences domain analysis model. ( 0,558715847304026 )
Methods Inf Med - An evolutionary approach to realism-based adverse event representations. ( 0,558201500101989 )
IEEE Trans Image Process - DSIM: a DisSIMilarity-based image clutter metric for targeting performance. ( 0,556206308680244 )
BMC Med Inform Decis Mak - Querying phenotype-genotype relationships on patient datasets using semantic web technology: the example of Cerebrotendinous xanthomatosis. ( 0,555702469262509 )
AMIA Annu Symp Proc - Identifying Granularity Differences between Large Biomedical Ontologies through Rules. ( 0,554277923852002 )
Methods Inf Med - Putting biomedical ontologies to work. ( 0,552127681603933 )
Int J Med Inform - MEDRank: using graph-based concept ranking to index biomedical texts. ( 0,551625594294598 )
J Med Syst - SeDeLo: using semantics and description logics to support aided clinical diagnosis. ( 0,551371489465256 )
J Biomed Inform - Natural Language Processing methods and systems for biomedical ontology learning. ( 0,551027430415062 )
J Am Med Inform Assoc - Scalable quality assurance for large SNOMED CT hierarchies using subject-based subtaxonomies. ( 0,550696878879794 )
AMIA Annu Symp Proc - Auditing SNOMED Integration into the UMLS for Duplicate Concepts. ( 0,548934831524749 )
IEEE J Biomed Health Inform - Semantic Normalization and Query Abstraction Based on SNOMED-CT and HL7: Supporting Multicentric Clinical Trials. ( 0,548477850578013 )
J Biomed Inform - Integrating reasoning and clinical archetypes using OWL ontologies and SWRL rules. ( 0,547311594979752 )
J Biomed Inform - Translating standards into practice - one Semantic Web API for Gene Expression. ( 0,546641586982098 )
J Biomed Inform - Reuse of termino-ontological resources and text corpora for building a multilingual domain ontology: an application to Alzheimer's disease. ( 0,546584728947896 )
AMIA Annu Symp Proc - Modeling and executing electronic health records driven phenotyping algorithms using the NQF Quality Data Model and JBoss® Drools Engine. ( 0,546391787139976 )
AMIA Annu Symp Proc - Evaluation of RxNorm for Representing Ambulatory Prescriptions. ( 0,54578958409674 )
J Biomed Inform - Semantic mappings and locality of nursing diagnostic concepts in UMLS. ( 0,544755991897019 )
J Biomed Inform - A framework for unifying ontology-based semantic similarity measures: a study in the biomedical domain. ( 0,543803418803419 )
J Biomed Inform - Comparing different knowledge sources for the automatic summarization of biomedical literature. ( 0,543251877017712 )
Brief. Bioinformatics - Fast and efficient searching of biological data resources--using EB-eye. ( 0,541970919420422 )
Perspect Health Inf Manag - Personal health records: is rapid adoption hindering interoperability? ( 0,541970919420422 )
AMIA Annu Symp Proc - Looking for Anemia (and Other Disorders) in SNOMED CT: Comparison of Three Approaches and Practical Implications. ( 0,540245183383624 )
J Am Med Inform Assoc - Automatically extracting sentences from Medline citations to support clinicians' information needs. ( 0,539366189302946 )
J Biomed Inform - A federated semantic metadata registry framework for enabling interoperability across clinical research and care domains. ( 0,538735281502839 )
AMIA Annu Symp Proc - Fostering Multilinguality in the UMLS: A Computational Approach to Terminology Expansion for Multiple Languages. ( 0,537674993068146 )
J Biomed Inform - BOAT: automatic alignment of biomedical ontologies using term informativeness and candidate selection. ( 0,535790027657869 )
AMIA Annu Symp Proc - Characterizing semantic mappings adaptation via biomedical KOS evolution: a case study investigating SNOMED CT and ICD. ( 0,53558534027237 )
AMIA Annu Symp Proc - A family-based framework for supporting quality assurance of biomedical ontologies in BioPortal. ( 0,534681764169105 )
Brief. Bioinformatics - Conceptual framework and pilot study to benchmark phylogenomic databases based on reference gene trees. ( 0,533293267725845 )
Methods Inf Med - Biomedical ontologies: toward scientific debate. ( 0,532444670234875 )
Methods Inf Med - Development of ICD-10-TM ontology for a semi-automated morbidity coding system in Thailand. ( 0,532338215850812 )
J Chem Inf Model - DEKOIS: demanding evaluation kits for objective in silico screening--a versatile tool for benchmarking docking programs and scoring functions. ( 0,531687648492959 )
AMIA Annu Symp Proc - Towards the creation of a visual ontology of biomedical imaging entities. ( 0,531428276843331 )
J Biomed Inform - Enabling international adoption of LOINC through translation. ( 0,530352858423257 )
Methods Inf Med - An architecture for diversity-aware search for medical web content. ( 0,529716491016032 )
Int J Med Inform - Implications of SNOMED CT versioning. ( 0,529706364737041 )
Methods Inf Med - Developing topic-specific search filters for PubMed with click-through data. ( 0,528901295824156 )
J. Med. Internet Res. - Cumulative query method for influenza surveillance using search engine data. ( 0,527251238138702 )
Methods Inf Med - Evaluation of the content coverage of SNOMED CT representing ICNP seven-axis version 1 concepts. ( 0,527008373377659 )
J Biomed Inform - OWL-based reasoning methods for validating archetypes. ( 0,526503934621048 )
Comput Methods Programs Biomed - METRADISC-XL: a program for meta-analysis of multidimensional ranked discovery oriented datasets including microarrays. ( 0,526350145169865 )
J Biomed Inform - Cross-domain targeted ontology subsets for annotation: the case of SNOMED CORE and RxNorm. ( 0,525784007155533 )
J Am Med Inform Assoc - A semantic-web oriented representation of the clinical element model for secondary use of electronic health records data. ( 0,525763007221656 )
IEEE Trans Image Process - Inverse halftoning based on the bayesian theorem. ( 0,524309443956367 )
J Biomed Inform - TRAK ontology: defining standard care for the rehabilitation of knee conditions. ( 0,52412421865546 )
J Biomed Inform - A graph-based recovery and decomposition of Swanson's hypothesis using semantic predications. ( 0,523998556905459 )
J. Med. Internet Res. - Definition of Health 2.0 and Medicine 2.0: a systematic review. ( 0,523036037840205 )
J Biomed Inform - Acquisition and evaluation of verb subcategorization resources for biomedicine. ( 0,522957906579176 )