Int J Med Inform - Text mining of cancer-related information: review of current status and future directions.

Topics

{ research(1085) discuss(1038) issu(1018) }
{ extract(1171) text(1153) clinic(932) }
{ state(1844) use(1261) util(961) }
{ learn(2355) train(1041) set(1003) }
{ cancer(2502) breast(956) screen(824) }
{ medic(1828) order(1363) alert(1069) }
{ take(945) account(800) differ(722) }
{ search(2224) databas(1162) retriev(909) }
{ implement(1333) system(1263) develop(1122) }
{ data(1737) use(1416) pattern(1282) }
{ method(1219) similar(1157) match(930) }
{ chang(1828) time(1643) increas(1301) }
{ method(1557) propos(1049) approach(1037) }
{ system(1050) medic(1026) inform(1018) }
{ studi(1119) effect(1106) posit(819) }
{ health(1844) social(1437) communiti(874) }
{ high(1669) rate(1365) level(1280) }
{ studi(1410) differ(1259) use(1210) }
{ perform(1367) use(1326) method(1137) }
{ use(976) code(926) identifi(902) }
{ imag(1057) registr(996) error(939) }
{ concept(1167) ontolog(924) domain(897) }
{ general(901) number(790) one(736) }
{ howev(809) still(633) remain(590) }
{ record(1888) medic(1808) patient(1693) }
{ group(2977) signific(1463) compar(1072) }
{ first(2504) two(1366) second(1323) }
{ intervent(3218) particip(2042) group(1664) }
{ model(3404) distribut(989) bayesian(671) }
{ can(774) often(719) complex(702) }
{ imag(1947) propos(1133) code(1026) }
{ measur(2081) correl(1212) valu(896) }
{ studi(2440) review(1878) systemat(933) }
{ algorithm(1844) comput(1787) effici(935) }
{ data(1714) softwar(1251) tool(1186) }
{ compound(1573) activ(1297) structur(1058) }
{ health(3367) inform(1360) care(1135) }
{ signal(2180) analysi(812) frequenc(800) }
{ gene(2352) biolog(1181) express(1162) }
{ patient(1821) servic(1111) care(1106) }
{ analysi(2126) use(1163) compon(1037) }
{ structur(1116) can(940) graph(676) }
{ result(1111) use(1088) new(759) }
{ survey(1388) particip(1329) question(1065) }
{ inform(2794) health(2639) internet(1427) }
{ system(1976) rule(880) can(841) }
{ bind(1733) structur(1185) ligand(1036) }
{ sequenc(1873) structur(1644) protein(1328) }
{ featur(3375) classif(2383) classifi(1994) }
{ imag(2830) propos(1344) filter(1198) }
{ network(2748) neural(1063) input(814) }
{ imag(2675) segment(2577) method(1081) }
{ patient(2315) diseas(1263) diabet(1191) }
{ motion(1329) object(1292) video(1091) }
{ assess(1506) score(1403) qualiti(1306) }
{ treatment(1704) effect(941) patient(846) }
{ surgeri(1148) surgic(1085) robot(1054) }
{ framework(1458) process(801) describ(734) }
{ problem(2511) optim(1539) algorithm(950) }
{ error(1145) method(1030) estim(1020) }
{ clinic(1479) use(1117) guidelin(835) }
{ design(1359) user(1324) use(1319) }
{ control(1307) perform(991) simul(935) }
{ model(2220) cell(1177) simul(1124) }
{ care(1570) inform(1187) nurs(1089) }
{ method(984) reconstruct(947) comput(926) }
{ featur(1941) imag(1645) propos(1176) }
{ case(1353) use(1143) diagnosi(1136) }
{ data(3963) clinic(1234) research(1004) }
{ risk(3053) factor(974) diseas(938) }
{ perform(999) metric(946) measur(919) }
{ import(1318) role(1303) understand(862) }
{ model(2341) predict(2261) use(1141) }
{ visual(1396) interact(850) tool(830) }
{ blood(1257) pressur(1144) flow(957) }
{ spatial(1525) area(1432) region(1030) }
{ model(3480) simul(1196) paramet(876) }
{ monitor(1329) mobil(1314) devic(1160) }
{ ehr(2073) health(1662) electron(1139) }
{ research(1218) medic(880) student(794) }
{ patient(2837) hospit(1953) medic(668) }
{ model(2656) set(1616) predict(1553) }
{ data(2317) use(1299) case(1017) }
{ age(1611) year(1155) adult(843) }
{ cost(1906) reduc(1198) effect(832) }
{ sampl(1606) size(1419) use(1276) }
{ data(3008) multipl(1320) sourc(1022) }
{ activ(1138) subject(705) human(624) }
{ time(1939) patient(1703) rate(768) }
{ use(2086) technolog(871) perceiv(783) }
{ can(981) present(881) function(850) }
{ use(1733) differ(960) four(931) }
{ drug(1928) target(777) effect(648) }
{ estim(2440) model(1874) function(577) }
{ decis(3086) make(1611) patient(1517) }
{ process(1125) use(805) approach(778) }
{ activ(1452) weight(1219) physic(1104) }
{ method(1969) cluster(1462) data(1082) }
{ method(2212) result(1239) propos(1039) }
{ detect(2391) sensit(1101) algorithm(908) }

Abstract

PURPOSE: This paper reviews the research literature on text mining (TM) with the aim to find out (1) which cancer domains have been the subject of TM efforts, (2) which knowledge resources can support TM of cancer-related information and (3) to what extent systems that rely on knowledge and computational methods can convert text data into useful clinical information. These questions were used to determine the current state of the art in this particular strand of TM and suggest future directions in TM development to support cancer research. METHODS: A review of the research on TM of cancer-related information was carried out. A literature search was conducted on the Medline database as well as IEEE Xplore and ACM digital libraries to address the interdisciplinary nature of such research. The search results were supplemented with the literature identified through Google Scholar. RESULTS: A range of studies have proven the feasibility of TM for extracting structured information from clinical narratives such as those found in pathology or radiology reports. In this article, we provide a critical overview of the current state of the art for TM related to cancer. The review highlighted a strong bias towards symbolic methods, e.g. named entity recognition (NER) based on dictionary lookup and information extraction (IE) relying on pattern matching. The F-measure of NER ranges between 80% and 90%, while that of IE for simple tasks is in the high 90s. To further improve the performance, TM approaches need to deal effectively with idiosyncrasies of the clinical sublanguage such as non-standard abbreviations as well as a high degree of spelling and grammatical errors. This requires a shift from rule-based methods to machine learning following the success of similar trends in biological applications of TM. Machine learning approaches require large training datasets, but clinical narratives are not readily available for TM research due to privacy and confidentiality concerns. This issue remains the main bottleneck for progress in this area. In addition, there is a need for a comprehensive cancer ontology that would enable semantic representation of textual information found in narrative reports.
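As an illustration of the symbolic approach the abstract describes, the following is a minimal sketch of dictionary-lookup NER together with the F-measure used to score it. It is not taken from any of the reviewed systems: the lexicon `CANCER_TERMS`, the labels, and the function names `dictionary_ner` and `f_measure` are all hypothetical, and a real system would presumably draw its terms from a curated knowledge resource rather than a hand-written table.

```python
import re

# Hypothetical toy lexicon (an assumption for illustration only).
CANCER_TERMS = {
    "ductal carcinoma": "DIAGNOSIS",
    "carcinoma": "DIAGNOSIS",
    "left breast": "ANATOMY",
    "lymph node": "ANATOMY",
}


def dictionary_ner(text, lexicon=CANCER_TERMS):
    """Return sorted (start, end, label) spans found by exact dictionary lookup."""
    lowered = text.lower()
    spans = []
    # Try longer terms first so "ductal carcinoma" wins over "carcinoma".
    for term in sorted(lexicon, key=len, reverse=True):
        pattern = r"\b" + re.escape(term) + r"\b"
        for match in re.finditer(pattern, lowered):
            # Skip a match that overlaps a span already accepted.
            overlaps = any(match.start() < end and start < match.end()
                           for start, end, _ in spans)
            if not overlaps:
                spans.append((match.start(), match.end(), lexicon[term]))
    return sorted(spans)


def f_measure(predicted, gold):
    """Balanced F-measure: harmonic mean of precision and recall over exact spans."""
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(gold)
    return 2 * precision * recall / (precision + recall)
```

Exact lookup of this kind misses misspellings and non-standard abbreviations, which is the limitation the abstract gives as motivation for moving towards machine learning.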

Cleaned Abstract

purpos paper review research literatur text mine tm aim find cancer domain subject tm effort knowledg resourc can support tm cancerrel inform extent system reli knowledg comput method can convert text data use clinic inform question use determin current state art particular strand tm suggest futur direct tm develop support cancer research method review research tm cancerrel inform carri literatur search conduct medlin databas well ieee xplore acm digit librari address interdisciplinari natur research search result supplement literatur identifi googl scholar result rang studi proven feasibl tm extract structur inform clinic narrat found patholog radiolog report articl provid critic overview current state art tm relat cancer review highlight strong bias toward symbol method eg name entiti recognit ner base dictionari lookup inform extract ie reli pattern match fmeasur ner rang ie simpl task high s improv perform tm approach need deal effect idiosyncrasi clinic sublanguag nonstandard abbrevi well high degre spell grammat error requir shift rulebas method machin learn follow success similar trend biolog applic tm machin learn approach requir larg train dataset clinic narrat readili avail tm research due privaci confidenti concern issu remain main bottleneck progress area addit need comprehens cancer ontolog enabl semant represent textual inform found narrat report

Similar Abstracts

J Am Med Inform Assoc - Temporal reasoning over clinical text: the state of the art. ( 0,697557772297451 )
J Biomed Inform - Approaches to verb subcategorization for biomedicine. ( 0,661639125911539 )
J Biomed Inform - Biomedical text mining and its applications in cancer research. ( 0,660030684611782 )
J Biomed Inform - Natural language processing: state of the art and prospects for significant progress, a workshop sponsored by the National Library of Medicine. ( 0,646695626872969 )
BMC Med Inform Decis Mak - GenDrux: a biomedical literature search system to identify gene expression-based drug sensitivity in breast cancer. ( 0,625406248349972 )
J Am Med Inform Assoc - 2010 i2b2/VA challenge on concepts, assertions, and relations in clinical text. ( 0,600154214559611 )
Res Synth Methods - The evolution of a new publication type: Steps and challenges of producing overviews of reviews. ( 0,600042574684142 )
J Am Med Inform Assoc - Pathology imaging informatics for quantitative analysis of whole-slide images. ( 0,598665890364593 )
Int J Med Inform - A methodology to enhance spatial understanding of disease outbreak events reported in news articles. ( 0,597059522343087 )
J Biomed Inform - Semi-automatic semantic annotation of PubMed queries: a study on quality, efficiency, satisfaction. ( 0,596829186752064 )
J Clin Monit Comput - Translational applications of evaluating physiologic variability in human endotoxemia. ( 0,596578977234989 )
AMIA Annu Symp Proc - BioSimplify: an open source sentence simplification engine to improve recall in automatic biomedical information extraction. ( 0,595276704024569 )
Med Decis Making - Automatically annotating topics in transcripts of patient-provider interactions via machine learning. ( 0,592682777380347 )
Telemed J E Health - Application of health technology in humanitarian response: U.S. Military deployed health technology summit--a summary. ( 0,58874212604857 )
J Biomed Inform - Text summarization in the biomedical domain: a systematic review of recent research. ( 0,586470324946613 )
J Biomed Inform - Knowledge based word-concept model estimation and refinement for biomedical text mining. ( 0,582247013538323 )
AMIA Annu Symp Proc - Evaluating the Importance of Image-related Text for Ad-hoc and Case-based Biomedical Article Retrieval. ( 0,58071846683373 )
J Biomed Inform - Degree centrality for semantic abstraction summarization of therapeutic studies. ( 0,575772051666173 )
Telemed J E Health - The patient-centered medical home and health information technology. ( 0,575084435471504 )
J Biomed Inform - UMLS content views appropriate for NLP processing of the biomedical literature vs. clinical text. ( 0,573353338899834 )
J Biomed Inform - Unsupervised biomedical named entity recognition: experiments with clinical and biological texts. ( 0,57183441911898 )
BMC Med Inform Decis Mak - Semantic text mining support for lignocellulose research. ( 0,569503090250875 )
J Biomed Inform - Anaphoric reference in clinical reports: characteristics of an annotated corpus. ( 0,56936233893264 )
J Am Med Inform Assoc - A la Recherche du Temps Perdu: extracting temporal relations from medical text in the 2012 i2b2 NLP challenge. ( 0,568532285542872 )
J Biomed Inform - Lessons learnt from the DDIExtraction-2013 Shared Task. ( 0,566088907087293 )
BMC Med Inform Decis Mak - Mining biomarker information in biomedical literature. ( 0,565791250232331 )
Res Synth Methods - Meta-research: The art of getting it wrong. ( 0,564642177682804 )
IEEE Trans Neural Netw Learn Syst - Kernel association for classification and prediction: a survey. ( 0,562681554391918 )
Int J Med Inform - Socio-technical issues and challenges in implementing safe patient handovers: insights from ethnographic case studies. ( 0,562637390369269 )
AMIA Annu Symp Proc - Part-of-speech tagging for clinical text: wall or bridge between institutions? ( 0,559225387277246 )
J Biomed Inform - Selecting information in electronic health records for knowledge acquisition. ( 0,556781346193065 )
Brief. Bioinformatics - Modern bioinformatics meets traditional Chinese medicine. ( 0,556300716732047 )
Int J Med Inform - Telemedicine--a bibliometric and content analysis of 17,932 publication records. ( 0,553657781921156 )
J Am Med Inform Assoc - Recommending MeSH terms for annotating biomedical articles. ( 0,553565707676851 )
J Biomed Inform - Enhancing clinical concept extraction with distributional semantics. ( 0,552798052983454 )
AMIA Annu Symp Proc - Developing a section labeler for clinical documents. ( 0,551756073399335 )
J Biomed Inform - The DDI corpus: an annotated corpus with pharmacological substances and drug-drug interactions. ( 0,549214786674693 )
Int J Med Inform - Implementation science approaches for integrating eHealth research into practice and policy. ( 0,548667826246018 )
J. Med. Internet Res. - Issues in mHealth: findings from key informant interviews. ( 0,548142677659956 )
Comput Math Methods Med - Automated bone age assessment: motivation, taxonomies, and challenges. ( 0,546474189838334 )
J Med Syst - A review of tags anti-collision and localization protocols in RFID networks. ( 0,545870416316024 )
Med Decis Making - A note on the expected biases in conventional iterative health state valuation protocols. ( 0,544784498626732 )
J Am Med Inform Assoc - Automated concept-level information extraction to reduce the need for custom software and rules development. ( 0,544499511859673 )
BMC Med Inform Decis Mak - Technical challenges of providing record linkage services for research. ( 0,544171476999214 )
J Biomed Inform - Time motion studies in healthcare: what are we talking about? ( 0,54411404449543 )
J Am Med Inform Assoc - MITRE system for clinical assertion status classification. ( 0,543747029152815 )
AMIA Annu Symp Proc - The Lexicon Builder Web service: Building Custom Lexicons from two hundred Biomedical Ontologies. ( 0,542170946096639 )
J Biomed Inform - Text mining for traditional Chinese medical knowledge discovery: a survey. ( 0,542020571409118 )
AMIA Annu Symp Proc - Active Learning-based corpus annotation--the PathoJen experience. ( 0,541011221238262 )
J Am Med Inform Assoc - Knowledge-based biomedical word sense disambiguation: an evaluation and application to clinical document classification. ( 0,540198037706822 )
AMIA Annu Symp Proc - Pharmacovigilance on twitter? Mining tweets for adverse drug reactions. ( 0,539946619325805 )
Artif Intell Med - Biomedical events extraction using the hidden vector state model. ( 0,539629783527188 )
J Am Med Inform Assoc - Extracting drug indication information from structured product labels using natural language processing. ( 0,539391946164972 )
AMIA Annu Symp Proc - Extracting patient demographics and personal medical information from online health forums. ( 0,537710940260976 )
J Biomed Inform - Ontology modularization to improve semantic medical image annotation. ( 0,53767655275068 )
AMIA Annu Symp Proc - Sophia: A Expedient UMLS Concept Extraction Annotator. ( 0,535949724827978 )
J. Med. Internet Res. - Developing a disease outbreak event corpus. ( 0,533876244129567 )
J Am Med Inform Assoc - Recent trends in biomedical informatics: a study based on JAMIA articles. ( 0,53370598131072 )
AMIA Annu Symp Proc - TextHunter--A User Friendly Tool for Extracting Generic Concepts from Free Text in Clinical Research. ( 0,533160203526352 )
J Am Med Inform Assoc - Automatic discourse connective detection in biomedical text. ( 0,531805374983846 )
J Am Med Inform Assoc - Evaluating temporal relations in clinical text: 2012 i2b2 Challenge. ( 0,531168643513178 )
AMIA Annu Symp Proc - It's about this and that: a description of anaphoric expressions in clinical text. ( 0,530190927155504 )
J Biomed Inform - Automatically extracting information needs from complex clinical questions. ( 0,529633309405384 )
J Biomed Inform - Using a shallow linguistic kernel for drug-drug interaction extraction. ( 0,527564897531694 )
J Am Med Inform Assoc - Induced lexico-syntactic patterns improve information extraction from online medical forums. ( 0,527448411572267 )
J Am Med Inform Assoc - Toward a science of learning systems: a research agenda for the high-functioning Learning Health System. ( 0,527178414817809 )
AMIA Annu Symp Proc - Informing standard development and understanding user needs with omaha system signs and symptoms text entries in community-based care settings. ( 0,524872353392131 )
J Biomed Inform - Relation mining experiments in the pharmacogenomics domain. ( 0,524506149738833 )
Brief. Bioinformatics - Measuring the microbiome: perspectives on advances in DNA-based techniques for exploring microbial life. ( 0,524448570196621 )
AMIA Annu Symp Proc - A machine learning approach for identifying anatomical locations of actionable findings in radiology reports. ( 0,524266192065376 )
J Am Med Inform Assoc - Combining rules and machine learning for extraction of temporal expressions and events from clinical narratives. ( 0,524244011347632 )
J Am Med Inform Assoc - BoB, a best-of-breed automated text de-identification system for VHA clinical documents. ( 0,523380412714244 )
Comput. Biol. Med. - Parsing citations in biomedical articles using conditional random fields. ( 0,523152775940429 )
Comput Math Methods Med - Retracted: Bayes clustering and structural support vector machines for segmentation of carotid artery plaques in multicontrast MRI. ( 0,521658950142383 )
Artif Intell Med - Recommendations for the ethical use and design of artificial intelligent care providers. ( 0,520808906871111 )
J Am Med Inform Assoc - A sense inventory for clinical abbreviations and acronyms created using clinical notes and medical dictionary resources. ( 0,520182586038684 )
Comput Math Methods Med - Ranking biomedical annotations with annotator's semantic relevancy. ( 0,518899189517707 )
AMIA Annu Symp Proc - Generalizability and comparison of automatic clinical text de-identification methods and resources. ( 0,518537458796446 )
J Biomed Inform - Development and evaluation of RapTAT: a machine learning system for concept mapping of phrases from medical narratives. ( 0,51820766534944 )
J Am Med Inform Assoc - An integrative framework for sensor-based measurement of teamwork in healthcare. ( 0,516982052300663 )
AMIA Annu Symp Proc - Inferring the semantic relationships of words within an ontology using random indexing: applications to pharmacogenomics. ( 0,516452939393306 )
J Am Med Inform Assoc - Advanced networks and computing in healthcare. ( 0,516434136249089 )
J Am Med Inform Assoc - Functional evaluation of out-of-the-box text-mining tools for data-mining tasks. ( 0,516413080849334 )
AMIA Annu Symp Proc - Hyperdimensional computing approach to word sense disambiguation. ( 0,5162127248983 )
J Am Med Inform Assoc - Comprehensive temporal information detection from clinical text: medical events, time, and TLINK identification. ( 0,516117675474089 )
AMIA Annu Symp Proc - Using language models to identify relevant new information in inpatient clinical notes. ( 0,515775348521823 )
J Biomed Inform - Towards generating a patient's timeline: extracting temporal relationships from clinical notes. ( 0,515298891541796 )
J Biomed Inform - Automated curation of gene name normalization results using the Konstanz information miner. ( 0,515215166443034 )
J Biomed Inform - Detecting hedge cues and their scope in biomedical text with conditional random fields. ( 0,514766323572164 )
J Am Med Inform Assoc - MedXN: an open source medication extraction and normalization tool for clinical text. ( 0,514354230118933 )
AMIA Annu Symp Proc - Detecting abbreviations in discharge summaries using machine learning methods. ( 0,513624785801933 )
Brief. Bioinformatics - Environmental bio-monitoring with high-throughput sequencing. ( 0,513610897106615 )
J Biomed Inform - NCBI disease corpus: a resource for disease name recognition and concept normalization. ( 0,512788328064206 )
AMIA Annu Symp Proc - Natural language processing to extract follow-up provider information from hospital discharge summaries. ( 0,512764985242506 )
BMC Med Inform Decis Mak - Dynamic summarization of bibliographic-based data. ( 0,512665181798936 )
Brief. Bioinformatics - The automatic annotation of bacterial genomes. ( 0,512570657609503 )
J Am Med Inform Assoc - Machine-learned solutions for three stages of clinical information extraction: the state of the art at i2b2 2010. ( 0,512156300749322 )
Comput. Biol. Med. - A P300-based brain computer interface system for words typing. ( 0,511978944629216 )
AMIA Annu Symp Proc - Using UMLS lexical resources to disambiguate abbreviations in clinical text. ( 0,510317715923303 )
Sci Data - Building the graph of medicine from millions of clinical narratives. ( 0,509831364210698 )