J Biomed Inform - The DDI corpus: an annotated corpus with pharmacological substances and drug-drug interactions.

Topics

{ extract(1171) text(1153) clinic(932) }
{ learn(2355) train(1041) set(1003) }
{ howev(809) still(633) remain(590) }
{ clinic(1479) use(1117) guidelin(835) }
{ search(2224) databas(1162) retriev(909) }
{ detect(2391) sensit(1101) algorithm(908) }
{ care(1570) inform(1187) nurs(1089) }
{ data(3963) clinic(1234) research(1004) }
{ framework(1458) process(801) describ(734) }
{ design(1359) user(1324) use(1319) }
{ model(2341) predict(2261) use(1141) }
{ activ(1138) subject(705) human(624) }
{ use(1733) differ(960) four(931) }
{ implement(1333) system(1263) develop(1122) }
{ method(2212) result(1239) propos(1039) }
{ featur(3375) classif(2383) classifi(1994) }
{ assess(1506) score(1403) qualiti(1306) }
{ concept(1167) ontolog(924) domain(897) }
{ algorithm(1844) comput(1787) effici(935) }
{ import(1318) role(1303) understand(862) }
{ research(1218) medic(880) student(794) }
{ data(3008) multipl(1320) sourc(1022) }
{ intervent(3218) particip(2042) group(1664) }
{ can(981) present(881) function(850) }
{ health(1844) social(1437) communiti(874) }
{ can(774) often(719) complex(702) }
{ system(1976) rule(880) can(841) }
{ bind(1733) structur(1185) ligand(1036) }
{ sequenc(1873) structur(1644) protein(1328) }
{ motion(1329) object(1292) video(1091) }
{ data(1714) softwar(1251) tool(1186) }
{ research(1085) discuss(1038) issu(1018) }
{ compound(1573) activ(1297) structur(1058) }
{ perform(1367) use(1326) method(1137) }
{ studi(1119) effect(1106) posit(819) }
{ record(1888) medic(1808) patient(1693) }
{ state(1844) use(1261) util(961) }
{ patient(2837) hospit(1953) medic(668) }
{ cost(1906) reduc(1198) effect(832) }
{ first(2504) two(1366) second(1323) }
{ use(976) code(926) identifi(902) }
{ process(1125) use(805) approach(778) }
{ model(3404) distribut(989) bayesian(671) }
{ imag(1947) propos(1133) code(1026) }
{ data(1737) use(1416) pattern(1282) }
{ inform(2794) health(2639) internet(1427) }
{ measur(2081) correl(1212) valu(896) }
{ imag(1057) registr(996) error(939) }
{ method(1219) similar(1157) match(930) }
{ imag(2830) propos(1344) filter(1198) }
{ network(2748) neural(1063) input(814) }
{ imag(2675) segment(2577) method(1081) }
{ patient(2315) diseas(1263) diabet(1191) }
{ take(945) account(800) differ(722) }
{ studi(2440) review(1878) systemat(933) }
{ treatment(1704) effect(941) patient(846) }
{ surgeri(1148) surgic(1085) robot(1054) }
{ problem(2511) optim(1539) algorithm(950) }
{ error(1145) method(1030) estim(1020) }
{ chang(1828) time(1643) increas(1301) }
{ method(1557) propos(1049) approach(1037) }
{ control(1307) perform(991) simul(935) }
{ model(2220) cell(1177) simul(1124) }
{ general(901) number(790) one(736) }
{ method(984) reconstruct(947) comput(926) }
{ featur(1941) imag(1645) propos(1176) }
{ case(1353) use(1143) diagnosi(1136) }
{ studi(1410) differ(1259) use(1210) }
{ risk(3053) factor(974) diseas(938) }
{ perform(999) metric(946) measur(919) }
{ system(1050) medic(1026) inform(1018) }
{ visual(1396) interact(850) tool(830) }
{ blood(1257) pressur(1144) flow(957) }
{ spatial(1525) area(1432) region(1030) }
{ health(3367) inform(1360) care(1135) }
{ model(3480) simul(1196) paramet(876) }
{ monitor(1329) mobil(1314) devic(1160) }
{ ehr(2073) health(1662) electron(1139) }
{ model(2656) set(1616) predict(1553) }
{ data(2317) use(1299) case(1017) }
{ age(1611) year(1155) adult(843) }
{ medic(1828) order(1363) alert(1069) }
{ signal(2180) analysi(812) frequenc(800) }
{ group(2977) signific(1463) compar(1072) }
{ sampl(1606) size(1419) use(1276) }
{ gene(2352) biolog(1181) express(1162) }
{ time(1939) patient(1703) rate(768) }
{ patient(1821) servic(1111) care(1106) }
{ use(2086) technolog(871) perceiv(783) }
{ analysi(2126) use(1163) compon(1037) }
{ structur(1116) can(940) graph(676) }
{ high(1669) rate(1365) level(1280) }
{ cancer(2502) breast(956) screen(824) }
{ drug(1928) target(777) effect(648) }
{ result(1111) use(1088) new(759) }
{ survey(1388) particip(1329) question(1065) }
{ estim(2440) model(1874) function(577) }
{ decis(3086) make(1611) patient(1517) }
{ activ(1452) weight(1219) physic(1104) }
{ method(1969) cluster(1462) data(1082) }

Abstract

The management of drug-drug interactions (DDIs) is a critical issue resulting from the overwhelming amount of information available on them. Natural Language Processing (NLP) techniques can reduce the time healthcare professionals spend reviewing the biomedical literature. However, NLP techniques rely mostly on the availability of annotated corpora. While there are several corpora annotated with biological entities and their relationships, there is a lack of corpora annotated with pharmacological substances and DDIs. Moreover, other works in this field have focused only on pharmacokinetic (PK) DDIs, not on pharmacodynamic (PD) DDIs. To address this problem, we have created a manually annotated corpus consisting of 792 texts selected from the DrugBank database and another 233 Medline abstracts. This fine-grained corpus has been annotated with a total of 18,502 pharmacological substances and 5,028 DDIs, including both PK and PD interactions. The quality and consistency of the annotation process have been ensured through the creation of annotation guidelines and evaluated by measuring the inter-annotator agreement between two annotators. The agreement was almost perfect (Kappa up to 0.96 and generally over 0.80), except for the DDIs in the Medline abstracts (0.55-0.72). The DDI corpus was used in the SemEval 2013 DDIExtraction challenge as a gold standard for evaluating information extraction techniques applied to the recognition of pharmacological substances and the detection of DDIs in biomedical texts. DDIExtraction 2013 attracted wide attention, with a total of 14 teams from 7 different countries. For the recognition and classification of pharmacological names, the best system achieved an F1 of 71.5%, while for the detection and classification of DDIs the best result was an F1 of 65.1%. These results show that the corpus is of sufficient quality to be used for training and testing NLP techniques applied to the field of pharmacovigilance. The DDI corpus and the annotation guidelines are free to use for academic research and are available at http://labda.inf.uc3m.es/ddicorpus.
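
The abstract reports two kinds of figures: inter-annotator agreement (Kappa) and system performance (F1). The sketch below shows how such numbers are commonly computed. It assumes that "Kappa" means Cohen's kappa for two annotators, which the abstract does not state explicitly, and the labels and counts are toy values invented for illustration, not data from the corpus.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labelling the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("both annotators must label the same non-empty item list")
    n = len(labels_a)
    # Observed agreement: fraction of items given the same label by both.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: derived from each annotator's own label frequencies.
    freq_a = Counter(labels_a)
    freq_b = Counter(labels_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if expected == 1.0:
        # Both annotators used a single identical label throughout.
        return 1.0
    return (observed - expected) / (1.0 - expected)


def f1_score(tp, fp, fn):
    """F1 from true-positive, false-positive and false-negative counts."""
    denominator = 2 * tp + fp + fn
    return 2 * tp / denominator if denominator else 0.0


# Hypothetical annotations of ten candidate drug pairs (toy data).
annotator_a = ["ddi", "ddi", "none", "none", "ddi", "none", "none", "none", "ddi", "none"]
annotator_b = ["ddi", "ddi", "none", "none", "none", "none", "none", "none", "ddi", "none"]

kappa = cohen_kappa(annotator_a, annotator_b)
f1 = f1_score(tp=8, fp=2, fn=2)
```

Kappa corrects raw agreement for the agreement expected by chance, which is why it can sit well below the raw percentage of matching labels when one label dominates.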

Cleaned Abstract

manag drugdrug interact ddis critic issu result overwhelm amount inform avail natur languag process nlp techniqu can provid interest way reduc time spent healthcar profession review biomed literatur howev nlp techniqu reli most avail annot corpora sever annot corpora biolog entiti relationship lack corpora annot pharmacolog substanc ddis moreov work field focus pharmacokinet pk ddis pharmacodynam pd ddis address problem creat manual annot corpus consist text select drugbank databas medlin abstract finedgrain corpus annot total pharmacolog substanc ddis includ pk well pd interact qualiti consist annot process ensur creation annot guidelin evalu measur interannot agreement two annot agreement almost perfect kappa general except ddis medlin databas ddi corpus use semev ddiextract challeng gold standard evalu inform extract techniqu appli recognit pharmacolog substanc detect ddis biomed text ddiextract attract wide attent total team differ countri task recognit classif pharmacolog name best system achiev f detect classif ddis best result f result show corpus enough qualiti use train test nlp techniqu appli field pharmacovigil ddi corpus annot guidelin free use academ research avail httplabdainfucmesddicorpus
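
The cleaned abstract above looks like the result of deleting every non-letter character, lowercasing, removing stopwords and stemming. The sketch below reproduces the first three steps under that assumption. The stopword list is a small hypothetical sample, and stemming is left out to keep the example dependency-free, so the output keeps full words such as "management" where the page shows "manag".

```python
import re

# Small illustrative stopword list (an assumption); the page's actual list is unknown.
STOPWORDS = {
    "the", "of", "is", "a", "an", "from", "on", "them", "and", "to",
    "by", "in", "with", "for", "are", "has", "been", "at",
}


def clean_text(text):
    """Strip non-letters, lowercase, tokenize and drop stopwords."""
    # Characters are deleted rather than replaced by spaces, which is why
    # "drug-drug" collapses into the single token "drugdrug".
    letters_only = re.sub(r"[^A-Za-z\s]", "", text)
    tokens = letters_only.lower().split()
    return [token for token in tokens if token not in STOPWORDS]


sentence = "The management of drug-drug interactions (DDIs) is a critical issue."
tokens = clean_text(sentence)
url_tokens = clean_text("http://labda.inf.uc3m.es/ddicorpus")
```

Deleting punctuation and digits outright also explains the last token of the cleaned abstract, where the URL collapses into a single word and the "3" of "uc3m" disappears.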

Similar Abstracts

J Am Med Inform Assoc - 2010 i2b2/VA challenge on concepts, assertions, and relations in clinical text. ( 0,839518008771549 )
J Am Med Inform Assoc - An end-to-end system to identify temporal relation in discharge summaries: 2012 i2b2 challenge. ( 0,83192427754001 )
AMIA Annu Symp Proc - Throw the bath water out, keep the baby: keeping medically-relevant terms for text mining. ( 0,830279312977791 )
Artif Intell Med - Biomedical events extraction using the hidden vector state model. ( 0,828617263751944 )
J Biomed Inform - Towards generating a patient's timeline: extracting temporal relationships from clinical notes. ( 0,82855829598153 )
J Am Med Inform Assoc - Knowledge-based biomedical word sense disambiguation: an evaluation and application to clinical document classification. ( 0,820763269479813 )
AMIA Annu Symp Proc - TagLine: Information Extraction for Semi-Structured Text in Medical Progress Notes. ( 0,819889628979053 )
J. Med. Internet Res. - Web 2.0-based crowdsourcing for high-quality gold standard development in clinical natural language processing. ( 0,816305442323876 )
J Biomed Inform - Lessons learnt from the DDIExtraction-2013 Shared Task. ( 0,814718360875501 )
J Am Med Inform Assoc - Automatic discourse connective detection in biomedical text. ( 0,812516718863032 )
J Biomed Inform - Development and evaluation of RapTAT: a machine learning system for concept mapping of phrases from medical narratives. ( 0,810571965435722 )
J Am Med Inform Assoc - Combining rules and machine learning for extraction of temporal expressions and events from clinical narratives. ( 0,80961374270816 )
AMIA Annu Symp Proc - Inter-annotator reliability of medical events, coreferences and temporal relations in clinical narratives by annotators with varying levels of clinical expertise. ( 0,806840793723354 )
AMIA Annu Symp Proc - Detecting abbreviations in discharge summaries using machine learning methods. ( 0,804672295470533 )
J Biomed Inform - MedTime: a temporal information extraction system for clinical narratives. ( 0,803948581700454 )
J Biomed Inform - A new clustering method for detecting rare senses of abbreviations in clinical notes. ( 0,803933754958493 )
J Am Med Inform Assoc - Anaphoric relations in the clinical narrative: corpus creation. ( 0,80315849323627 )
Comput Methods Programs Biomed - BioAnnote: a software platform for annotating biomedical documents with application in medical learning environments. ( 0,802729059735428 )
J Am Med Inform Assoc - Eventual situations for timeline extraction from clinical reports. ( 0,802397885498232 )
AMIA Annu Symp Proc - Automatically pairing measured findings across narrative abdomen CT reports. ( 0,801030010543371 )
J Am Med Inform Assoc - Evaluating temporal relations in clinical text: 2012 i2b2 Challenge. ( 0,799592047338206 )
J Am Med Inform Assoc - Machine-learned solutions for three stages of clinical information extraction: the state of the art at i2b2 2010. ( 0,797463910193131 )
J Biomed Inform - Evaluating the effects of machine pre-annotation and an interactive annotation interface on manual de-identification of clinical text. ( 0,796947692350903 )
J Biomed Inform - Lexical patterns, features and knowledge resources for coreference resolution in clinical notes. ( 0,794623384634553 )
J Am Med Inform Assoc - A hybrid system for temporal information extraction from clinical text. ( 0,794115687593668 )
J Biomed Inform - A concept-driven biomedical knowledge extraction and visualization framework for conceptualization of text corpora. ( 0,793001391642377 )
J Biomed Inform - Semi-automatic semantic annotation of PubMed queries: a study on quality, efficiency, satisfaction. ( 0,792716538460194 )
J Am Med Inform Assoc - Assessing the role of a medication-indication resource in the treatment relation extraction from clinical text. ( 0,788686943848486 )
J Am Med Inform Assoc - MITRE system for clinical assertion status classification. ( 0,786254555497967 )
Int J Med Inform - Detecting temporal expressions in medical narratives. ( 0,786158468845732 )
AMIA Annu Symp Proc - Developing a section labeler for clinical documents. ( 0,784419394386613 )
J Med Syst - Redactable signatures for signed CDA Documents. ( 0,782936534642032 )
AMIA Annu Symp Proc - Towards a semantic lexicon for clinical natural language processing. ( 0,780842349518982 )
J Biomed Inform - Text de-identification for privacy protection: a study of its impact on clinical text information content. ( 0,777935253748098 )
Int J Med Inform - Detection of infectious symptoms from VA emergency department and primary care clinical documentation. ( 0,777805388176026 )
J Am Med Inform Assoc - MedXN: an open source medication extraction and normalization tool for clinical text. ( 0,776059982134736 )
AMIA Annu Symp Proc - Combining corpus-derived sense profiles with estimated frequency information to disambiguate clinical abbreviations. ( 0,774840130851196 )
AMIA Annu Symp Proc - Natural language processing to extract follow-up provider information from hospital discharge summaries. ( 0,773923494636322 )
J Biomed Inform - Anaphoric reference in clinical reports: characteristics of an annotated corpus. ( 0,770213177840559 )
Comput Math Methods Med - Ranking biomedical annotations with annotator's semantic relevancy. ( 0,770091233143622 )
Appl Clin Inform - Representation of information about family relatives as structured data in electronic health records. ( 0,769890220089588 )
J Am Med Inform Assoc - A flexible framework for recognizing events, temporal expressions, and temporal relations in clinical text. ( 0,767520502713689 )
J Am Med Inform Assoc - Syntactic parsing of clinical text: guideline and corpus development with handling ill-formed sentences. ( 0,767476126636515 )
J Biomed Inform - UMLS content views appropriate for NLP processing of the biomedical literature vs. clinical text. ( 0,767332759005502 )
J Am Med Inform Assoc - Evaluating the impact of pre-annotation on annotation speed and potential bias: natural language processing gold standard development for clinical named entity recognition in clinical trial announcements. ( 0,765735267039804 )
Int J Med Inform - A methodology to enhance spatial understanding of disease outbreak events reported in news articles. ( 0,765355527997112 )
J Am Med Inform Assoc - A sense inventory for clinical abbreviations and acronyms created using clinical notes and medical dictionary resources. ( 0,765170740542701 )
J Am Med Inform Assoc - Automated concept-level information extraction to reduce the need for custom software and rules development. ( 0,764689098395597 )
AMIA Annu Symp Proc - Extracting Concepts Related to Homelessness from the Free Text of VA Electronic Medical Records. ( 0,764148031030673 )
J Integr Bioinform - Automatic extraction of microorganisms and their habitats from free text using text mining workflows. ( 0,761696057104631 )
J Am Med Inform Assoc - Towards comprehensive syntactic and semantic annotations of the clinical narrative. ( 0,759836120617847 )
J Biomed Inform - Annotating temporal information in clinical narratives. ( 0,756874160153883 )
Int J Med Inform - Bootstrapping a de-identification system for narrative patient records: cost-performance tradeoffs. ( 0,756311572712743 )
J Biomed Inform - Approaches to verb subcategorization for biomedicine. ( 0,755912237031984 )
AMIA Annu Symp Proc - Natural language processing for lines and devices in portable chest x-rays. ( 0,755427995877466 )
AMIA Annu Symp Proc - Mapping annotations with textual evidence using an scLDA model. ( 0,7549156504339 )
J Biomed Inform - Text summarization in the biomedical domain: a systematic review of recent research. ( 0,753170282521061 )
J Biomed Inform - NCBI disease corpus: a resource for disease name recognition and concept normalization. ( 0,752467089583619 )
J Am Med Inform Assoc - Functional evaluation of out-of-the-box text-mining tools for data-mining tasks. ( 0,752422220673584 )
AMIA Annu Symp Proc - Semantic processing to identify adverse drug event information from black box warnings. ( 0,751254360541126 )
Brief. Bioinformatics - A survey on annotation tools for the biomedical literature. ( 0,75098216520036 )
J Am Med Inform Assoc - Using machine learning for concept extraction on clinical documents from multiple data sources. ( 0,750646859917376 )
J Biomed Inform - Automatically extracting information needs from complex clinical questions. ( 0,749039249172634 )
J Biomed Inform - Desiderata for ontologies to be used in semantic annotation of biomedical documents. ( 0,748535972252947 )
AMIA Annu Symp Proc - Building gold standard corpora for medical natural language processing tasks. ( 0,747973578860219 )
J Biomed Inform - Identifying non-elliptical entity mentions in a coordinated NP with ellipses. ( 0,746870338913342 )
J Biomed Inform - Extraction of events and temporal expressions from clinical narratives. ( 0,745987319711126 )
Comput. Biol. Med. - Parsing citations in biomedical articles using conditional random fields. ( 0,745877124571532 )
J Am Med Inform Assoc - Automatic abstraction of imaging observations with their characteristics from mammography reports. ( 0,745611876727406 )
AMIA Annu Symp Proc - Automated illustration of patients instructions. ( 0,744995899447195 )
J Am Med Inform Assoc - A comprehensive study of named entity recognition in Chinese clinical text. ( 0,744322928923688 )
J Am Med Inform Assoc - A la Recherche du Temps Perdu: extracting temporal relations from medical text in the 2012 i2b2 NLP challenge. ( 0,74408432654638 )
J Biomed Inform - Detecting hedge cues and their scope in biomedical text with conditional random fields. ( 0,744008090605596 )
AMIA Annu Symp Proc - Parenthetically speaking: classifying the contents of parentheses for text mining. ( 0,743128127086173 )
J Am Med Inform Assoc - Text mining for the Vaccine Adverse Event Reporting System: medical text classification using informative feature selection. ( 0,741978824181667 )
AMIA Annu Symp Proc - Sophia: A Expedient UMLS Concept Extraction Annotator. ( 0,739010183557265 )
J Biomed Inform - Semantator: semantic annotator for converting biomedical text to linked data. ( 0,738128548604253 )
AMIA Annu Symp Proc - Extracting patient demographics and personal medical information from online health forums. ( 0,735688478924182 )
J Am Med Inform Assoc - Induced lexico-syntactic patterns improve information extraction from online medical forums. ( 0,735091413671438 )
AMIA Annu Symp Proc - It's about this and that: a description of anaphoric expressions in clinical text. ( 0,734780507144024 )
J Am Med Inform Assoc - Comprehensive temporal information detection from clinical text: medical events, time, and TLINK identification. ( 0,733653797092386 )
AMIA Annu Symp Proc - Pharmacovigilance on twitter? Mining tweets for adverse drug reactions. ( 0,733308748161921 )
J Biomed Inform - Disambiguation of ambiguous biomedical terms using examples generated from the UMLS Metathesaurus. ( 0,732631589958366 )
AMIA Annu Symp Proc - Using UMLS lexical resources to disambiguate abbreviations in clinical text. ( 0,73053085122228 )
AMIA Annu Symp Proc - Voice-dictated versus typed-in clinician notes: linguistic properties and the potential implications on natural language processing. ( 0,729450120514075 )
J Am Med Inform Assoc - The effect of word familiarity on actual and perceived text difficulty. ( 0,728959815585348 )
AMIA Annu Symp Proc - Part-of-speech tagging for clinical text: wall or bridge between institutions? ( 0,72786501560407 )
J Am Med Inform Assoc - A study of machine-learning-based approaches to extract clinical entities and their assertions from discharge summaries. ( 0,727402561086535 )
J. Med. Internet Res. - Biomedical informatics techniques for processing and analyzing web blogs of military service members. ( 0,725599330428005 )
BMC Med Inform Decis Mak - Text summarization as a decision support aid. ( 0,72558905309053 )
J Am Med Inform Assoc - Large-scale evaluation of automated clinical note de-identification and its impact on information extraction. ( 0,725523046367791 )
AMIA Annu Symp Proc - Extracting semantic lexicons from discharge summaries using machine learning and the C-Value method. ( 0,725142483269773 )
J Biomed Inform - Knowledge based word-concept model estimation and refinement for biomedical text mining. ( 0,723668554149033 )
J Biomed Inform - Enhancing clinical concept extraction with distributional semantics. ( 0,723647469156993 )
Perspect Health Inf Manag - A comparison of two approaches to text processing: facilitating chart reviews of radiology reports in electronic medical records. ( 0,721841427227021 )
J Am Med Inform Assoc - Extracting drug indication information from structured product labels using natural language processing. ( 0,721797155337649 )
J Am Med Inform Assoc - A knowledge discovery and reuse pipeline for information extraction in clinical notes. ( 0,721112563606484 )
AMIA Annu Symp Proc - Critical finding capture in the impression section of radiology reports. ( 0,72088083642532 )
Comput Methods Programs Biomed - Marky: a tool supporting annotation consistency in multi-user and iterative document annotation projects. ( 0,719239790568382 )
J Am Med Inform Assoc - Assisted annotation of medical free text using RapTAT. ( 0,717971117493685 )
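
The page does not say how the scores in parentheses (written with decimal commas) are computed. One common choice for a ranking like this is cosine similarity between per-abstract vectors, for example topic weights. The sketch below illustrates that assumption only; the vectors and titles are hypothetical and are not taken from the page.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length numeric vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    if norm_u == 0.0 or norm_v == 0.0:
        # An all-zero vector has no direction; treat it as dissimilar.
        return 0.0
    return dot / (norm_u * norm_v)


def rank_by_similarity(query, candidates):
    """Return (title, score) pairs sorted from most to least similar."""
    scored = [(title, cosine_similarity(query, vec)) for title, vec in candidates.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical three-topic weight vectors (toy data).
query_vector = [0.7, 0.2, 0.1]
candidate_vectors = {
    "abstract A": [0.6, 0.3, 0.1],
    "abstract B": [0.1, 0.1, 0.8],
}
ranking = rank_by_similarity(query_vector, candidate_vectors)
```

With non-negative weights the score falls between 0 and 1, which is consistent with the range of the values listed above, although that alone does not confirm the measure used.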