J. Med. Internet Res. - Developing a disease outbreak event corpus.

Topics

{ extract(1171) text(1153) clinic(932) }
{ data(2317) use(1299) case(1017) }
{ medic(1828) order(1363) alert(1069) }
{ imag(1947) propos(1133) code(1026) }
{ design(1359) user(1324) use(1319) }
{ research(1085) discuss(1038) issu(1018) }
{ error(1145) method(1030) estim(1020) }
{ inform(2794) health(2639) internet(1427) }
{ featur(3375) classif(2383) classifi(1994) }
{ health(3367) inform(1360) care(1135) }
{ method(1219) similar(1157) match(930) }
{ network(2748) neural(1063) input(814) }
{ case(1353) use(1143) diagnosi(1136) }
{ model(2341) predict(2261) use(1141) }
{ studi(1119) effect(1106) posit(819) }
{ result(1111) use(1088) new(759) }
{ howev(809) still(633) remain(590) }
{ import(1318) role(1303) understand(862) }
{ time(1939) patient(1703) rate(768) }
{ can(981) present(881) function(850) }
{ imag(2675) segment(2577) method(1081) }
{ framework(1458) process(801) describ(734) }
{ featur(1941) imag(1645) propos(1176) }
{ data(3963) clinic(1234) research(1004) }
{ risk(3053) factor(974) diseas(938) }
{ perform(999) metric(946) measur(919) }
{ spatial(1525) area(1432) region(1030) }
{ age(1611) year(1155) adult(843) }
{ sampl(1606) size(1419) use(1276) }
{ gene(2352) biolog(1181) express(1162) }
{ use(2086) technolog(871) perceiv(783) }
{ use(976) code(926) identifi(902) }
{ activ(1452) weight(1219) physic(1104) }
{ model(3404) distribut(989) bayesian(671) }
{ can(774) often(719) complex(702) }
{ data(1737) use(1416) pattern(1282) }
{ system(1976) rule(880) can(841) }
{ measur(2081) correl(1212) valu(896) }
{ imag(1057) registr(996) error(939) }
{ bind(1733) structur(1185) ligand(1036) }
{ sequenc(1873) structur(1644) protein(1328) }
{ imag(2830) propos(1344) filter(1198) }
{ patient(2315) diseas(1263) diabet(1191) }
{ take(945) account(800) differ(722) }
{ studi(2440) review(1878) systemat(933) }
{ motion(1329) object(1292) video(1091) }
{ assess(1506) score(1403) qualiti(1306) }
{ treatment(1704) effect(941) patient(846) }
{ surgeri(1148) surgic(1085) robot(1054) }
{ problem(2511) optim(1539) algorithm(950) }
{ chang(1828) time(1643) increas(1301) }
{ learn(2355) train(1041) set(1003) }
{ concept(1167) ontolog(924) domain(897) }
{ clinic(1479) use(1117) guidelin(835) }
{ algorithm(1844) comput(1787) effici(935) }
{ method(1557) propos(1049) approach(1037) }
{ data(1714) softwar(1251) tool(1186) }
{ control(1307) perform(991) simul(935) }
{ model(2220) cell(1177) simul(1124) }
{ care(1570) inform(1187) nurs(1089) }
{ general(901) number(790) one(736) }
{ method(984) reconstruct(947) comput(926) }
{ search(2224) databas(1162) retriev(909) }
{ studi(1410) differ(1259) use(1210) }
{ system(1050) medic(1026) inform(1018) }
{ visual(1396) interact(850) tool(830) }
{ compound(1573) activ(1297) structur(1058) }
{ perform(1367) use(1326) method(1137) }
{ blood(1257) pressur(1144) flow(957) }
{ record(1888) medic(1808) patient(1693) }
{ model(3480) simul(1196) paramet(876) }
{ monitor(1329) mobil(1314) devic(1160) }
{ ehr(2073) health(1662) electron(1139) }
{ state(1844) use(1261) util(961) }
{ research(1218) medic(880) student(794) }
{ patient(2837) hospit(1953) medic(668) }
{ model(2656) set(1616) predict(1553) }
{ signal(2180) analysi(812) frequenc(800) }
{ cost(1906) reduc(1198) effect(832) }
{ group(2977) signific(1463) compar(1072) }
{ data(3008) multipl(1320) sourc(1022) }
{ first(2504) two(1366) second(1323) }
{ intervent(3218) particip(2042) group(1664) }
{ activ(1138) subject(705) human(624) }
{ patient(1821) servic(1111) care(1106) }
{ analysi(2126) use(1163) compon(1037) }
{ health(1844) social(1437) communiti(874) }
{ structur(1116) can(940) graph(676) }
{ high(1669) rate(1365) level(1280) }
{ cancer(2502) breast(956) screen(824) }
{ use(1733) differ(960) four(931) }
{ drug(1928) target(777) effect(648) }
{ implement(1333) system(1263) develop(1122) }
{ survey(1388) particip(1329) question(1065) }
{ estim(2440) model(1874) function(577) }
{ decis(3086) make(1611) patient(1517) }
{ process(1125) use(805) approach(778) }
{ method(1969) cluster(1462) data(1082) }
{ method(2212) result(1239) propos(1039) }
{ detect(2391) sensit(1101) algorithm(908) }

Abstract

BACKGROUND: In recent years, there has been a growth in work on the use of information extraction technologies for tracking disease outbreaks from online news texts, yet publicly available evaluation standards (and associated resources) for this new area of research have been noticeably lacking.
OBJECTIVE: This study seeks to create a "gold standard" data set against which to test how accurately disease outbreak information extraction systems can identify the semantics of disease outbreak events. Additionally, we hope that providing an annotation scheme (and associated corpus) to the community will encourage open evaluation in this new and growing application area.
METHODS: We developed an annotation scheme for identifying infectious disease outbreak events in news texts. An event, in the context of our annotation scheme, consists minimally of geographical (eg, country and province) and disease name information. However, the scheme also allows for the rich encoding of other domain-salient concepts (eg, international travel, species, and food contamination).
RESULTS: The work resulted in a 200-document corpus of event-annotated disease outbreak reports that can be used to evaluate the accuracy of event detection algorithms (in this case, for the BioCaster biosurveillance online news information extraction system). In the 200 documents, 394 distinct events were identified (mean 1.97 events per document, range 0-25 events per document). We also provide a download script and graphical user interface (GUI)-based event browsing software to facilitate corpus exploration.
CONCLUSION: In summary, we present an annotation scheme and corpus that can be used in the evaluation of disease outbreak event extraction algorithms. The annotation scheme and corpus were designed with both the particular evaluation requirements of the BioCaster system and the wider need for further evaluation resources in this growing research area in mind.
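
The minimal event definition in METHODS and the per-document mean in RESULTS can be illustrated with a short sketch. This is not the corpus's actual schema or the BioCaster format: the class `OutbreakEvent`, its field names, the helper `mean_events_per_document`, and the sample values are all hypothetical and chosen only for illustration. The sketch assumes an event is "minimal" when it names a disease and at least one geographical field.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class OutbreakEvent:
    """Hypothetical event record; field names are assumptions, not the corpus schema."""
    disease: str
    country: Optional[str] = None
    province: Optional[str] = None
    # Optional domain-salient concepts named in the abstract
    # (e.g., international travel, species, food contamination).
    attributes: Dict[str, str] = field(default_factory=dict)

    def is_minimal(self) -> bool:
        """True if the event has a disease name and some geographical information."""
        has_disease = bool(self.disease and self.disease.strip())
        has_location = any(
            value and value.strip() for value in (self.country, self.province)
        )
        return has_disease and has_location


def mean_events_per_document(events_per_doc: List[int]) -> float:
    """Arithmetic mean of per-document event counts."""
    if not events_per_doc:
        raise ValueError("at least one document is required")
    return sum(events_per_doc) / len(events_per_doc)


# The reported mean follows directly from the totals in the abstract:
# 394 events across 200 documents.
REPORTED_MEAN = round(394 / 200, 2)

if __name__ == "__main__":
    # Illustrative values only; not taken from the corpus.
    example = OutbreakEvent(
        disease="example disease",
        country="Example Country",
        attributes={"international_travel": "yes"},
    )
    print(example.is_minimal())
    print(REPORTED_MEAN)
```

Keeping the optional concepts in a free-form mapping is one way to reflect that the scheme requires only disease and location while allowing richer encoding; a real implementation would follow the published annotation guidelines instead.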

Cleaned Abstract

ckground recent year growth work use inform extract technolog track diseas outbreak onlin news text yet public avail evalu standard associ resourc new area research notic lackingobject studi seek creat gold standard data set test accur diseas outbreak inform extract system can identifi semant diseas outbreak event addit hope provis annot scheme associ corpus communiti will encourag open evalu new grow applic areamethod develop annot scheme identifi infecti diseas outbreak event news text eventin context annot schemeconsist minim geograph eg countri provinc diseas name inform howev scheme also allow rich encod domain salient concept eg intern travel speci food contaminationresult work result document corpus eventannot diseas outbreak report can use evalu accuraci event detect algorithm case biocast biosurveil onlin news inform extract system document distinct event identifi mean event per document rang event per document also provid download script graphic user interfac guibas event brows softwar facilit corpus explorationconclus summari present annot scheme corpus can use evalu diseas outbreak event extract algorithm annot scheme corpus design particular evalu requir biocast system mind well wider need evalu resourc grow research area

Similar Abstracts

J Biomed Inform - NCBI disease corpus: a resource for disease name recognition and concept normalization. ( 0,796706245455348 )
J Biomed Inform - Anaphoric reference in clinical reports: characteristics of an annotated corpus. ( 0,748042538014637 )
J Biomed Inform - Text summarization in the biomedical domain: a systematic review of recent research. ( 0,744345933612274 )
AMIA Annu Symp Proc - Automatically pairing measured findings across narrative abdomen CT reports. ( 0,73762660401084 )
Comput Math Methods Med - Ranking biomedical annotations with annotator's semantic relevancy. ( 0,735730065318376 )
J Am Med Inform Assoc - Combining rules and machine learning for extraction of temporal expressions and events from clinical narratives. ( 0,73526554953137 )
J Am Med Inform Assoc - Assessing the role of a medication-indication resource in the treatment relation extraction from clinical text. ( 0,734626522138563 )
J Biomed Inform - UMLS content views appropriate for NLP processing of the biomedical literature vs. clinical text. ( 0,727347415319932 )
J Biomed Inform - Semi-automatic semantic annotation of PubMed queries: a study on quality, efficiency, satisfaction. ( 0,724096508922465 )
AMIA Annu Symp Proc - Inter-annotator reliability of medical events, coreferences and temporal relations in clinical narratives by annotators with varying levels of clinical expertise. ( 0,72404531672512 )
Artif Intell Med - Biomedical events extraction using the hidden vector state model. ( 0,723652983775163 )
J Am Med Inform Assoc - Vaccine adverse event text mining system for extracting features from vaccine safety reports. ( 0,718331643644341 )
J. Med. Internet Res. - Web 2.0-based crowdsourcing for high-quality gold standard development in clinical natural language processing. ( 0,717696484605632 )
J Biomed Inform - MedTime: a temporal information extraction system for clinical narratives. ( 0,716892500329875 )
J Am Med Inform Assoc - Knowledge-based biomedical word sense disambiguation: an evaluation and application to clinical document classification. ( 0,715971596280519 )
J Biomed Inform - Development and evaluation of RapTAT: a machine learning system for concept mapping of phrases from medical narratives. ( 0,714351519668434 )
J Am Med Inform Assoc - Using rule-based natural language processing to improve disease normalization in biomedical text. ( 0,714287470148144 )
J Biomed Inform - A concept-driven biomedical knowledge extraction and visualization framework for conceptualization of text corpora. ( 0,71324112369928 )
Int J Med Inform - A methodology to enhance spatial understanding of disease outbreak events reported in news articles. ( 0,712586840545253 )
J Am Med Inform Assoc - An end-to-end system to identify temporal relation in discharge summaries: 2012 i2b2 challenge. ( 0,712128485223607 )
AMIA Annu Symp Proc - Semantic processing to identify adverse drug event information from black box warnings. ( 0,710925914465167 )
Int J Med Inform - Detecting temporal expressions in medical narratives. ( 0,710444470327465 )
J Am Med Inform Assoc - A sense inventory for clinical abbreviations and acronyms created using clinical notes and medical dictionary resources. ( 0,709938204506174 )
Appl Clin Inform - Representation of information about family relatives as structured data in electronic health records. ( 0,709619860149874 )
Sci Data - Building the graph of medicine from millions of clinical narratives. ( 0,707982333071519 )
J Am Med Inform Assoc - Eventual situations for timeline extraction from clinical reports. ( 0,707187240787003 )
J Am Med Inform Assoc - Evaluating temporal relations in clinical text: 2012 i2b2 Challenge. ( 0,706624759436161 )
AMIA Annu Symp Proc - Building gold standard corpora for medical natural language processing tasks. ( 0,703253549252264 )
J Biomed Inform - Comparison of automated and human assignment of MeSH terms on publicly-available molecular datasets. ( 0,702205400065721 )
J Am Med Inform Assoc - Automatic discourse connective detection in biomedical text. ( 0,70082950438647 )
Brief. Bioinformatics - A survey on annotation tools for the biomedical literature. ( 0,700494855318264 )
Health Informatics J - University of California, Irvine-Pathology Extraction Pipeline: the pathology extraction pipeline for information extraction from pathology reports. ( 0,698559847138138 )
AMIA Annu Symp Proc - Natural language processing for lines and devices in portable chest x-rays. ( 0,69852604386403 )
J Med Syst - Redactable signatures for signed CDA Documents. ( 0,697831334275937 )
AMIA Annu Symp Proc - The Lexicon Builder Web service: Building Custom Lexicons from two hundred Biomedical Ontologies. ( 0,697222619158982 )
AMIA Annu Symp Proc - Pharmacovigilance on twitter? Mining tweets for adverse drug reactions. ( 0,696904791212871 )
J Biomed Inform - Towards generating a patient's timeline: extracting temporal relationships from clinical notes. ( 0,694885663020528 )
J Am Med Inform Assoc - A la Recherche du Temps Perdu: extracting temporal relations from medical text in the 2012 i2b2 NLP challenge. ( 0,694266029823015 )
J Biomed Inform - Extraction of events and temporal expressions from clinical narratives. ( 0,693634204989885 )
J Am Med Inform Assoc - A hybrid system for temporal information extraction from clinical text. ( 0,693114195218476 )
J Biomed Inform - Automatic recognition of disorders, findings, pharmaceuticals and body structures from clinical text: an annotation and machine learning study. ( 0,693000761941568 )
AMIA Annu Symp Proc - Natural language processing to extract follow-up provider information from hospital discharge summaries. ( 0,691690461671451 )
J Am Med Inform Assoc - Evaluating the impact of pre-annotation on annotation speed and potential bias: natural language processing gold standard development for clinical named entity recognition in clinical trial announcements. ( 0,691629653990111 )
J Biomed Inform - Approaches to verb subcategorization for biomedicine. ( 0,691156577571309 )
BMC Med Inform Decis Mak - Text summarization as a decision support aid. ( 0,689832025492839 )
J Am Med Inform Assoc - Extracting drug indication information from structured product labels using natural language processing. ( 0,688723188526758 )
J Biomed Inform - Ontology modularization to improve semantic medical image annotation. ( 0,687490330397993 )
AMIA Annu Symp Proc - Throw the bath water out, keep the baby: keeping medically-relevant terms for text mining. ( 0,684621304641479 )
J Am Med Inform Assoc - Temporal reasoning over clinical text: the state of the art. ( 0,684226322757769 )
J Am Med Inform Assoc - MedXN: an open source medication extraction and normalization tool for clinical text. ( 0,684016682317647 )
Comput Methods Programs Biomed - BioAnnote: a software platform for annotating biomedical documents with application in medical learning environments. ( 0,682773672754758 )
J Am Med Inform Assoc - Anaphoric relations in the clinical narrative: corpus creation. ( 0,682505467037968 )
J Am Med Inform Assoc - A study of machine-learning-based approaches to extract clinical entities and their assertions from discharge summaries. ( 0,68210530715179 )
J Am Med Inform Assoc - MITRE system for clinical assertion status classification. ( 0,682061562444512 )
Int J Med Inform - Detection of infectious symptoms from VA emergency department and primary care clinical documentation. ( 0,680643493815212 )
J Biomed Inform - A new clustering method for detecting rare senses of abbreviations in clinical notes. ( 0,679085326805984 )
J Biomed Inform - Lessons learnt from the DDIExtraction-2013 Shared Task. ( 0,676395120725828 )
AMIA Annu Symp Proc - A comparative study of current Clinical Natural Language Processing systems on handling abbreviations in discharge summaries. ( 0,674167435756999 )
J Am Med Inform Assoc - A flexible framework for recognizing events, temporal expressions, and temporal relations in clinical text. ( 0,673960292750881 )
J Biomed Inform - Enhancing clinical concept extraction with distributional semantics. ( 0,672550536724281 )
J Am Med Inform Assoc - Automatic abstraction of imaging observations with their characteristics from mammography reports. ( 0,672332413425144 )
AMIA Annu Symp Proc - Extracting patient demographics and personal medical information from online health forums. ( 0,672099559041861 )
J Am Med Inform Assoc - The effect of word familiarity on actual and perceived text difficulty. ( 0,670326172607705 )
J Biomed Inform - Annotating temporal information in clinical narratives. ( 0,670049969958434 )
J Am Med Inform Assoc - 2010 i2b2/VA challenge on concepts, assertions, and relations in clinical text. ( 0,667710159193906 )
AMIA Annu Symp Proc - Mapping annotations with textual evidence using an scLDA model. ( 0,666395422697274 )
J Biomed Inform - Automatically extracting information needs from complex clinical questions. ( 0,666264487582981 )
AMIA Annu Symp Proc - De-identification of Address, Date, and Alphanumeric Identifiers in Narrative Clinical Reports. ( 0,665998680369161 )
AMIA Annu Symp Proc - It's about this and that: a description of anaphoric expressions in clinical text. ( 0,665935285543386 )
J Biomed Inform - Degree centrality for semantic abstraction summarization of therapeutic studies. ( 0,665229180678321 )
AMIA Annu Symp Proc - Syndromic surveillance in an ICD-10 world. ( 0,66417745389168 )
J Biomed Inform - Secondary use of electronic health records for building cohort studies through top-down information extraction. ( 0,661245884109243 )
J Biomed Inform - Evaluating the effects of machine pre-annotation and an interactive annotation interface on manual de-identification of clinical text. ( 0,659234382382186 )
J Am Med Inform Assoc - Functional evaluation of out-of-the-box text-mining tools for data-mining tasks. ( 0,659115853158635 )
J Biomed Inform - Development of a benchmark corpus to support the automatic extraction of drug-related adverse effects from medical case reports. ( 0,658403939948122 )
AMIA Annu Symp Proc - Detecting abbreviations in discharge summaries using machine learning methods. ( 0,658043250478844 )
J Am Med Inform Assoc - Automated concept-level information extraction to reduce the need for custom software and rules development. ( 0,657588178195731 )
Perspect Health Inf Manag - A comparison of two approaches to text processing: facilitating chart reviews of radiology reports in electronic medical records. ( 0,656512157635077 )
AMIA Annu Symp Proc - Towards a semantic lexicon for clinical natural language processing. ( 0,655396381261938 )
J Am Med Inform Assoc - Comparison of a semi-automatic annotation tool and a natural language processing application for the generation of clinical statement entries. ( 0,654345134428983 )
J Biomed Inform - Common data model for natural language processing based on two existing standard information models: CDA+GrAF. ( 0,653160888711261 )
J Biomed Inform - A natural language processing pipeline for pairing measurements uniquely across free-text CT reports. ( 0,651385844561463 )
J Biomed Inform - Desiderata for ontologies to be used in semantic annotation of biomedical documents. ( 0,651201651428637 )
AMIA Annu Symp Proc - Extracting Concepts Related to Homelessness from the Free Text of VA Electronic Medical Records. ( 0,650957027816457 )
J Am Med Inform Assoc - Machine-learned solutions for three stages of clinical information extraction: the state of the art at i2b2 2010. ( 0,650734533944631 )
Comput Methods Programs Biomed - Marky: a tool supporting annotation consistency in multi-user and iterative document annotation projects. ( 0,649530563997113 )
AMIA Annu Symp Proc - Developing a section labeler for clinical documents. ( 0,649022167573653 )
J Biomed Inform - Lexical patterns, features and knowledge resources for coreference resolution in clinical notes. ( 0,64756946394831 )
AMIA Annu Symp Proc - A Knowledge Intensive Approach to Mapping Clinical Narrative to LOINC. ( 0,646433431393157 )
J Biomed Inform - The DDI corpus: an annotated corpus with pharmacological substances and drug-drug interactions. ( 0,646259819610442 )
J Am Med Inform Assoc - Assisted annotation of medical free text using RapTAT. ( 0,645397125054307 )
AMIA Annu Symp Proc - Voice-dictated versus typed-in clinician notes: linguistic properties and the potential implications on natural language processing. ( 0,643534275754295 )
J Biomed Inform - Text de-identification for privacy protection: a study of its impact on clinical text information content. ( 0,642296440851531 )
Int J Med Inform - Bootstrapping a de-identification system for narrative patient records: cost-performance tradeoffs. ( 0,642201893512055 )
J Am Med Inform Assoc - Towards comprehensive syntactic and semantic annotations of the clinical narrative. ( 0,641962545046151 )
J Biomed Inform - Identifying non-elliptical entity mentions in a coordinated NP with ellipses. ( 0,641633326421696 )
J Am Med Inform Assoc - Syntactic parsing of clinical text: guideline and corpus development with handling ill-formed sentences. ( 0,637827586244647 )
AMIA Annu Symp Proc - TagLine: Information Extraction for Semi-Structured Text in Medical Progress Notes. ( 0,63756066234169 )
AMIA Annu Symp Proc - Discovering peripheral arterial disease cases from radiology notes using natural language processing. ( 0,637520850250762 )
J Integr Bioinform - Automatic extraction of microorganisms and their habitats from free text using text mining workflows. ( 0,636509225650515 )