J Biomed Inform - Unsupervised biomedical named entity recognition: experiments with clinical and biological texts.

Topics

{ extract(1171) text(1153) clinic(932) }
{ concept(1167) ontolog(924) domain(897) }
{ method(1219) similar(1157) match(930) }
{ research(1085) discuss(1038) issu(1018) }
{ use(976) code(926) identifi(902) }
{ general(901) number(790) one(736) }
{ compound(1573) activ(1297) structur(1058) }
{ implement(1333) system(1263) develop(1122) }
{ featur(3375) classif(2383) classifi(1994) }
{ learn(2355) train(1041) set(1003) }
{ process(1125) use(805) approach(778) }
{ system(1976) rule(880) can(841) }
{ imag(2830) propos(1344) filter(1198) }
{ data(3963) clinic(1234) research(1004) }
{ signal(2180) analysi(812) frequenc(800) }
{ model(3404) distribut(989) bayesian(671) }
{ imag(1947) propos(1133) code(1026) }
{ imag(1057) registr(996) error(939) }
{ imag(2675) segment(2577) method(1081) }
{ framework(1458) process(801) describ(734) }
{ chang(1828) time(1643) increas(1301) }
{ algorithm(1844) comput(1787) effici(935) }
{ method(1557) propos(1049) approach(1037) }
{ data(1714) softwar(1251) tool(1186) }
{ control(1307) perform(991) simul(935) }
{ model(2220) cell(1177) simul(1124) }
{ import(1318) role(1303) understand(862) }
{ model(2341) predict(2261) use(1141) }
{ visual(1396) interact(850) tool(830) }
{ blood(1257) pressur(1144) flow(957) }
{ ehr(2073) health(1662) electron(1139) }
{ cost(1906) reduc(1198) effect(832) }
{ gene(2352) biolog(1181) express(1162) }
{ can(981) present(881) function(850) }
{ analysi(2126) use(1163) compon(1037) }
{ health(1844) social(1437) communiti(874) }
{ drug(1928) target(777) effect(648) }
{ method(2212) result(1239) propos(1039) }
{ can(774) often(719) complex(702) }
{ data(1737) use(1416) pattern(1282) }
{ inform(2794) health(2639) internet(1427) }
{ measur(2081) correl(1212) valu(896) }
{ bind(1733) structur(1185) ligand(1036) }
{ sequenc(1873) structur(1644) protein(1328) }
{ network(2748) neural(1063) input(814) }
{ patient(2315) diseas(1263) diabet(1191) }
{ take(945) account(800) differ(722) }
{ studi(2440) review(1878) systemat(933) }
{ motion(1329) object(1292) video(1091) }
{ assess(1506) score(1403) qualiti(1306) }
{ treatment(1704) effect(941) patient(846) }
{ surgeri(1148) surgic(1085) robot(1054) }
{ problem(2511) optim(1539) algorithm(950) }
{ error(1145) method(1030) estim(1020) }
{ clinic(1479) use(1117) guidelin(835) }
{ design(1359) user(1324) use(1319) }
{ care(1570) inform(1187) nurs(1089) }
{ method(984) reconstruct(947) comput(926) }
{ search(2224) databas(1162) retriev(909) }
{ featur(1941) imag(1645) propos(1176) }
{ case(1353) use(1143) diagnosi(1136) }
{ howev(809) still(633) remain(590) }
{ studi(1410) differ(1259) use(1210) }
{ risk(3053) factor(974) diseas(938) }
{ perform(999) metric(946) measur(919) }
{ system(1050) medic(1026) inform(1018) }
{ perform(1367) use(1326) method(1137) }
{ studi(1119) effect(1106) posit(819) }
{ spatial(1525) area(1432) region(1030) }
{ record(1888) medic(1808) patient(1693) }
{ health(3367) inform(1360) care(1135) }
{ model(3480) simul(1196) paramet(876) }
{ monitor(1329) mobil(1314) devic(1160) }
{ state(1844) use(1261) util(961) }
{ research(1218) medic(880) student(794) }
{ patient(2837) hospit(1953) medic(668) }
{ model(2656) set(1616) predict(1553) }
{ data(2317) use(1299) case(1017) }
{ age(1611) year(1155) adult(843) }
{ medic(1828) order(1363) alert(1069) }
{ group(2977) signific(1463) compar(1072) }
{ sampl(1606) size(1419) use(1276) }
{ data(3008) multipl(1320) sourc(1022) }
{ first(2504) two(1366) second(1323) }
{ intervent(3218) particip(2042) group(1664) }
{ activ(1138) subject(705) human(624) }
{ time(1939) patient(1703) rate(768) }
{ patient(1821) servic(1111) care(1106) }
{ use(2086) technolog(871) perceiv(783) }
{ structur(1116) can(940) graph(676) }
{ high(1669) rate(1365) level(1280) }
{ cancer(2502) breast(956) screen(824) }
{ use(1733) differ(960) four(931) }
{ result(1111) use(1088) new(759) }
{ survey(1388) particip(1329) question(1065) }
{ estim(2440) model(1874) function(577) }
{ decis(3086) make(1611) patient(1517) }
{ activ(1452) weight(1219) physic(1104) }
{ method(1969) cluster(1462) data(1082) }
{ detect(2391) sensit(1101) algorithm(908) }

Abstract

Named entity recognition is a crucial component of biomedical natural language processing, enabling information extraction and ultimately reasoning over and knowledge discovery from text. Much progress has been made in the design of rule-based and supervised tools, but they are often genre and task dependent. As such, adapting them to different genres of text or identifying new types of entities requires major effort in re-annotation or rule development. In this paper, we propose an unsupervised approach to extracting named entities from biomedical text. We describe a stepwise solution to tackle the challenges of entity boundary detection and entity type classification without relying on any handcrafted rules, heuristics, or annotated data. A noun phrase chunker followed by a filter based on inverse document frequency extracts candidate entities from free text. Classification of candidate entities into categories of interest is carried out by leveraging principles from distributional semantics. Experiments show that our system, especially the entity classification step, yields competitive results on two popular biomedical datasets of clinical notes and biological literature, and outperforms a baseline dictionary match approach. Detailed error analysis provides a road map for future work.
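
The two steps the abstract names, an inverse-document-frequency filter over candidate phrases and a distributional classification of the survivors, can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give the IDF aggregation rule, the threshold, the context window, or how categories are represented. Here the mean token IDF, a fixed threshold, a two-token window, and a few hypothetical seed terms per category are all assumptions made for illustration, and whitespace tokenization stands in for the noun phrase chunker.

```python
import math
from collections import Counter


def idf_scores(documents):
    """Map each token to log(N / df), df = number of documents containing it."""
    n_docs = len(documents)
    doc_freq = Counter()
    for doc in documents:
        doc_freq.update(set(doc.lower().split()))
    return {tok: math.log(n_docs / df) for tok, df in doc_freq.items()}


def filter_candidates(candidates, idf, threshold):
    """Keep phrases whose mean token IDF reaches the threshold (assumed rule)."""
    kept = []
    for phrase in candidates:
        tokens = phrase.lower().split()
        if not tokens:
            continue
        # Tokens never seen in the corpus get IDF 0.0 here, an arbitrary choice.
        mean_idf = sum(idf.get(t, 0.0) for t in tokens) / len(tokens)
        if mean_idf >= threshold:
            kept.append(phrase)
    return kept


def context_vector(phrase, documents, window=2):
    """Count tokens within `window` positions of each occurrence of the phrase."""
    target = phrase.lower().split()
    counts = Counter()
    for doc in documents:
        tokens = doc.lower().split()
        for i in range(len(tokens) - len(target) + 1):
            if tokens[i:i + len(target)] == target:
                left = tokens[max(0, i - window):i]
                right = tokens[i + len(target):i + len(target) + window]
                counts.update(left + right)
    return counts


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def classify(phrase, seeds, documents):
    """Return the category whose pooled seed contexts best match the phrase.

    `seeds` maps a category name to hypothetical seed terms (an assumption).
    """
    vec = context_vector(phrase, documents)
    best, best_sim = None, -1.0
    for category, terms in seeds.items():
        signature = Counter()
        for term in terms:
            signature.update(context_vector(term, documents))
        sim = cosine(vec, signature)
        if sim > best_sim:
            best, best_sim = category, sim
    return best
```

On a toy corpus, a phrase made only of words that occur in every document (such as "the patient") scores a mean IDF of zero and is filtered out, while a rarer term is kept and then assigned to the category whose seed terms appear in the most similar surrounding words.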

Cleaned Abstract

name entiti recognit crucial compon biomed natur languag process enabl inform extract ultim reason knowledg discoveri text much progress made design rulebas supervis tool often genr task depend adapt differ genr text identifi new type entiti requir major effort reannot rule develop paper propos unsupervis approach extract name entiti biomed text describ stepwis solut tackl challeng entiti boundari detect entiti type classif without reli handcraft rule heurist annot data noun phrase chunker follow filter base invers document frequenc extract candid entiti free text classif candid entiti categori interest carri leverag principl distribut semant experi show system especi entiti classif step yield competit result two popular biomed dataset clinic note biolog literatur outperform baselin dictionari match approach detail error analysi provid road map futur work

Similar Abstracts

J Am Med Inform Assoc - Knowledge-based biomedical word sense disambiguation: an evaluation and application to clinical document classification. ( 0,867439295382873 )
J Biomed Inform - UMLS content views appropriate for NLP processing of the biomedical literature vs. clinical text. ( 0,846991948877413 )
J Biomed Inform - Desiderata for ontologies to be used in semantic annotation of biomedical documents. ( 0,84300114113891 )
AMIA Annu Symp Proc - Using Medical Text Extraction, Reasoning and Mapping System (MTERMS) to process medication information in outpatient clinical notes. ( 0,841671512036939 )
AMIA Annu Symp Proc - A cloud-based approach to medical NLP. ( 0,82861748857488 )
Artif Intell Med - Terminological resources for text mining over biomedical scientific literature. ( 0,828370220291136 )
AMIA Annu Symp Proc - Automatic acquisition of sublanguage semantic schema: towards the word sense disambiguation of clinical narratives. ( 0,828241697835249 )
J Am Med Inform Assoc - Towards comprehensive syntactic and semantic annotations of the clinical narrative. ( 0,814007938974914 )
AMIA Annu Symp Proc - Extracting semantic lexicons from discharge summaries using machine learning and the C-Value method. ( 0,813768369530705 )
Appl Clin Inform - Representation of information about family relatives as structured data in electronic health records. ( 0,81283714898412 )
J Biomed Inform - Evaluating measures of semantic similarity and relatedness to disambiguate terms in biomedical text. ( 0,806556817031426 )
AMIA Annu Symp Proc - An evaluation of the UMLS in representing corpus derived clinical concepts. ( 0,796769946638696 )
Artif Intell Med - A semantic graph-based approach to biomedical summarisation. ( 0,792113756969614 )
J Am Med Inform Assoc - Anaphoric relations in the clinical narrative: corpus creation. ( 0,791199991151053 )
J Biomed Inform - Ontology modularization to improve semantic medical image annotation. ( 0,789016111768168 )
AMIA Annu Symp Proc - Inferring the semantic relationships of words within an ontology using random indexing: applications to pharmacogenomics. ( 0,780313399247933 )
AMIA Annu Symp Proc - Throw the bath water out, keep the baby: keeping medically-relevant terms for text mining. ( 0,779971475197628 )
J Biomed Inform - Semi-automatic semantic annotation of PubMed queries: a study on quality, efficiency, satisfaction. ( 0,777653293387287 )
AMIA Annu Symp Proc - Semantic characteristics of NLP-extracted concepts in clinical notes vs. biomedical literature. ( 0,773773983251925 )
J Biomed Inform - Anaphoric reference in clinical reports: characteristics of an annotated corpus. ( 0,770120381198714 )
AMIA Annu Symp Proc - Towards a semantic lexicon for clinical natural language processing. ( 0,76527673145133 )
AMIA Annu Symp Proc - Inter-annotator reliability of medical events, coreferences and temporal relations in clinical narratives by annotators with varying levels of clinical expertise. ( 0,764885246088318 )
J Biomed Inform - Common data model for natural language processing based on two existing standard information models: CDA+GrAF. ( 0,761853294560401 )
AMIA Annu Symp Proc - Extracting Concepts Related to Homelessness from the Free Text of VA Electronic Medical Records. ( 0,755879226256905 )
J Biomed Inform - Semantator: semantic annotator for converting biomedical text to linked data. ( 0,750458310021205 )
Brief. Bioinformatics - A survey on annotation tools for the biomedical literature. ( 0,747010240866739 )
AMIA Annu Symp Proc - Using ontology network structure in text mining. ( 0,746408821123782 )
J Am Med Inform Assoc - Eventual situations for timeline extraction from clinical reports. ( 0,743678603670877 )
AMIA Annu Symp Proc - Natural language processing to extract follow-up provider information from hospital discharge summaries. ( 0,74301827398496 )
J Am Med Inform Assoc - 2010 i2b2/VA challenge on concepts, assertions, and relations in clinical text. ( 0,73928965785058 )
J Am Med Inform Assoc - Assessing the role of a medication-indication resource in the treatment relation extraction from clinical text. ( 0,738514646897958 )
J Am Med Inform Assoc - An end-to-end system to identify temporal relation in discharge summaries: 2012 i2b2 challenge. ( 0,738346519002025 )
J Biomed Inform - Extraction of events and temporal expressions from clinical narratives. ( 0,736736736736737 )
J Am Med Inform Assoc - Combining rules and machine learning for extraction of temporal expressions and events from clinical narratives. ( 0,735663653208058 )
J Biomed Inform - A concept-driven biomedical knowledge extraction and visualization framework for conceptualization of text corpora. ( 0,734658284832183 )
J Am Med Inform Assoc - MITRE system for clinical assertion status classification. ( 0,732622815466222 )
J Am Med Inform Assoc - A sense inventory for clinical abbreviations and acronyms created using clinical notes and medical dictionary resources. ( 0,731364536083842 )
J Am Med Inform Assoc - Automatic discourse connective detection in biomedical text. ( 0,729664535461961 )
J Biomed Inform - MedTime: a temporal information extraction system for clinical narratives. ( 0,727749618445518 )
AMIA Annu Symp Proc - Mining Biomedical Literature for Terms related to Epidemiologic Exposures. ( 0,727611567975375 )
J Integr Bioinform - Automatic extraction of microorganisms and their habitats from free text using text mining workflows. ( 0,726577920999459 )
J Med Syst - Redactable signatures for signed CDA Documents. ( 0,72588196966968 )
J Am Med Inform Assoc - A hybrid system for temporal information extraction from clinical text. ( 0,725807908051237 )
AMIA Annu Symp Proc - Building gold standard corpora for medical natural language processing tasks. ( 0,724901019627125 )
Artif Intell Med - Approaching the axiomatic enrichment of the Gene Ontology from a lexical perspective. ( 0,724806654582233 )
J Biomed Inform - NCBI disease corpus: a resource for disease name recognition and concept normalization. ( 0,724647039576451 )
J Biomed Inform - Coreference resolution: a review of general methodologies and applications in the clinical domain. ( 0,7237490598201 )
J Biomed Inform - Enhancing clinical concept extraction with distributional semantics. ( 0,723136014125763 )
J Am Med Inform Assoc - Evaluating temporal relations in clinical text: 2012 i2b2 Challenge. ( 0,722653486149952 )
Artif Intell Med - Biomedical events extraction using the hidden vector state model. ( 0,721757048880646 )
J Biomed Inform - Towards generating a patient's timeline: extracting temporal relationships from clinical notes. ( 0,719448273002045 )
J Biomed Inform - Semantic similarity estimation in the biomedical domain: an ontology-based information-theoretic perspective. ( 0,718593058571376 )
AMIA Annu Symp Proc - Semantic processing to identify adverse drug event information from black box warnings. ( 0,715186144266428 )
J Biomed Inform - Lexical patterns, features and knowledge resources for coreference resolution in clinical notes. ( 0,714143742506833 )
J Biomed Inform - An analysis of FMA using structural self-bisimilarity. ( 0,71390102697877 )
J Am Med Inform Assoc - Using rule-based natural language processing to improve disease normalization in biomedical text. ( 0,713509914426763 )
J Am Med Inform Assoc - Automated clinical trial eligibility prescreening: increasing the efficiency of patient identification for clinical trials in the emergency department. ( 0,71196962327743 )
AMIA Annu Symp Proc - Critical finding capture in the impression section of radiology reports. ( 0,711694089466056 )
J Am Med Inform Assoc - MedXN: an open source medication extraction and normalization tool for clinical text. ( 0,711666692007087 )
J Biomed Inform - Development and evaluation of RapTAT: a machine learning system for concept mapping of phrases from medical narratives. ( 0,707469355609662 )
J Biomed Inform - Cross-domain targeted ontology subsets for annotation: the case of SNOMED CORE and RxNorm. ( 0,706021207627942 )
AMIA Annu Symp Proc - A literature-based assessment of concept pairs as a measure of semantic relatedness. ( 0,704426821877147 )
BMC Med Inform Decis Mak - Mining biomarker information in biomedical literature. ( 0,703040499350646 )
Int J Med Inform - Detecting temporal expressions in medical narratives. ( 0,701689556916086 )
J Biomed Inform - Deriving a probabilistic syntacto-semantic grammar for biomedicine based on domain-specific terminologies. ( 0,701689507077516 )
AMIA Annu Symp Proc - Developing a section labeler for clinical documents. ( 0,701157450249387 )
J Biomed Inform - Ontology-guided feature engineering for clinical text classification. ( 0,700669433152644 )
J Am Med Inform Assoc - Extracting drug indication information from structured product labels using natural language processing. ( 0,700350626590092 )
J Biomed Inform - Lessons learnt from the DDIExtraction-2013 Shared Task. ( 0,699182848595063 )
J Biomed Inform - The DDI corpus: an annotated corpus with pharmacological substances and drug-drug interactions. ( 0,698798350631149 )
J Am Med Inform Assoc - Validating a strategy for psychosocial phenotyping using a large corpus of clinical text. ( 0,698769729125263 )
J Am Med Inform Assoc - Comparison of a semi-automatic annotation tool and a natural language processing application for the generation of clinical statement entries. ( 0,698668984509475 )
J Am Med Inform Assoc - Machine learning-based coreference resolution of concepts in clinical documents. ( 0,69866497560186 )
J Biomed Inform - Approaches to verb subcategorization for biomedicine. ( 0,696697212521301 )
AMIA Annu Symp Proc - TagLine: Information Extraction for Semi-Structured Text in Medical Progress Notes. ( 0,696510563679503 )
J Am Med Inform Assoc - Temporal reasoning over clinical text: the state of the art. ( 0,696062735522932 )
J Am Med Inform Assoc - A flexible framework for recognizing events, temporal expressions, and temporal relations in clinical text. ( 0,696026965492245 )
J. Med. Internet Res. - Web 2.0-based crowdsourcing for high-quality gold standard development in clinical natural language processing. ( 0,696007793437021 )
J Biomed Inform - A graph-based recovery and decomposition of Swanson's hypothesis using semantic predications. ( 0,695935993862018 )
J Am Med Inform Assoc - Feature engineering combined with machine learning and rule-based methods for structured information extraction from narrative clinical discharge summaries. ( 0,693316577067586 )
J Biomed Inform - Mapping Partners Master Drug Dictionary to RxNorm using an NLP-based approach. ( 0,692919605963361 )
J Biomed Inform - Evaluating the effects of machine pre-annotation and an interactive annotation interface on manual de-identification of clinical text. ( 0,692254181249305 )
AMIA Annu Symp Proc - Automatically pairing measured findings across narrative abdomen CT reports. ( 0,691265296918935 )
J Biomed Inform - Using text to build semantic networks for pharmacogenomics. ( 0,690867925542591 )
AMIA Annu Symp Proc - EpiDEA: extracting structured epilepsy and seizure information from patient discharge summaries for cohort identification. ( 0,689574299909769 )
AMIA Annu Symp Proc - Mapping annotations with textual evidence using an scLDA model. ( 0,688763297552401 )
J Am Med Inform Assoc - Evaluating the impact of pre-annotation on annotation speed and potential bias: natural language processing gold standard development for clinical named entity recognition in clinical trial announcements. ( 0,688402097277027 )
AMIA Annu Symp Proc - A Knowledge Intensive Approach to Mapping Clinical Narrative to LOINC. ( 0,687662277282612 )
J Biomed Inform - Text summarization in the biomedical domain: a systematic review of recent research. ( 0,685916138856465 )
AMIA Annu Symp Proc - Natural language processing for lines and devices in portable chest x-rays. ( 0,685538427740475 )
J Am Med Inform Assoc - Text mining for the Vaccine Adverse Event Reporting System: medical text classification using informative feature selection. ( 0,685230779205821 )
Int J Med Inform - Bootstrapping a de-identification system for narrative patient records: cost-performance tradeoffs. ( 0,684842967689337 )
Int J Med Inform - A methodology to enhance spatial understanding of disease outbreak events reported in news articles. ( 0,681533431871963 )
AMIA Annu Symp Proc - A high throughput semantic concept frequency based approach for patient identification: a case study using type 2 diabetes mellitus clinical notes. ( 0,679843862864079 )
J Biomed Inform - Comparing different knowledge sources for the automatic summarization of biomedical literature. ( 0,679755836412376 )
Comput. Biol. Med. - A P300-based brain computer interface system for words typing. ( 0,679692107653287 )
J Biomed Inform - Secondary use of electronic health records for building cohort studies through top-down information extraction. ( 0,677436333214664 )
Sci Data - Building the graph of medicine from millions of clinical narratives. ( 0,675272684387537 )
AMIA Annu Symp Proc - Extracting patient demographics and personal medical information from online health forums. ( 0,672774425095853 )
Comput Math Methods Med - Ranking biomedical annotations with annotator's semantic relevancy. ( 0,672237530505029 )