J Biomed Inform - Portable automatic text classification for adverse drug reaction detection via multi-corpus training.

Topics

{ learn(2355) train(1041) set(1003) }
{ extract(1171) text(1153) clinic(932) }
{ health(1844) social(1437) communiti(874) }
{ data(2317) use(1299) case(1017) }
{ imag(2675) segment(2577) method(1081) }
{ studi(1410) differ(1259) use(1210) }
{ data(3008) multipl(1320) sourc(1022) }
{ perform(999) metric(946) measur(919) }
{ perform(1367) use(1326) method(1137) }
{ featur(3375) classif(2383) classifi(1994) }
{ concept(1167) ontolog(924) domain(897) }
{ method(984) reconstruct(947) comput(926) }
{ state(1844) use(1261) util(961) }
{ can(981) present(881) function(850) }
{ high(1669) rate(1365) level(1280) }
{ inform(2794) health(2639) internet(1427) }
{ method(1219) similar(1157) match(930) }
{ research(1085) discuss(1038) issu(1018) }
{ drug(1928) target(777) effect(648) }
{ import(1318) role(1303) understand(862) }
{ record(1888) medic(1808) patient(1693) }
{ cost(1906) reduc(1198) effect(832) }
{ sampl(1606) size(1419) use(1276) }
{ time(1939) patient(1703) rate(768) }
{ use(1733) differ(960) four(931) }
{ can(774) often(719) complex(702) }
{ measur(2081) correl(1212) valu(896) }
{ imag(2830) propos(1344) filter(1198) }
{ surgeri(1148) surgic(1085) robot(1054) }
{ framework(1458) process(801) describ(734) }
{ method(1557) propos(1049) approach(1037) }
{ model(2220) cell(1177) simul(1124) }
{ system(1050) medic(1026) inform(1018) }
{ model(2341) predict(2261) use(1141) }
{ spatial(1525) area(1432) region(1030) }
{ research(1218) medic(880) student(794) }
{ medic(1828) order(1363) alert(1069) }
{ group(2977) signific(1463) compar(1072) }
{ patient(1821) servic(1111) care(1106) }
{ implement(1333) system(1263) develop(1122) }
{ method(2212) result(1239) propos(1039) }
{ model(3404) distribut(989) bayesian(671) }
{ imag(1947) propos(1133) code(1026) }
{ data(1737) use(1416) pattern(1282) }
{ system(1976) rule(880) can(841) }
{ imag(1057) registr(996) error(939) }
{ bind(1733) structur(1185) ligand(1036) }
{ sequenc(1873) structur(1644) protein(1328) }
{ network(2748) neural(1063) input(814) }
{ patient(2315) diseas(1263) diabet(1191) }
{ take(945) account(800) differ(722) }
{ studi(2440) review(1878) systemat(933) }
{ motion(1329) object(1292) video(1091) }
{ assess(1506) score(1403) qualiti(1306) }
{ treatment(1704) effect(941) patient(846) }
{ problem(2511) optim(1539) algorithm(950) }
{ error(1145) method(1030) estim(1020) }
{ chang(1828) time(1643) increas(1301) }
{ clinic(1479) use(1117) guidelin(835) }
{ algorithm(1844) comput(1787) effici(935) }
{ data(1714) softwar(1251) tool(1186) }
{ design(1359) user(1324) use(1319) }
{ control(1307) perform(991) simul(935) }
{ care(1570) inform(1187) nurs(1089) }
{ general(901) number(790) one(736) }
{ search(2224) databas(1162) retriev(909) }
{ featur(1941) imag(1645) propos(1176) }
{ case(1353) use(1143) diagnosi(1136) }
{ howev(809) still(633) remain(590) }
{ data(3963) clinic(1234) research(1004) }
{ risk(3053) factor(974) diseas(938) }
{ visual(1396) interact(850) tool(830) }
{ compound(1573) activ(1297) structur(1058) }
{ studi(1119) effect(1106) posit(819) }
{ blood(1257) pressur(1144) flow(957) }
{ health(3367) inform(1360) care(1135) }
{ model(3480) simul(1196) paramet(876) }
{ monitor(1329) mobil(1314) devic(1160) }
{ ehr(2073) health(1662) electron(1139) }
{ patient(2837) hospit(1953) medic(668) }
{ model(2656) set(1616) predict(1553) }
{ age(1611) year(1155) adult(843) }
{ signal(2180) analysi(812) frequenc(800) }
{ gene(2352) biolog(1181) express(1162) }
{ first(2504) two(1366) second(1323) }
{ intervent(3218) particip(2042) group(1664) }
{ activ(1138) subject(705) human(624) }
{ use(2086) technolog(871) perceiv(783) }
{ analysi(2126) use(1163) compon(1037) }
{ structur(1116) can(940) graph(676) }
{ cancer(2502) breast(956) screen(824) }
{ use(976) code(926) identifi(902) }
{ result(1111) use(1088) new(759) }
{ survey(1388) particip(1329) question(1065) }
{ estim(2440) model(1874) function(577) }
{ decis(3086) make(1611) patient(1517) }
{ process(1125) use(805) approach(778) }
{ activ(1452) weight(1219) physic(1104) }
{ method(1969) cluster(1462) data(1082) }
{ detect(2391) sensit(1101) algorithm(908) }

Abstract

OBJECTIVE: Automatic detection of adverse drug reaction (ADR) mentions from text has recently received significant interest in pharmacovigilance research. Current research focuses on various sources of text-based information, including social media, where enormous amounts of user-posted data are available and have the potential for use in pharmacovigilance if collected and filtered accurately. The aims of this study are: (i) to explore natural language processing (NLP) approaches for generating useful features from text, and utilizing them in optimized machine learning algorithms for automatic classification of ADR-assertive text segments; (ii) to present two data sets that we prepared for the task of ADR detection from user-posted internet data; and (iii) to investigate whether combining training data from distinct corpora can improve automatic classification accuracies.
METHODS: One of our three data sets contains annotated sentences from clinical reports, and the other two data sets, built in-house, consist of annotated posts from social media. Our text classification approach relies on generating a large set of features, representing semantic properties (e.g., sentiment, polarity, and topic), from short text nuggets. Importantly, using our expanded feature sets, we combine training data from different corpora in an attempt to boost classification accuracies.
RESULTS: Our feature-rich classification approach performs significantly better than previously published approaches, with ADR class F-scores of 0.812 (previously reported best: 0.770), 0.538, and 0.678 for the three data sets. Combining training data from multiple compatible corpora further improves the ADR F-scores for the in-house data sets to 0.597 (improvement of 5.9 units) and 0.704 (improvement of 2.6 units), respectively.
CONCLUSIONS: Our research results indicate that using advanced NLP techniques for generating information-rich features from text can significantly improve classification accuracies over existing benchmarks. Our experiments illustrate the benefits of incorporating various semantic features such as topics, concepts, sentiments, and polarities. Finally, we show that integration of information from compatible corpora can significantly improve classification performance. This form of multi-corpus training may be particularly useful in cases where data sets are heavily imbalanced (e.g., social media data), and may reduce the time and costs associated with the annotation of data in the future.
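
The multi-corpus idea in the abstract can be illustrated with a minimal sketch: train a classifier on the target corpus alone, then on the target corpus plus a second compatible corpus, and compare the F-score of the ADR class on the same test set. Everything below is an assumption made for illustration: the tiny bag-of-words naive Bayes stands in for the authors' feature-rich classifier (which is not specified here), the sentences are invented toy data, and the names `NaiveBayes`, `f_score`, and `train_and_score` are hypothetical. Only the reported F-scores in the final lines come from the abstract.

```python
import math
from collections import Counter


def tokenize(text):
    # Whitespace tokenization on lowercased text (a deliberate simplification).
    return text.lower().split()


class NaiveBayes:
    """Tiny multinomial naive Bayes over word counts (illustrative stand-in)."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)
        self.word_counts = {c: Counter() for c in self.class_counts}
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        return self

    def predict(self, texts):
        total = sum(self.class_counts.values())
        predictions = []
        for text in texts:
            scores = {}
            for c, n in self.class_counts.items():
                # Add-one smoothing over the training vocabulary.
                denom = sum(self.word_counts[c].values()) + len(self.vocab)
                score = math.log(n / total)
                for w in tokenize(text):
                    if w in self.vocab:  # words never seen in training are skipped
                        score += math.log((self.word_counts[c][w] + 1) / denom)
                scores[c] = score
            predictions.append(max(scores, key=scores.get))
        return predictions


def f_score(gold, pred, positive="ADR"):
    # F-score of the positive (ADR) class: harmonic mean of precision and recall.
    tp = sum(g == positive and p == positive for g, p in zip(gold, pred))
    fp = sum(g != positive and p == positive for g, p in zip(gold, pred))
    fn = sum(g == positive and p != positive for g, p in zip(gold, pred))
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def train_and_score(train_pairs, test_pairs):
    texts, labels = zip(*train_pairs)
    model = NaiveBayes().fit(texts, labels)
    test_texts, gold = zip(*test_pairs)
    return f_score(gold, model.predict(test_texts))


# Invented toy corpora (NOT data from the study).
social_train = [
    ("this drug gave me a terrible headache", "ADR"),
    ("started the new medication today", "noADR"),
    ("picked up my prescription this morning", "noADR"),
]
clinical_train = [
    ("patient developed nausea after the drug", "ADR"),
    ("severe rash following treatment", "ADR"),
    ("patient was given the drug daily", "noADR"),
]
social_test = [
    ("felt nausea and a rash after this medication", "ADR"),
    ("refilled my prescription today", "noADR"),
]

f_target_only = train_and_score(social_train, social_test)
f_combined = train_and_score(social_train + clinical_train, social_test)
print("target corpus only:", f_target_only)
print("combined corpora:  ", f_combined)

# Reported F-scores from the abstract, expressed as gains in F-score x 100.
reported = [(0.538, 0.597), (0.678, 0.704)]
gain_points = [round((after - before) * 100, 1) for before, after in reported]
print("reported gains:", gain_points)
```

On toy data this small the comparison says nothing about real performance; the sketch only shows the mechanics of pooling training sets and scoring the ADR class, and how the reported "units" correspond to differences in F-score multiplied by 100.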

Cleaned Abstract

jectiv automat detect advers drug reaction adr mention text recent receiv signific interest pharmacovigil research current research focus various sourc textbas inform includ social mediawher enorm amount user post data avail potenti use pharmacovigil collect filter accur aim studi explor natur languag process nlp approach generat use featur text util optim machin learn algorithm automat classif adr assert text segment ii present two data set prepar task adr detect user post internet data iii investig combin train data distinct corpora can improv automat classif accuraciesmethod one three data set contain annot sentenc clinic report two data set built inhous consist annot post social media text classif approach reli generat larg set featur repres semant properti eg sentiment polar topic short text nugget import use expand featur set combin train data differ corpora attempt boost classif accuraciesresult featurerich classif approach perform signific better previous publish approach adr class fscore previous report best three data set combin train data multipl compat corpora improv adr fscore inhous data set improv unit improv unit respectivelyconclus research result indic use advanc nlp techniqu generat inform rich featur text can signific improv classif accuraci exist benchmark experi illustr benefit incorpor various semant featur topic concept sentiment polar final show integr inform compat corpora can signific improv classif perform form multicorpus train may particular use case data set heavili imbalanc eg social media data may reduc time cost associ annot data futur

Similar Abstracts

IEEE Trans Image Process - Geodesic propagation for semantic labeling. ( 0,796892137262225 )
Comput Math Methods Med - On multilabel classification methods of incompletely labeled biomedical text data. ( 0,792976434525992 )
J Biomed Inform - Applying active learning to assertion classification of concepts in clinical text. ( 0,778272422412867 )
J Biomed Inform - Semi-supervised clinical text classification with Laplacian SVMs: an application to cancer case management. ( 0,771751169101264 )
Comput Methods Programs Biomed - Multistage approach for clustering and classification of ECG data. ( 0,765540245626529 )
IEEE Trans Neural Netw Learn Syst - A Kernel Classification Framework for Metric Learning. ( 0,761039585969019 )
IEEE Trans Pattern Anal Mach Intell - Distance-Based Image Classification: Generalizing to New Classes at Near Zero Cost. ( 0,759047116757743 )
J. Comput. Biol. - Imbalanced class learning in epigenetics. ( 0,756494508002641 )
J Med Syst - 3D similarity-dissimilarity plot for high dimensional data visualization in the context of biomedical pattern classification. ( 0,7560054362972 )
IEEE Trans Neural Netw Learn Syst - Adaptive Batch Mode Active Learning. ( 0,753914144924074 )
J Am Med Inform Assoc - Learning classification models with soft-label information. ( 0,753871728112107 )
Int J Neural Syst - Aggregation of sparse linear discriminant analyses for event-related potential classification in brain-computer interface. ( 0,748853429838131 )
J Biomed Inform - Class proximity measures--dissimilarity-based classification and display of high-dimensional data. ( 0,747021037196453 )
Comput. Biol. Med. - Robust prediction of protein subcellular localization combining PCA and WSVMs. ( 0,744613824070122 )
Int J Neural Syst - Structurally enhanced incremental neural learning for image classification with subgraph extraction. ( 0,743195014306641 )
Neural Comput - Reduction from cost-sensitive ordinal ranking to weighted binary classification. ( 0,743038372808283 )
Int J Neural Syst - Online semi-supervised growing neural gas. ( 0,740962051730749 )
AMIA Annu Symp Proc - Part-of-speech tagging for clinical text: wall or bridge between institutions? ( 0,734242231277361 )
IEEE Trans Image Process - Manifold regularized multitask learning for semi-supervised multilabel image classification. ( 0,73408223709374 )
Neural Comput - Divergence-based vector quantization. ( 0,732249193899591 )
IEEE Trans Image Process - Joint segmentation of images and scanned point cloud in large-scale street scenes with low-annotation cost. ( 0,729214541009534 )
IEEE Trans Image Process - A linear support higher-order tensor machine for classification. ( 0,729086105368956 )
Neural Comput - Computing sparse representations of multidimensional signals using Kronecker bases. ( 0,728801826107049 )
J Biomed Inform - Classifying temporal relations in clinical data: a hybrid, knowledge-rich approach. ( 0,727499264833308 )
Neural Comput - Adaptive metric learning vector quantization for ordinal classification. ( 0,719654842284497 )
Int J Neural Syst - Span: spike pattern association neuron for learning spatio-temporal spike patterns. ( 0,719174699280991 )
IEEE Trans Image Process - Task-specific image partitioning. ( 0,716432163506508 )
IEEE Trans Image Process - Improving Web image search by bag-based reranking. ( 0,715566377902019 )
J Am Med Inform Assoc - Active learning for clinical text classification: is it better than random sampling? ( 0,712933208105349 )
J Biomed Inform - Learning classification models from multiple experts. ( 0,711679848338452 )
IEEE J Biomed Health Inform - Systematic Poisoning Attacks on and Defenses for Machine Learning in Healthcare. ( 0,709454189069523 )
Comput. Biol. Med. - Sparse Manifold Clustering and Embedding to discriminate gene expression profiles of glioblastoma and meningioma tumors. ( 0,708594045656965 )
AMIA Annu Symp Proc - Comparison and combination of several MeSH indexing approaches. ( 0,707475825016005 )
J Biomed Inform - Incremental Gaussian Discriminant Analysis based on Graybill and Deal weighted combination of estimators for brain tumour diagnosis. ( 0,706664662911556 )
AMIA Annu Symp Proc - Classification of medication status change in clinical narratives. ( 0,704911840535304 )
IEEE Trans Image Process - Active learning for solving the incomplete data problem in facial age classification by the furthest nearest-neighbor criterion. ( 0,703731611994724 )
IEEE Trans Pattern Anal Mach Intell - Weakly Supervised Recognition of Daily Life Activities with Wearable Sensors. ( 0,701934363319874 )
IEEE Trans Pattern Anal Mach Intell - Learning Categories from Few Examples with Multi Model Knowledge Transfer. ( 0,701310323271625 )
Int J Neural Syst - Linear time relational prototype based learning. ( 0,700503771451113 )
J Am Med Inform Assoc - Evaluating the utility of syndromic surveillance algorithms for screening to detect potentially clonal hospital infection outbreaks. ( 0,700369323299118 )
J Biomed Inform - Multi-label classification of chronically ill patients with bag of words and supervised dimensionality reduction algorithms. ( 0,699442896165772 )
J Biomed Inform - Temporal relation discovery between events and temporal expressions identified in clinical narrative. ( 0,699166711952088 )
IEEE Trans Image Process - Structured max-margin learning for inter-related classifier training and multilabel image annotation. ( 0,696296152173865 )
Neural Comput - Metacognitive learning in a fully complex-valued radial basis function neural network. ( 0,691934556837315 )
IEEE Trans Image Process - Hyperspectral image classification through bilayer graph-based learning. ( 0,691109217861119 )
Comput Math Methods Med - Correlation kernels for support vector machines classification with applications in cancer data. ( 0,68648314273542 )
Neural Comput - Online learning with (multiple) kernels: a review. ( 0,686202009978289 )
J Chem Inf Model - Atom environment kernels on molecules. ( 0,681694591823813 )
IEEE Trans Pattern Anal Mach Intell - Feature Selection with Conjunctions of Decision Stumps and Learning from Microarray Data. ( 0,680436499260738 )
IEEE Trans Image Process - Unsupervised amplitude and texture classification of SAR images with multinomial latent model. ( 0,680194504101514 )
IEEE Trans Image Process - Saliency and gist features for target detection in satellite images. ( 0,678572369361191 )
IEEE Trans Image Process - Multiview Hessian regularization for image annotation. ( 0,677040947584584 )
J Chem Inf Model - Training based on ligand efficiency improves prediction of bioactivities of ligands and drug target proteins in a machine learning approach. ( 0,676447451692442 )
IEEE Trans Neural Netw Learn Syst - ML-Tree: a tree-structure-based approach to multilabel learning. ( 0,674024406065376 )
Neural Comput - Multiple spectral kernel learning and a gaussian complexity computation. ( 0,666024858364612 )
BMC Med Inform Decis Mak - Recognizing clinical entities in hospital discharge summaries using Structural Support Vector Machines with word representation features. ( 0,663732695295231 )
IEEE Trans Image Process - A Probabilistic Associative Model for Segmenting Weakly-Supervised Images. ( 0,663132575317568 )
Neural Comput - Representing objects, relations, and sequences. ( 0,663047542415915 )
IEEE Trans Image Process - Enhancing training collections for image annotation: an instance-weighted mixture modeling approach. ( 0,662716546608056 )
J Am Med Inform Assoc - 2010 i2b2/VA challenge on concepts, assertions, and relations in clinical text. ( 0,659439882394321 )
Artif Intell Med - Exploiting the systematic review protocol for classification of medical abstracts. ( 0,658997570801582 )
Artif Intell Med - Screening nonrandomized studies for medical systematic reviews: a comparative study of classifiers. ( 0,657429061974309 )
IEEE Trans Image Process - Multiple-kernel, multiple-instance similarity features for efficient visual object detection. ( 0,656431279793871 )
IEEE Trans Image Process - Self-supervised online metric learning with low rank constraint for scene categorization. ( 0,656368608087364 )
IEEE Trans Neural Netw Learn Syst - An efficient topological distance-based tree kernel. ( 0,656235544700447 )
IEEE Trans Image Process - Artistic image analysis using graph-based learning approaches. ( 0,656196614747238 )
IEEE Trans Pattern Anal Mach Intell - The Effect of Model Misspecification on Semi-Supervised Classification. ( 0,654537151115941 )
J Biomed Inform - Classification of CT pulmonary angiography reports by presence, chronicity, and location of pulmonary embolism with natural language processing. ( 0,654313014381314 )
IEEE Trans Image Process - Incremental training of a detector using online sparse eigendecomposition. ( 0,653704016428327 )
J Am Med Inform Assoc - Supervised machine learning and active learning in classification of radiology reports. ( 0,652760558761666 )
J Am Med Inform Assoc - Using machine learning for concept extraction on clinical documents from multiple data sources. ( 0,652356110352808 )
AMIA Annu Symp Proc - Hyperdimensional computing approach to word sense disambiguation. ( 0,65046062384716 )
Neural Comput - Unsupervised learning of generative and discriminative weights encoding elementary image components in a predictive coding model of cortical function. ( 0,648607946219988 )
J. Comput. Biol. - Locally learning biomedical data using diffusion frames. ( 0,64764116606329 )
Int J Comput Assist Radiol Surg - Statistical shape model of a liver for autopsy imaging. ( 0,646572053932224 )
IEEE Trans Pattern Anal Mach Intell - Latent Dirichlet Allocation Models for Image Classification. ( 0,646228705335129 )
J Chem Inf Model - Note on naive Bayes based on binary descriptors in cheminformatics. ( 0,646157898360959 )
AMIA Annu Symp Proc - Automatically Detecting Acute Myocardial Infarction Events from EHR Text: A Preliminary Study. ( 0,646078268543332 )
IEEE Trans Neural Netw Learn Syst - Kernel association for classification and prediction: a survey. ( 0,645891425217522 )
IEEE Trans Pattern Anal Mach Intell - Label Consistent K-SVD: Learning A Discriminative Dictionary for Recognition. ( 0,643786754562792 )
IEEE Trans Neural Netw Learn Syst - Partially shared latent factor learning with multiview data. ( 0,64221813968295 )
BMC Med Inform Decis Mak - Learning to improve medical decision making from imbalanced data without a priori cost. ( 0,64150061236894 )
IEEE Trans Pattern Anal Mach Intell - Facial Age Estimation by Learning from Label Distributions. ( 0,64078041101308 )
IEEE Trans Image Process - Fast semantic diffusion for large-scale context-based image and video annotation. ( 0,637388424233959 )
J Am Med Inform Assoc - A sequence labeling approach to link medications and their attributes in clinical notes and clinical trial announcements for information extraction. ( 0,636301982516751 )
IEEE Trans Image Process - Learning discriminative dictionary for group sparse representation. ( 0,635798766764144 )
J Biomed Inform - Reducing systematic review workload through certainty-based screening. ( 0,628341831127486 )
IEEE Trans Pattern Anal Mach Intell - Representation Learning: A Review and New Perspectives. ( 0,628185924132921 )
J Am Med Inform Assoc - Applying active learning to supervised word sense disambiguation in MEDLINE. ( 0,627214838879433 )
IEEE Trans Image Process - Cross-Device Automated Prostate Cancer Localization With Multiparametric MRI. ( 0,626915020681065 )
IEEE Trans Pattern Anal Mach Intell - A Bag-of-Features Framework to Classify Time Series. ( 0,626780717648124 )
AMIA Annu Symp Proc - Learning medical diagnosis models from multiple experts. ( 0,625538006027487 )
AMIA Annu Symp Proc - Sample-efficient learning with auxiliary class-label information. ( 0,625349785156494 )
IEEE Trans Pattern Anal Mach Intell - Good Practice in Large-Scale Learning for Image Classification. ( 0,623596348097424 )
IEEE Trans Image Process - Design of non-linear kernel dictionaries for object recognition. ( 0,623037831934166 )
J Biomed Inform - Supervised methods for symptom name recognition in free-text clinical records of traditional Chinese medicine: an empirical study. ( 0,622905261197402 )
IEEE Trans Image Process - Supervised ordering in IRp: application to morphological processing of hyperspectral images. ( 0,622331203043938 )
J Chem Inf Model - Classifying large chemical data sets: using a regularized potential function method. ( 0,622094917501315 )
Comput Methods Programs Biomed - Modified CC-LR algorithm with three diverse feature sets for motor imagery tasks classification in EEG based brain-computer interface. ( 0,620502414482107 )
Artif Intell Med - A classifier ensemble approach for the missing feature problem. ( 0,618029011593268 )