J Am Med Inform Assoc - Applying active learning to supervised word sense disambiguation in MEDLINE.

Topics

{ learn(2355) train(1041) set(1003) }
{ featur(3375) classif(2383) classifi(1994) }
{ method(2212) result(1239) propos(1039) }
{ extract(1171) text(1153) clinic(932) }
{ model(2341) predict(2261) use(1141) }
{ cost(1906) reduc(1198) effect(832) }
{ analysi(2126) use(1163) compon(1037) }
{ network(2748) neural(1063) input(814) }
{ group(2977) signific(1463) compar(1072) }
{ implement(1333) system(1263) develop(1122) }
{ estim(2440) model(1874) function(577) }
{ can(774) often(719) complex(702) }
{ system(1976) rule(880) can(841) }
{ use(1733) differ(960) four(931) }
{ method(1219) similar(1157) match(930) }
{ patient(2315) diseas(1263) diabet(1191) }
{ care(1570) inform(1187) nurs(1089) }
{ howev(809) still(633) remain(590) }
{ perform(999) metric(946) measur(919) }
{ method(1969) cluster(1462) data(1082) }
{ model(3404) distribut(989) bayesian(671) }
{ measur(2081) correl(1212) valu(896) }
{ problem(2511) optim(1539) algorithm(950) }
{ algorithm(1844) comput(1787) effici(935) }
{ data(1714) softwar(1251) tool(1186) }
{ featur(1941) imag(1645) propos(1176) }
{ data(3963) clinic(1234) research(1004) }
{ import(1318) role(1303) understand(862) }
{ visual(1396) interact(850) tool(830) }
{ medic(1828) order(1363) alert(1069) }
{ sampl(1606) size(1419) use(1276) }
{ gene(2352) biolog(1181) express(1162) }
{ patient(1821) servic(1111) care(1106) }
{ structur(1116) can(940) graph(676) }
{ decis(3086) make(1611) patient(1517) }
{ activ(1452) weight(1219) physic(1104) }
{ detect(2391) sensit(1101) algorithm(908) }
{ imag(1947) propos(1133) code(1026) }
{ data(1737) use(1416) pattern(1282) }
{ inform(2794) health(2639) internet(1427) }
{ imag(1057) registr(996) error(939) }
{ bind(1733) structur(1185) ligand(1036) }
{ sequenc(1873) structur(1644) protein(1328) }
{ imag(2830) propos(1344) filter(1198) }
{ imag(2675) segment(2577) method(1081) }
{ take(945) account(800) differ(722) }
{ studi(2440) review(1878) systemat(933) }
{ motion(1329) object(1292) video(1091) }
{ assess(1506) score(1403) qualiti(1306) }
{ treatment(1704) effect(941) patient(846) }
{ surgeri(1148) surgic(1085) robot(1054) }
{ framework(1458) process(801) describ(734) }
{ error(1145) method(1030) estim(1020) }
{ chang(1828) time(1643) increas(1301) }
{ concept(1167) ontolog(924) domain(897) }
{ clinic(1479) use(1117) guidelin(835) }
{ method(1557) propos(1049) approach(1037) }
{ design(1359) user(1324) use(1319) }
{ control(1307) perform(991) simul(935) }
{ model(2220) cell(1177) simul(1124) }
{ general(901) number(790) one(736) }
{ method(984) reconstruct(947) comput(926) }
{ search(2224) databas(1162) retriev(909) }
{ case(1353) use(1143) diagnosi(1136) }
{ studi(1410) differ(1259) use(1210) }
{ risk(3053) factor(974) diseas(938) }
{ research(1085) discuss(1038) issu(1018) }
{ system(1050) medic(1026) inform(1018) }
{ compound(1573) activ(1297) structur(1058) }
{ perform(1367) use(1326) method(1137) }
{ studi(1119) effect(1106) posit(819) }
{ blood(1257) pressur(1144) flow(957) }
{ spatial(1525) area(1432) region(1030) }
{ record(1888) medic(1808) patient(1693) }
{ health(3367) inform(1360) care(1135) }
{ model(3480) simul(1196) paramet(876) }
{ monitor(1329) mobil(1314) devic(1160) }
{ ehr(2073) health(1662) electron(1139) }
{ state(1844) use(1261) util(961) }
{ research(1218) medic(880) student(794) }
{ patient(2837) hospit(1953) medic(668) }
{ model(2656) set(1616) predict(1553) }
{ data(2317) use(1299) case(1017) }
{ age(1611) year(1155) adult(843) }
{ signal(2180) analysi(812) frequenc(800) }
{ data(3008) multipl(1320) sourc(1022) }
{ first(2504) two(1366) second(1323) }
{ intervent(3218) particip(2042) group(1664) }
{ activ(1138) subject(705) human(624) }
{ time(1939) patient(1703) rate(768) }
{ use(2086) technolog(871) perceiv(783) }
{ can(981) present(881) function(850) }
{ health(1844) social(1437) communiti(874) }
{ high(1669) rate(1365) level(1280) }
{ cancer(2502) breast(956) screen(824) }
{ use(976) code(926) identifi(902) }
{ drug(1928) target(777) effect(648) }
{ result(1111) use(1088) new(759) }
{ survey(1388) particip(1329) question(1065) }
{ process(1125) use(805) approach(778) }

Abstract

OBJECTIVES: This study aimed to assess whether active learning strategies can be integrated with supervised word sense disambiguation (WSD) methods, thereby reducing the number of annotated samples while maintaining or improving the quality of disambiguation models.

METHODS: We developed support vector machine (SVM) classifiers to disambiguate 197 ambiguous terms and abbreviations in the MSH WSD collection. Three different uncertainty sampling-based active learning algorithms were implemented with the SVM classifiers and compared with a passive learner (PL) based on random sampling. For each ambiguous term and each learning algorithm, we generated a learning curve that plots the accuracy computed on the test set as a function of the number of annotated samples used to train the model. The area under the learning curve (ALC) was used as the primary evaluation metric.

RESULTS: Our experiments demonstrated that active learners (ALs) significantly outperformed the PL, showing better performance for 177 out of 197 (89.8%) WSD tasks. Further analysis showed that, to achieve an average accuracy of 90%, the PL needed 38 annotated samples, while the ALs needed only 24, a 37% reduction in annotation effort. Moreover, we analyzed cases where active learning algorithms did not achieve superior performance and identified three causes: (1) poor models in the early learning stage; (2) easy WSD cases; and (3) difficult WSD cases. These findings provide useful insight for future improvements.

CONCLUSIONS: This study demonstrated that integrating active learning strategies with supervised WSD methods could effectively reduce annotation cost and improve the disambiguation models.
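
The selection step and the evaluation metric described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not name the three uncertainty sampling variants, so the margin-based rule below is only one common choice, and the function names, the trapezoidal integration, and the normalization of the ALC by the width of the sample range are all assumptions made for illustration.

```python
from typing import List, Sequence


def least_confident(margins: Sequence[float], k: int = 1) -> List[int]:
    """Indices of the k unlabeled samples closest to the decision boundary.

    `margins` are signed classifier scores (for example, SVM decision
    values). A small absolute value means the model is uncertain, so
    those samples are sent to the annotator first. (Assumed variant.)
    """
    ranked = sorted(range(len(margins)), key=lambda i: abs(margins[i]))
    return ranked[:k]


def area_under_learning_curve(n_annotated: Sequence[float],
                              accuracy: Sequence[float]) -> float:
    """Trapezoidal area under accuracy vs. number of annotated samples.

    The result is divided by the width of the x-range (an assumption),
    so it reads as an average accuracy over the curve.
    """
    if len(n_annotated) != len(accuracy) or len(n_annotated) < 2:
        raise ValueError("need two or more matching curve points")
    area = 0.0
    for i in range(1, len(n_annotated)):
        width = n_annotated[i] - n_annotated[i - 1]
        area += width * (accuracy[i] + accuracy[i - 1]) / 2.0
    return area / (n_annotated[-1] - n_annotated[0])


def annotation_reduction(passive_samples: int, active_samples: int) -> float:
    """Fraction of annotation effort saved relative to the passive learner."""
    return (passive_samples - active_samples) / passive_samples
```

With the figures reported in the abstract, `annotation_reduction(38, 24)` evaluates to 14/38, about 0.368, which matches the stated 37% reduction. Comparing `area_under_learning_curve` for an active and a passive learner on the same sample sizes gives the kind of per-term comparison the abstract describes.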

Cleaned Abstract

jectiv studi assess whether activ learn strategi can integr supervis word sens disambigu wsd method thus reduc number annot sampl keep improv qualiti disambigu modelsmethod develop support vector machin svm classifi disambigu ambigu term abbrevi msh wsd collect three differ uncertainti samplingbas activ learn algorithm implement svm classifi compar passiv learner pl base random sampl ambigu term learn algorithm learn curv plot accuraci comput test set function number annot sampl use model generat area learn curv alc use primari metric evaluationresult experi demonstr activ learner al signific outperform pl show better perform wsd task analysi show achiev averag accuraci pl need annot sampl al need reduct annot effort moreov analyz case activ learn algorithm achiev superior perform identifi three caus poor model earli learn stage easi wsd case difficult wsd case provid use insight futur improvementsconclus studi demonstr integr activ learn strategi supervis wsd method effect reduc annot cost improv disambigu model

Similar Abstracts

J Am Med Inform Assoc - A sequence labeling approach to link medications and their attributes in clinical notes and clinical trial announcements for information extraction. ( 0,826077337667059 )
IEEE J Biomed Health Inform - Multiple kernel learning in the primal for multimodal Alzheimer's disease classification. ( 0,762639598970793 )
Comput. Biol. Med. - Robust prediction of protein subcellular localization combining PCA and WSVMs. ( 0,736314364634627 )
J Chem Inf Model - Classifying large chemical data sets: using a regularized potential function method. ( 0,716316568135543 )
J Biomed Inform - Classifying temporal relations in clinical data: a hybrid, knowledge-rich approach. ( 0,706894946392212 )
J Chem Inf Model - Training based on ligand efficiency improves prediction of bioactivities of ligands and drug target proteins in a machine learning approach. ( 0,699074272118114 )
Comput Methods Programs Biomed - Supervised hybrid feature selection based on PSO and rough sets for medical diagnosis. ( 0,698936053524415 )
Artif Intell Med - Improving the Mann-Whitney statistical test for feature selection: an approach in breast cancer diagnosis on mammography. ( 0,696703553751771 )
Neural Comput - Adaptive metric learning vector quantization for ordinal classification. ( 0,696111857970261 )
AMIA Annu Symp Proc - Improving predictions in imbalanced data using Pairwise Expanded Logistic Regression. ( 0,694052624967049 )
Neural Comput - Divergence-based vector quantization. ( 0,691909511400314 )
Comput Methods Programs Biomed - Multistage approach for clustering and classification of ECG data. ( 0,691550858482269 )
J Med Syst - 3D similarity-dissimilarity plot for high dimensional data visualization in the context of biomedical pattern classification. ( 0,690532069551425 )
BMC Med Inform Decis Mak - Predicting disease risks from highly imbalanced data using random forest. ( 0,677420421860584 )
Neural Comput - Extended robust support vector machine based on financial risk minimization. ( 0,67634756868645 )
IEEE Trans Neural Netw Learn Syst - ML-Tree: a tree-structure-based approach to multilabel learning. ( 0,674025708824383 )
Artif Intell Med - Screening nonrandomized studies for medical systematic reviews: a comparative study of classifiers. ( 0,670659152974926 )
J Med Syst - A software framework for building biomedical machine learning classifiers through grid computing resources. ( 0,667505953305652 )
Int J Neural Syst - Structurally enhanced incremental neural learning for image classification with subgraph extraction. ( 0,666116223790832 )
Int J Neural Syst - Aggregation of sparse linear discriminant analyses for event-related potential classification in brain-computer interface. ( 0,665918768842769 )
Int J Neural Syst - Span: spike pattern association neuron for learning spatio-temporal spike patterns. ( 0,658807144922238 )
IEEE Trans Neural Netw Learn Syst - A Kernel Classification Framework for Metric Learning. ( 0,658547065490374 )
IEEE Trans Pattern Anal Mach Intell - Feature Selection with Conjunctions of Decision Stumps and Learning from Microarray Data. ( 0,657444808989259 )
Comput. Biol. Med. - Relabeling algorithm for retrieval of noisy instances and improving prediction quality. ( 0,656243907436135 )
Comput Math Methods Med - Correlation kernels for support vector machines classification with applications in cancer data. ( 0,653796593859479 )
J Integr Bioinform - Modelling proteolytic enzymes with Support Vector Machines. ( 0,651691689762019 )
Comput Methods Programs Biomed - Machine learning algorithms and forced oscillation measurements applied to the automatic identification of chronic obstructive pulmonary disease. ( 0,650360346819407 )
Comput Biol Chem - CE-PLoc: an ensemble classifier for predicting protein subcellular locations by fusing different modes of pseudo amino acid composition. ( 0,646891312278271 )
IEEE J Biomed Health Inform - Automatic detection of atrial fibrillation in cardiac vibration signals. ( 0,646696451475663 )
IEEE Trans Image Process - A linear support higher-order tensor machine for classification. ( 0,642145124954557 )
IEEE Trans Image Process - Multiple-kernel, multiple-instance similarity features for efficient visual object detection. ( 0,638002572149649 )
IEEE Trans Image Process - Improving Web image search by bag-based reranking. ( 0,637195381959281 )
J Am Med Inform Assoc - Learning classification models with soft-label information. ( 0,637099674703773 )
IEEE Trans Image Process - Task-specific image partitioning. ( 0,636321046129071 )
Artif Intell Med - Multi-objective evolutionary algorithms for fuzzy classification in survival prediction. ( 0,633702840943289 )
Comput Methods Programs Biomed - Modified CC-LR algorithm with three diverse feature sets for motor imagery tasks classification in EEG based brain-computer interface. ( 0,632721463624453 )
J Biomed Inform - Determining the difficulty of Word Sense Disambiguation. ( 0,632612694960633 )
J Biomed Inform - Portable automatic text classification for adverse drug reaction detection via multi-corpus training. ( 0,627214838879433 )
Neural Comput - Feature selection for ordinal text classification. ( 0,626220734124931 )
Neural Comput - Online learning with (multiple) kernels: a review. ( 0,625395030536926 )
IEEE Trans Pattern Anal Mach Intell - Learning Hierarchical Features for Scene Labeling. ( 0,624956381337421 )
Comput. Biol. Med. - Sparse Manifold Clustering and Embedding to discriminate gene expression profiles of glioblastoma and meningioma tumors. ( 0,623895942330389 )
BMC Med Inform Decis Mak - Recognizing clinical entities in hospital discharge summaries using Structural Support Vector Machines with word representation features. ( 0,622249866870776 )
IEEE Trans Image Process - Saliency and gist features for target detection in satellite images. ( 0,621937020306873 )
Comput Math Methods Med - On multilabel classification methods of incompletely labeled biomedical text data. ( 0,621849288275822 )
IEEE Trans Image Process - Design of non-linear kernel dictionaries for object recognition. ( 0,621210710465883 )
J Am Med Inform Assoc - Practical implementation of an existing smoking detection pipeline and reduced support vector machine training corpus requirements. ( 0,620173672946042 )
Neural Comput - Unsupervised learning of generative and discriminative weights encoding elementary image components in a predictive coding model of cortical function. ( 0,619954393164155 )
J Am Med Inform Assoc - Predicting complications of percutaneous coronary intervention using a novel support vector method. ( 0,618771665416626 )
AMIA Annu Symp Proc - Outlier Detection with One-Class SVMs: An Application to Melanoma Prognosis. ( 0,617894500612604 )
IEEE Trans Image Process - Artistic image analysis using graph-based learning approaches. ( 0,616865631127135 )
J Biomed Inform - Classification of CT pulmonary angiography reports by presence, chronicity, and location of pulmonary embolism with natural language processing. ( 0,61676793173036 )
J Biomed Inform - Applying active learning to assertion classification of concepts in clinical text. ( 0,6151334394447 )
Artif Intell Med - A fuzzy-based data transformation for feature extraction to increase classification performance with small medical data sets. ( 0,615057162583593 )
IEEE Trans Image Process - Geodesic propagation for semantic labeling. ( 0,614206326868054 )
IEEE Trans Neural Netw Learn Syst - Evolutionary fuzzy ARTMAP neural networks for classification of semiconductor defects. ( 0,612390317436503 )
J Biomed Inform - A medical diagnostic tool based on radial basis function classifiers and evolutionary simulated annealing. ( 0,611496193400061 )
Comput Math Methods Med - Multivoxel pattern analysis for FMRI data: a review. ( 0,609654026471459 )
J Chem Inf Model - Classifying molecules using a sparse probabilistic kernel binary classifier. ( 0,608934917275886 )
J Integr Bioinform - On the parameter optimization of Support Vector Machines for binary classification. ( 0,608906231668 )
Neural Comput - Metacognitive learning in a fully complex-valued radial basis function neural network. ( 0,608087343855571 )
J. Comput. Biol. - Imbalanced class learning in epigenetics. ( 0,607686686869961 )
IEEE Trans Image Process - Multiview Hessian regularization for image annotation. ( 0,605394102802042 )
J Biomed Inform - Semi-supervised clinical text classification with Laplacian SVMs: an application to cancer case management. ( 0,605267609940796 )
IEEE Trans Pattern Anal Mach Intell - A Bag-of-Features Framework to Classify Time Series. ( 0,604927756437725 )
J Med Syst - Diagnosis of several diseases by using combined kernels with Support Vector Machine. ( 0,604783273061789 )
Neural Comput - Reduction from cost-sensitive ordinal ranking to weighted binary classification. ( 0,604548605370544 )
Med Biol Eng Comput - Efficient automatic classifiers for the detection of A phases of the cyclic alternating pattern in sleep. ( 0,604218906815628 )
BMC Med Inform Decis Mak - Decision tree-based learning to predict patient controlled analgesia consumption and readjustment. ( 0,602595248816054 )
Comput Methods Programs Biomed - Design of fuzzy classifier for diabetes disease using Modified Artificial Bee Colony algorithm. ( 0,602187696270858 )
Med Decis Making - The Impact of Oversampling with SMOTE on the Performance of 3 Classifiers in Prediction of Type 2 Diabetes. ( 0,601835811662591 )
IEEE J Biomed Health Inform - Systematic Poisoning Attacks on and Defenses for Machine Learning in Healthcare. ( 0,601441072585263 )
J Am Med Inform Assoc - Missing values in deduplication of electronic patient data. ( 0,598745970977693 )
AMIA Annu Symp Proc - Word Sense Disambiguation of clinical abbreviations with hyperdimensional computing. ( 0,597772689727332 )
IEEE J Biomed Health Inform - Rule extraction from support vector machines using ensemble learning approach: an application for diagnosis of diabetes. ( 0,596932665460745 )
Comput. Biol. Med. - Identification of epilepsy stages from ECoG using genetic programming classifiers. ( 0,595693270838037 )
IEEE Trans Image Process - A novel technique for subpixel image classification based on support vector machine. ( 0,595107892853244 )
IEEE Trans Image Process - Learning conditional random fields for classification of hyperspectral images. ( 0,595055304082462 )
J Am Med Inform Assoc - Machine learning-based coreference resolution of concepts in clinical documents. ( 0,593882192025349 )
AMIA Annu Symp Proc - Predicting discharge mortality after acute ischemic stroke using balanced data. ( 0,593654945440823 )
Med Biol Eng Comput - Classification of multichannel EEG patterns using parallel hidden Markov models. ( 0,59354071311778 )
J Biomed Inform - Learning classification models from multiple experts. ( 0,592584573255178 )
Comput Methods Programs Biomed - Prediction of human breast and colon cancers from imbalanced data using nearest neighbor and support vector machines. ( 0,591962747390873 )
Neural Comput - Incremental learning by message passing in hierarchical temporal memory. ( 0,591763277837509 )
J Med Syst - A new approach: role of data mining in prediction of survival of burn patients. ( 0,590512112832321 )
Artif Intell Med - Suppressed fuzzy-soft learning vector quantization for MRI segmentation. ( 0,590208161172035 )
Brief. Bioinformatics - Class-imbalanced classifiers for high-dimensional data. ( 0,589378040444765 )
J Biomed Inform - Class proximity measures--dissimilarity-based classification and display of high-dimensional data. ( 0,5868550165459 )
Comput. Biol. Med. - Medical decision support system for diagnosis of neuromuscular disorders using DWT and fuzzy support vector machines. ( 0,586769841724661 )
Comput Methods Programs Biomed - Hepatitis disease diagnosis using a novel hybrid method based on support vector machine and simulated annealing (SVM-SA). ( 0,586678310712431 )
J Med Syst - Classification of juvenile myoclonic epilepsy data acquired through scanning electromyography with machine learning algorithms. ( 0,586485143528251 )
J Chem Inf Model - Atom environment kernels on molecules. ( 0,586449824899964 )
IEEE Trans Neural Netw Learn Syst - Adaptive Batch Mode Active Learning. ( 0,586121498982283 )
J Am Med Inform Assoc - Supervised machine learning and active learning in classification of radiology reports. ( 0,585181949262477 )
J Am Med Inform Assoc - Machine-learned solutions for three stages of clinical information extraction: the state of the art at i2b2 2010. ( 0,584478750861029 )
J Biomed Inform - Dynamic categorization of clinical research eligibility criteria by hierarchical clustering. ( 0,58414083426069 )
Comput Math Methods Med - Pulse waveform classification using support vector machine with Gaussian time warp edit distance kernel. ( 0,583378959894712 )
AMIA Annu Symp Proc - Part-of-speech tagging for clinical text: wall or bridge between institutions? ( 0,583039325345156 )
BMC Med Inform Decis Mak - Learning to improve medical decision making from imbalanced data without a priori cost. ( 0,583000111021376 )
Neural Comput - Mismatched training and test distributions can outperform matched ones. ( 0,582890767471831 )