J Biomed Inform - Applying active learning to assertion classification of concepts in clinical text.

Topics

{ learn(2355) train(1041) set(1003) }
{ perform(1367) use(1326) method(1137) }
{ extract(1171) text(1153) clinic(932) }
{ model(2341) predict(2261) use(1141) }
{ algorithm(1844) comput(1787) effici(935) }
{ compound(1573) activ(1297) structur(1058) }
{ assess(1506) score(1403) qualiti(1306) }
{ framework(1458) process(801) describ(734) }
{ howev(809) still(633) remain(590) }
{ general(901) number(790) one(736) }
{ implement(1333) system(1263) develop(1122) }
{ decis(3086) make(1611) patient(1517) }
{ system(1976) rule(880) can(841) }
{ measur(2081) correl(1212) valu(896) }
{ studi(2440) review(1878) systemat(933) }
{ research(1085) discuss(1038) issu(1018) }
{ data(2317) use(1299) case(1017) }
{ featur(3375) classif(2383) classifi(1994) }
{ imag(2830) propos(1344) filter(1198) }
{ take(945) account(800) differ(722) }
{ surgeri(1148) surgic(1085) robot(1054) }
{ concept(1167) ontolog(924) domain(897) }
{ system(1050) medic(1026) inform(1018) }
{ import(1318) role(1303) understand(862) }
{ visual(1396) interact(850) tool(830) }
{ age(1611) year(1155) adult(843) }
{ can(981) present(881) function(850) }
{ process(1125) use(805) approach(778) }
{ model(3404) distribut(989) bayesian(671) }
{ can(774) often(719) complex(702) }
{ imag(1947) propos(1133) code(1026) }
{ data(1737) use(1416) pattern(1282) }
{ inform(2794) health(2639) internet(1427) }
{ imag(1057) registr(996) error(939) }
{ bind(1733) structur(1185) ligand(1036) }
{ sequenc(1873) structur(1644) protein(1328) }
{ method(1219) similar(1157) match(930) }
{ network(2748) neural(1063) input(814) }
{ imag(2675) segment(2577) method(1081) }
{ patient(2315) diseas(1263) diabet(1191) }
{ motion(1329) object(1292) video(1091) }
{ treatment(1704) effect(941) patient(846) }
{ problem(2511) optim(1539) algorithm(950) }
{ error(1145) method(1030) estim(1020) }
{ chang(1828) time(1643) increas(1301) }
{ clinic(1479) use(1117) guidelin(835) }
{ method(1557) propos(1049) approach(1037) }
{ data(1714) softwar(1251) tool(1186) }
{ design(1359) user(1324) use(1319) }
{ control(1307) perform(991) simul(935) }
{ model(2220) cell(1177) simul(1124) }
{ care(1570) inform(1187) nurs(1089) }
{ method(984) reconstruct(947) comput(926) }
{ search(2224) databas(1162) retriev(909) }
{ featur(1941) imag(1645) propos(1176) }
{ case(1353) use(1143) diagnosi(1136) }
{ data(3963) clinic(1234) research(1004) }
{ studi(1410) differ(1259) use(1210) }
{ risk(3053) factor(974) diseas(938) }
{ perform(999) metric(946) measur(919) }
{ studi(1119) effect(1106) posit(819) }
{ blood(1257) pressur(1144) flow(957) }
{ spatial(1525) area(1432) region(1030) }
{ record(1888) medic(1808) patient(1693) }
{ health(3367) inform(1360) care(1135) }
{ model(3480) simul(1196) paramet(876) }
{ monitor(1329) mobil(1314) devic(1160) }
{ ehr(2073) health(1662) electron(1139) }
{ state(1844) use(1261) util(961) }
{ research(1218) medic(880) student(794) }
{ patient(2837) hospit(1953) medic(668) }
{ model(2656) set(1616) predict(1553) }
{ medic(1828) order(1363) alert(1069) }
{ signal(2180) analysi(812) frequenc(800) }
{ cost(1906) reduc(1198) effect(832) }
{ group(2977) signific(1463) compar(1072) }
{ sampl(1606) size(1419) use(1276) }
{ gene(2352) biolog(1181) express(1162) }
{ data(3008) multipl(1320) sourc(1022) }
{ first(2504) two(1366) second(1323) }
{ intervent(3218) particip(2042) group(1664) }
{ activ(1138) subject(705) human(624) }
{ time(1939) patient(1703) rate(768) }
{ patient(1821) servic(1111) care(1106) }
{ use(2086) technolog(871) perceiv(783) }
{ analysi(2126) use(1163) compon(1037) }
{ health(1844) social(1437) communiti(874) }
{ structur(1116) can(940) graph(676) }
{ high(1669) rate(1365) level(1280) }
{ cancer(2502) breast(956) screen(824) }
{ use(976) code(926) identifi(902) }
{ use(1733) differ(960) four(931) }
{ drug(1928) target(777) effect(648) }
{ result(1111) use(1088) new(759) }
{ survey(1388) particip(1329) question(1065) }
{ estim(2440) model(1874) function(577) }
{ activ(1452) weight(1219) physic(1104) }
{ method(1969) cluster(1462) data(1082) }
{ method(2212) result(1239) propos(1039) }
{ detect(2391) sensit(1101) algorithm(908) }

Abstract

Supervised machine learning methods for clinical natural language processing (NLP) require large numbers of annotated samples, which are expensive to build because annotation involves physicians. Active learning, an approach that actively selects the samples to annotate from a large pool, provides an alternative. Its major goal in classification is to reduce annotation effort while maintaining the quality of the predictive model. However, few studies have investigated its use in clinical NLP. This paper reports an application of active learning to a clinical text classification task: determining the assertion status of clinical concepts. The annotated corpus for the assertion classification task in the 2010 i2b2/VA Clinical NLP Challenge was used in this study. We implemented several existing and newly developed active learning algorithms and assessed their use. The outcome is reported as the global ALC score, based on the Area under the average Learning Curve of the AUC (Area Under the Curve) score. Results showed that, with the same number of annotated samples, active learning strategies generated better classification models (best ALC = 0.7715) than the passive learning method, random sampling (ALC = 0.7411). Moreover, to achieve the same classification performance, active learning strategies required fewer samples than random sampling. For example, to achieve an AUC of 0.79, random sampling used 32 samples, while our best active learning algorithm required only 12, a 62.5% reduction in manual annotation effort.
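
The quantities in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names are hypothetical, the least-confidence rule is only one common active learning query strategy (the abstract does not say which strategies were used), and the learning-curve area below uses a log2 sample axis with trapezoidal integration as an assumption, since the abstract does not give the exact ALC definition.

```python
import math


def annotation_reduction(baseline_samples, active_samples):
    """Fraction of annotation effort saved relative to the baseline sampler."""
    return (baseline_samples - active_samples) / baseline_samples


def least_confident(pool_probs):
    """Index of the unlabeled sample whose top class probability is lowest.

    pool_probs holds one list of predicted class probabilities per sample.
    This is an assumed query strategy, chosen only for illustration.
    """
    return min(range(len(pool_probs)), key=lambda i: max(pool_probs[i]))


def area_under_learning_curve(sample_counts, auc_scores):
    """Trapezoidal area under AUC vs. log2(sample count), divided by the x-range.

    Assumed simplification: the paper's ALC may be normalized differently.
    """
    xs = [math.log2(n) for n in sample_counts]
    area = sum(
        (xs[i + 1] - xs[i]) * (auc_scores[i] + auc_scores[i + 1]) / 2
        for i in range(len(xs) - 1)
    )
    return area / (xs[-1] - xs[0])


# Figures from the abstract: 32 samples (random) vs. 12 (active) at AUC 0.79.
print(f"{annotation_reduction(32, 12):.1%}")

# Made-up class probabilities for three unlabeled samples.
pool = [[0.90, 0.05, 0.05], [0.40, 0.35, 0.25], [0.60, 0.30, 0.10]]
print(least_confident(pool))
```

With the abstract's figures, the saving is (32 - 12) / 32 = 0.625. In the made-up pool, the second sample is queried because its most likely class has the lowest probability (0.40).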

Cleaned Abstract

supervis machin learn method clinic natur languag process nlp research requir larg number annot sampl expens build involv physician activ learn approach activ sampl larg pool provid altern solut major goal classif reduc annot effort maintain qualiti predict model howev studi investig use clinic nlp paper report applic activ learn clinic text classif task determin assert status clinic concept annot corpus assert classif task ibva clinic nlp challeng use studi implement sever exist newli develop activ learn algorithm assess use outcom report global alc score base area averag learn curv auc area curv score result show number annot sampl use activ learn strategi generat better classif model best alc passiv learn method random sampl alc moreov achiev classif perform activ learn strategi requir fewer sampl random sampl method exampl achiev auc random sampl method use sampl best activ learn algorithm requir sampl reduct manual annot effort

Similar Abstracts

IEEE Trans Neural Netw Learn Syst - Adaptive Batch Mode Active Learning. ( 0,803819149067978 )
J Biomed Inform - Portable automatic text classification for adverse drug reaction detection via multi-corpus training. ( 0,778272422412867 )
IEEE Trans Image Process - Active learning for solving the incomplete data problem in facial age classification by the furthest nearest-neighbor criterion. ( 0,764559471349917 )
J Am Med Inform Assoc - Active learning for clinical text classification: is it better than random sampling? ( 0,763946046255584 )
IEEE Trans Image Process - Geodesic propagation for semantic labeling. ( 0,757291798977872 )
Artif Intell Med - Exploiting the systematic review protocol for classification of medical abstracts. ( 0,749939758340318 )
Comput Math Methods Med - On multilabel classification methods of incompletely labeled biomedical text data. ( 0,741245433876796 )
J Biomed Inform - Class proximity measures--dissimilarity-based classification and display of high-dimensional data. ( 0,738706333468286 )
J Am Med Inform Assoc - Learning classification models with soft-label information. ( 0,738152621020737 )
IEEE Trans Pattern Anal Mach Intell - Distance-Based Image Classification: Generalizing to New Classes at Near Zero Cost. ( 0,737793490334534 )
J Chem Inf Model - Atom environment kernels on molecules. ( 0,736231478493931 )
Int J Neural Syst - Structurally enhanced incremental neural learning for image classification with subgraph extraction. ( 0,73327052422368 )
Med Decis Making - The Impact of Oversampling with SMOTE on the Performance of 3 Classifiers in Prediction of Type 2 Diabetes. ( 0,724304358170979 )
Neural Comput - Reduction from cost-sensitive ordinal ranking to weighted binary classification. ( 0,723367518907697 )
J. Comput. Biol. - Imbalanced class learning in epigenetics. ( 0,718688996965358 )
J Biomed Inform - Semi-supervised clinical text classification with Laplacian SVMs: an application to cancer case management. ( 0,716188582460879 )
Comput. Biol. Med. - Sparse Manifold Clustering and Embedding to discriminate gene expression profiles of glioblastoma and meningioma tumors. ( 0,715612634511042 )
Neural Comput - Adaptive metric learning vector quantization for ordinal classification. ( 0,713679510834793 )
J Chem Inf Model - Training based on ligand efficiency improves prediction of bioactivities of ligands and drug target proteins in a machine learning approach. ( 0,712043935199871 )
J Med Syst - 3D similarity-dissimilarity plot for high dimensional data visualization in the context of biomedical pattern classification. ( 0,709198968229085 )
IEEE Trans Image Process - A linear support higher-order tensor machine for classification. ( 0,707001594318682 )
J Am Med Inform Assoc - Evaluating the utility of syndromic surveillance algorithms for screening to detect potentially clonal hospital infection outbreaks. ( 0,701997436727283 )
AMIA Annu Symp Proc - Part-of-speech tagging for clinical text: wall or bridge between institutions? ( 0,700924566076423 )
Neural Comput - Computing sparse representations of multidimensional signals using Kronecker bases. ( 0,698188578861947 )
Int J Neural Syst - Span: spike pattern association neuron for learning spatio-temporal spike patterns. ( 0,696785387930309 )
IEEE Trans Neural Netw Learn Syst - A Kernel Classification Framework for Metric Learning. ( 0,693509011822878 )
IEEE J Biomed Health Inform - Systematic Poisoning Attacks on and Defenses for Machine Learning in Healthcare. ( 0,693399975671487 )
IEEE Trans Image Process - Joint segmentation of images and scanned point cloud in large-scale street scenes with low-annotation cost. ( 0,692644638007357 )
AMIA Annu Symp Proc - Outlier Detection with One-Class SVMs: An Application to Melanoma Prognosis. ( 0,692251850532522 )
Neural Comput - Metacognitive learning in a fully complex-valued radial basis function neural network. ( 0,691098417744702 )
IEEE Trans Image Process - Manifold regularized multitask learning for semi-supervised multilabel image classification. ( 0,689815690768608 )
J Am Med Inform Assoc - Supervised embedding of textual predictors with applications in clinical diagnostics for pediatric cardiology. ( 0,689063233848168 )
Comput Methods Programs Biomed - Multistage approach for clustering and classification of ECG data. ( 0,686813532222361 )
Artif Intell Med - A fuzzy-based data transformation for feature extraction to increase classification performance with small medical data sets. ( 0,686608595653432 )
IEEE Trans Image Process - Unsupervised amplitude and texture classification of SAR images with multinomial latent model. ( 0,684139342991924 )
Neural Comput - Divergence-based vector quantization. ( 0,684121246532002 )
IEEE Trans Pattern Anal Mach Intell - Feature Selection with Conjunctions of Decision Stumps and Learning from Microarray Data. ( 0,68399776075893 )
J Chem Inf Model - Modeling and benchmark data set for the inhibition of c-Jun N-terminal kinase-3. ( 0,683892010856839 )
AMIA Annu Symp Proc - Learning medical diagnosis models from multiple experts. ( 0,683498187690058 )
J Biomed Inform - Neighborhood hash graph kernel for protein-protein interaction extraction. ( 0,682802647330444 )
Comput. Biol. Med. - Robust prediction of protein subcellular localization combining PCA and WSVMs. ( 0,681181680043602 )
IEEE Trans Image Process - Task-specific image partitioning. ( 0,677022394207556 )
Neural Comput - Multiple spectral kernel learning and a gaussian complexity computation. ( 0,676161700215908 )
Int J Neural Syst - Aggregation of sparse linear discriminant analyses for event-related potential classification in brain-computer interface. ( 0,673525397121 )
Neural Comput - Online learning with (multiple) kernels: a review. ( 0,66470175410144 )
AMIA Annu Symp Proc - Comparison and combination of several MeSH indexing approaches. ( 0,663033730884267 )
Int J Neural Syst - Online semi-supervised growing neural gas. ( 0,662414060156875 )
IEEE Trans Image Process - Multiview Hessian regularization for image annotation. ( 0,660891600746975 )
Int J Neural Syst - Linear time relational prototype based learning. ( 0,660713575134809 )
J Biomed Inform - Classifying temporal relations in clinical data: a hybrid, knowledge-rich approach. ( 0,658540770325434 )
IEEE Trans Image Process - Structured max-margin learning for inter-related classifier training and multilabel image annotation. ( 0,657960111171514 )
J Biomed Inform - Incremental Gaussian Discriminant Analysis based on Graybill and Deal weighted combination of estimators for brain tumour diagnosis. ( 0,65767552026981 )
Comput Math Methods Med - Correlation kernels for support vector machines classification with applications in cancer data. ( 0,657414062471288 )
IEEE Trans Image Process - Improving Web image search by bag-based reranking. ( 0,651285545408063 )
BMC Med Inform Decis Mak - Recognizing clinical entities in hospital discharge summaries using Structural Support Vector Machines with word representation features. ( 0,650549829427712 )
Neural Comput - Incremental learning by message passing in hierarchical temporal memory. ( 0,64955369762172 )
J Med Syst - 3D matrix pattern based Support Vector Machines for identifying pulmonary cancer in CT scanned images. ( 0,649522262432725 )
J Biomed Inform - Learning classification models from multiple experts. ( 0,649464086446598 )
IEEE Trans Pattern Anal Mach Intell - Label Consistent K-SVD: Learning A Discriminative Dictionary for Recognition. ( 0,64885419040298 )
AMIA Annu Symp Proc - Classification of medication status change in clinical narratives. ( 0,64692431694172 )
IEEE Trans Pattern Anal Mach Intell - Weakly Supervised Recognition of Daily Life Activities with Wearable Sensors. ( 0,646575277455924 )
J Biomed Inform - Temporal relation discovery between events and temporal expressions identified in clinical narrative. ( 0,646379878953775 )
IEEE Trans Image Process - Enhancing training collections for image annotation: an instance-weighted mixture modeling approach. ( 0,642852496878929 )
IEEE Trans Pattern Anal Mach Intell - The Effect of Model Misspecification on Semi-Supervised Classification. ( 0,642745091940106 )
Neural Comput - Representing objects, relations, and sequences. ( 0,642204277373825 )
Neural Comput - Large margin low rank tensor analysis. ( 0,64196590575101 )
IEEE Trans Image Process - Hyperspectral image classification through bilayer graph-based learning. ( 0,640408455279276 )
Neural Comput - Unsupervised learning of generative and discriminative weights encoding elementary image components in a predictive coding model of cortical function. ( 0,635648974788353 )
Artif Intell Med - Screening nonrandomized studies for medical systematic reviews: a comparative study of classifiers. ( 0,632863419931187 )
AMIA Annu Symp Proc - Automatically Detecting Acute Myocardial Infarction Events from EHR Text: A Preliminary Study. ( 0,630069784615885 )
IEEE Trans Neural Netw Learn Syst - ML-Tree: a tree-structure-based approach to multilabel learning. ( 0,629767231117651 )
IEEE Trans Pattern Anal Mach Intell - Representation Learning: A Review and New Perspectives. ( 0,627335152386636 )
Artif Intell Med - Exploring a corpus-based approach for detecting language impairment in monolingual English-speaking children. ( 0,626781131427636 )
J Biomed Inform - Classification of CT pulmonary angiography reports by presence, chronicity, and location of pulmonary embolism with natural language processing. ( 0,623933727686241 )
J Chem Inf Model - Note on naive Bayes based on binary descriptors in cheminformatics. ( 0,622717962082727 )
IEEE Trans Image Process - Artistic image analysis using graph-based learning approaches. ( 0,620625483941961 )
Comput Methods Programs Biomed - Modified CC-LR algorithm with three diverse feature sets for motor imagery tasks classification in EEG based brain-computer interface. ( 0,61959038041694 )
J Am Med Inform Assoc - 2010 i2b2/VA challenge on concepts, assertions, and relations in clinical text. ( 0,618587278517872 )
J Biomed Inform - Multi-label classification of chronically ill patients with bag of words and supervised dimensionality reduction algorithms. ( 0,617810192682685 )
J Am Med Inform Assoc - Applying active learning to supervised word sense disambiguation in MEDLINE. ( 0,6151334394447 )
J Am Med Inform Assoc - Supervised machine learning and active learning in classification of radiology reports. ( 0,611911063488823 )
IEEE Trans Image Process - Real-time probabilistic covariance tracking with efficient model update. ( 0,611602815649607 )
J Am Med Inform Assoc - Missing values in deduplication of electronic patient data. ( 0,610652416505435 )
IEEE Trans Pattern Anal Mach Intell - Feature Selection and Kernel Learning for Local Learning-Based Clustering. ( 0,609676238733294 )
IEEE Trans Pattern Anal Mach Intell - Learning Categories from Few Examples with Multi Model Knowledge Transfer. ( 0,608394476776653 )
AMIA Annu Symp Proc - Sample-efficient learning with auxiliary class-label information. ( 0,607400665996996 )
IEEE Trans Image Process - A Probabilistic Associative Model for Segmenting Weakly-Supervised Images. ( 0,605943562232746 )
AMIA Annu Symp Proc - Hyperdimensional computing approach to word sense disambiguation. ( 0,605430553628949 )
IEEE Trans Pattern Anal Mach Intell - Facial Age Estimation by Learning from Label Distributions. ( 0,604555987499704 )
IEEE Trans Neural Netw Learn Syst - An efficient topological distance-based tree kernel. ( 0,600255408347169 )
Neural Comput - Extended robust support vector machine based on financial risk minimization. ( 0,599839731103138 )
IEEE Trans Image Process - Supervised ordering in IRp: application to morphological processing of hyperspectral images. ( 0,597766762313862 )
IEEE Trans Image Process - Learning discriminative dictionary for group sparse representation. ( 0,596973767167068 )
Artif Intell Med - Multi-objective evolutionary algorithms for fuzzy classification in survival prediction. ( 0,59651038714825 )
IEEE Trans Image Process - Incremental training of a detector using online sparse eigendecomposition. ( 0,594506215197945 )
Comput Methods Programs Biomed - Alternatives to relational database: comparison of NoSQL and XML approaches for clinical data storage. ( 0,592044631002509 )
J Am Med Inform Assoc - Applying active learning to high-throughput phenotyping algorithms for electronic health records data. ( 0,591223972808819 )
IEEE Trans Image Process - Self-supervised online metric learning with low rank constraint for scene categorization. ( 0,590974791770347 )
J Chem Inf Model - Classifying large chemical data sets: using a regularized potential function method. ( 0,589920639510323 )
Int J Neural Syst - Epileptic EEG classification based on kernel sparse representation. ( 0,589139987416521 )