J Biomed Inform - Semi-supervised clinical text classification with Laplacian SVMs: an application to cancer case management.

Topics

{ learn(2355) train(1041) set(1003) }
{ record(1888) medic(1808) patient(1693) }
{ process(1125) use(805) approach(778) }
{ featur(3375) classif(2383) classifi(1994) }
{ studi(1119) effect(1106) posit(819) }
{ method(1219) similar(1157) match(930) }
{ imag(2675) segment(2577) method(1081) }
{ error(1145) method(1030) estim(1020) }
{ concept(1167) ontolog(924) domain(897) }
{ system(1050) medic(1026) inform(1018) }
{ perform(1367) use(1326) method(1137) }
{ data(2317) use(1299) case(1017) }
{ sampl(1606) size(1419) use(1276) }
{ measur(2081) correl(1212) valu(896) }
{ ehr(2073) health(1662) electron(1139) }
{ group(2977) signific(1463) compar(1072) }
{ analysi(2126) use(1163) compon(1037) }
{ use(976) code(926) identifi(902) }
{ detect(2391) sensit(1101) algorithm(908) }
{ model(3404) distribut(989) bayesian(671) }
{ can(774) often(719) complex(702) }
{ imag(1057) registr(996) error(939) }
{ patient(2315) diseas(1263) diabet(1191) }
{ studi(2440) review(1878) systemat(933) }
{ treatment(1704) effect(941) patient(846) }
{ chang(1828) time(1643) increas(1301) }
{ algorithm(1844) comput(1787) effici(935) }
{ case(1353) use(1143) diagnosi(1136) }
{ howev(809) still(633) remain(590) }
{ studi(1410) differ(1259) use(1210) }
{ risk(3053) factor(974) diseas(938) }
{ model(2341) predict(2261) use(1141) }
{ spatial(1525) area(1432) region(1030) }
{ monitor(1329) mobil(1314) devic(1160) }
{ intervent(3218) particip(2042) group(1664) }
{ activ(1452) weight(1219) physic(1104) }
{ imag(1947) propos(1133) code(1026) }
{ data(1737) use(1416) pattern(1282) }
{ inform(2794) health(2639) internet(1427) }
{ system(1976) rule(880) can(841) }
{ bind(1733) structur(1185) ligand(1036) }
{ sequenc(1873) structur(1644) protein(1328) }
{ imag(2830) propos(1344) filter(1198) }
{ network(2748) neural(1063) input(814) }
{ take(945) account(800) differ(722) }
{ motion(1329) object(1292) video(1091) }
{ assess(1506) score(1403) qualiti(1306) }
{ surgeri(1148) surgic(1085) robot(1054) }
{ framework(1458) process(801) describ(734) }
{ problem(2511) optim(1539) algorithm(950) }
{ clinic(1479) use(1117) guidelin(835) }
{ extract(1171) text(1153) clinic(932) }
{ method(1557) propos(1049) approach(1037) }
{ data(1714) softwar(1251) tool(1186) }
{ design(1359) user(1324) use(1319) }
{ control(1307) perform(991) simul(935) }
{ model(2220) cell(1177) simul(1124) }
{ care(1570) inform(1187) nurs(1089) }
{ general(901) number(790) one(736) }
{ method(984) reconstruct(947) comput(926) }
{ search(2224) databas(1162) retriev(909) }
{ featur(1941) imag(1645) propos(1176) }
{ data(3963) clinic(1234) research(1004) }
{ perform(999) metric(946) measur(919) }
{ research(1085) discuss(1038) issu(1018) }
{ import(1318) role(1303) understand(862) }
{ visual(1396) interact(850) tool(830) }
{ compound(1573) activ(1297) structur(1058) }
{ blood(1257) pressur(1144) flow(957) }
{ health(3367) inform(1360) care(1135) }
{ model(3480) simul(1196) paramet(876) }
{ state(1844) use(1261) util(961) }
{ research(1218) medic(880) student(794) }
{ patient(2837) hospit(1953) medic(668) }
{ model(2656) set(1616) predict(1553) }
{ age(1611) year(1155) adult(843) }
{ medic(1828) order(1363) alert(1069) }
{ signal(2180) analysi(812) frequenc(800) }
{ cost(1906) reduc(1198) effect(832) }
{ gene(2352) biolog(1181) express(1162) }
{ data(3008) multipl(1320) sourc(1022) }
{ first(2504) two(1366) second(1323) }
{ activ(1138) subject(705) human(624) }
{ time(1939) patient(1703) rate(768) }
{ patient(1821) servic(1111) care(1106) }
{ use(2086) technolog(871) perceiv(783) }
{ can(981) present(881) function(850) }
{ health(1844) social(1437) communiti(874) }
{ structur(1116) can(940) graph(676) }
{ high(1669) rate(1365) level(1280) }
{ cancer(2502) breast(956) screen(824) }
{ use(1733) differ(960) four(931) }
{ drug(1928) target(777) effect(648) }
{ result(1111) use(1088) new(759) }
{ implement(1333) system(1263) develop(1122) }
{ survey(1388) particip(1329) question(1065) }
{ estim(2440) model(1874) function(577) }
{ decis(3086) make(1611) patient(1517) }
{ method(1969) cluster(1462) data(1082) }
{ method(2212) result(1239) propos(1039) }

Abstract

OBJECTIVE: To compare linear and Laplacian SVMs on a clinical text classification task; to evaluate the effect of unlabeled training data on Laplacian SVM performance.

BACKGROUND: The development of machine-learning based clinical text classifiers requires the creation of labeled training data, obtained via manual review by clinicians. Due to the effort and expense involved in labeling data, training data sets in the clinical domain are of limited size. In contrast, electronic medical record (EMR) systems contain hundreds of thousands of unlabeled notes that are not used by supervised machine learning approaches. Semi-supervised learning algorithms use both labeled and unlabeled data to train classifiers, and can outperform their supervised counterparts.

METHODS: We trained support vector machines (SVMs) and Laplacian SVMs on a training reference standard of 820 abdominal CT, MRI, and ultrasound reports labeled for the presence of potentially malignant liver lesions that require follow-up (positive class prevalence 77%). The Laplacian SVM used 19,845 randomly sampled unlabeled notes in addition to the training reference standard. We evaluated SVMs and Laplacian SVMs on a test set of 520 labeled reports.

RESULTS: The Laplacian SVM trained on labeled and unlabeled radiology reports significantly outperformed supervised SVMs (macro-F1 0.773 vs. 0.741, sensitivity 0.943 vs. 0.911, positive predictive value 0.877 vs. 0.883). Performance improved with the number of labeled and unlabeled notes used to train the Laplacian SVM (Pearson's correlation coefficient of 0.529 between the number of unlabeled notes and macro-F1 score). These results suggest that practical semi-supervised methods such as the Laplacian SVM can leverage the large, unlabeled corpora that reside within EMRs to improve clinical text classification.
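The three metrics reported in the results (macro-F1, sensitivity, positive predictive value) can all be derived from a binary confusion matrix. The following is a minimal sketch under stated assumptions, not the authors' evaluation code: the function name `binary_metrics`, the label encoding (1 = positive, 0 = negative), and the toy labels are hypothetical and chosen only for illustration.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Compute sensitivity, PPV, and macro-F1 for a two-class problem.

    Assumption: labels are binary; anything not equal to `positive`
    is treated as the negative class.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    def safe_div(num, den):
        # Return 0.0 instead of raising when a class is never seen or predicted.
        return num / den if den else 0.0

    sensitivity = safe_div(tp, tp + fn)      # recall on the positive class
    ppv = safe_div(tp, tp + fp)              # precision on the positive class
    f1_pos = safe_div(2 * tp, 2 * tp + fp + fn)
    f1_neg = safe_div(2 * tn, 2 * tn + fn + fp)
    macro_f1 = (f1_pos + f1_neg) / 2         # unweighted mean of per-class F1

    return {"sensitivity": sensitivity, "ppv": ppv, "macro_f1": macro_f1}


# Toy example (made-up labels, not data from the study).
toy_true = [1, 1, 1, 0, 0]
toy_pred = [1, 1, 0, 1, 0]
toy_result = binary_metrics(toy_true, toy_pred)
print(toy_result)
```

Because macro-F1 averages the per-class F1 scores without weighting, it gives the minority (negative) class equal influence, which matters when the positive class prevalence is high, as it is here at 77%.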

Cleaned Abstract

jectiv compar linear laplacian svms clinic text classif task evalu effect unlabel train data laplacian svm performancebackground develop machinelearn base clinic text classifi requir creation label train data obtain via manual review clinician due effort expens involv label data train data set clinic domain limit size contrast electron medic record emr system contain hundr thousand unlabel note use supervis machin learn approach semisupervis learn algorithm use label unlabel data train classifi can outperform supervis counterpartsmethod train support vector machin svms laplacian svms train refer standard abdomin ct mri ultrasound report label presenc potenti malign liver lesion requir follow posit class preval laplacian svm use random sampl unlabel note addit train refer standard evalu svms laplacian svms test set label reportsresult laplacian svm train label unlabel radiolog report signific outperform supervis svms macrof vs sensit vs posit predict valu vs perform improv number label unlabel note use train laplacian svm pearson correl number unlabel note macrof score result suggest practic semisupervis method laplacian svm can leverag larg unlabel corpora resid within emr improv clinic text classif
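The cleaned text above appears to be the abstract after lowercasing, removal of digits and punctuation, stopword removal, and stemming. The page does not document its pipeline, so the following is only a sketch of the first three steps under assumptions: the function name `clean_text` and the small stopword set are hypothetical, and stemming is deliberately left out.

```python
import re

# Hypothetical, deliberately tiny stopword list; the list actually used
# to produce the cleaned abstract is not given on the page.
STOPWORDS = {
    "a", "an", "and", "are", "by", "for", "in", "is", "not",
    "of", "on", "that", "the", "to", "we", "with",
}


def clean_text(text, stopwords=STOPWORDS):
    """Lowercase, drop every character that is not a-z or whitespace,
    then remove stopwords. Stemming is intentionally omitted."""
    lowered = text.lower()
    # Deleting (rather than replacing with a space) fuses tokens such as
    # "machine-learning" into "machinelearning", matching the behavior
    # visible in the cleaned abstract.
    letters_only = re.sub(r"[^a-z\s]", "", lowered)
    tokens = [tok for tok in letters_only.split() if tok not in stopwords]
    return " ".join(tokens)


print(clean_text("Macro-F1 0.773 vs. 0.741"))
print(clean_text("performance.BACKGROUND: The development"))
```

Deleting punctuation without inserting a space explains fused tokens such as "performancebackground" in the cleaned text, where a section label followed a period with no intervening space.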

Similar Abstracts

Comput Math Methods Med - On multilabel classification methods of incompletely labeled biomedical text data. ( 0,886739494657824 )
J. Comput. Biol. - Imbalanced class learning in epigenetics. ( 0,858970446648666 )
J Med Syst - 3D similarity-dissimilarity plot for high dimensional data visualization in the context of biomedical pattern classification. ( 0,856150439003198 )
J Am Med Inform Assoc - Learning classification models with soft-label information. ( 0,823748084493459 )
Neural Comput - Adaptive metric learning vector quantization for ordinal classification. ( 0,822287577758214 )
IEEE Trans Image Process - A linear support higher-order tensor machine for classification. ( 0,815760615338065 )
J Biomed Inform - Multi-label classification of chronically ill patients with bag of words and supervised dimensionality reduction algorithms. ( 0,811530879259703 )
IEEE Trans Image Process - Manifold regularized multitask learning for semi-supervised multilabel image classification. ( 0,811227407455538 )
IEEE Trans Image Process - Task-specific image partitioning. ( 0,807519748237238 )
Neural Comput - Computing sparse representations of multidimensional signals using Kronecker bases. ( 0,802452384340854 )
IEEE J Biomed Health Inform - Systematic Poisoning Attacks on and Defenses for Machine Learning in Healthcare. ( 0,800494879628803 )
IEEE Trans Pattern Anal Mach Intell - Distance-Based Image Classification: Generalizing to New Classes at Near Zero Cost. ( 0,799713272909188 )
IEEE Trans Neural Netw Learn Syst - A Kernel Classification Framework for Metric Learning. ( 0,7963801066388 )
Int J Neural Syst - Online semi-supervised growing neural gas. ( 0,794090673336271 )
Neural Comput - Reduction from cost-sensitive ordinal ranking to weighted binary classification. ( 0,794082621138108 )
Comput. Biol. Med. - Sparse Manifold Clustering and Embedding to discriminate gene expression profiles of glioblastoma and meningioma tumors. ( 0,788906433114359 )
IEEE Trans Image Process - Geodesic propagation for semantic labeling. ( 0,788161247155998 )
Neural Comput - Metacognitive learning in a fully complex-valued radial basis function neural network. ( 0,788000323041865 )
IEEE Trans Image Process - Improving Web image search by bag-based reranking. ( 0,786717003828026 )
J Am Med Inform Assoc - Active learning for clinical text classification: is it better than random sampling? ( 0,785772056785471 )
IEEE Trans Image Process - Structured max-margin learning for inter-related classifier training and multilabel image annotation. ( 0,779001351902448 )
Neural Comput - Online learning with (multiple) kernels: a review. ( 0,776830915535547 )
IEEE Trans Image Process - Joint segmentation of images and scanned point cloud in large-scale street scenes with low-annotation cost. ( 0,774798059562897 )
IEEE Trans Neural Netw Learn Syst - Adaptive Batch Mode Active Learning. ( 0,774216386555281 )
Comput. Biol. Med. - Robust prediction of protein subcellular localization combining PCA and WSVMs. ( 0,772086057770605 )
J Biomed Inform - Portable automatic text classification for adverse drug reaction detection via multi-corpus training. ( 0,771751169101264 )
BMC Med Inform Decis Mak - Learning to improve medical decision making from imbalanced data without a priori cost. ( 0,77107586019747 )
J Biomed Inform - Incremental Gaussian Discriminant Analysis based on Graybill and Deal weighted combination of estimators for brain tumour diagnosis. ( 0,770537623520599 )
Int J Neural Syst - Aggregation of sparse linear discriminant analyses for event-related potential classification in brain-computer interface. ( 0,76939392661895 )
J Biomed Inform - Learning classification models from multiple experts. ( 0,767633373318586 )
AMIA Annu Symp Proc - Comparison and combination of several MeSH indexing approaches. ( 0,766238835565032 )
Int J Neural Syst - Structurally enhanced incremental neural learning for image classification with subgraph extraction. ( 0,765478358347353 )
J Biomed Inform - Supervised methods for symptom name recognition in free-text clinical records of traditional Chinese medicine: an empirical study. ( 0,764479309295391 )
IEEE Trans Image Process - Hyperspectral image classification through bilayer graph-based learning. ( 0,763913310509803 )
IEEE Trans Image Process - Unsupervised amplitude and texture classification of SAR images with multinomial latent model. ( 0,760605791678589 )
Int J Neural Syst - Linear time relational prototype based learning. ( 0,755837255347387 )
Int J Neural Syst - Span: spike pattern association neuron for learning spatio-temporal spike patterns. ( 0,755604181524001 )
AMIA Annu Symp Proc - Classification of medication status change in clinical narratives. ( 0,754928457256101 )
IEEE Trans Image Process - Active learning for solving the incomplete data problem in facial age classification by the furthest nearest-neighbor criterion. ( 0,754378726749304 )
J Am Med Inform Assoc - Supervised machine learning and active learning in classification of radiology reports. ( 0,754374837020345 )
IEEE Trans Image Process - Multiview Hessian regularization for image annotation. ( 0,753831430986078 )
IEEE Trans Neural Netw Learn Syst - ML-Tree: a tree-structure-based approach to multilabel learning. ( 0,753256569166747 )
J Chem Inf Model - Training based on ligand efficiency improves prediction of bioactivities of ligands and drug target proteins in a machine learning approach. ( 0,752046331009273 )
Comput Math Methods Med - Correlation kernels for support vector machines classification with applications in cancer data. ( 0,751242395389278 )
IEEE Trans Pattern Anal Mach Intell - Weakly Supervised Recognition of Daily Life Activities with Wearable Sensors. ( 0,748932917926868 )
J Biomed Inform - Class proximity measures--dissimilarity-based classification and display of high-dimensional data. ( 0,746435932765982 )
Neural Comput - Divergence-based vector quantization. ( 0,736350351831759 )
IEEE Trans Pattern Anal Mach Intell - Label Consistent K-SVD: Learning A Discriminative Dictionary for Recognition. ( 0,73202582136769 )
Neural Comput - Multiple spectral kernel learning and a gaussian complexity computation. ( 0,731845873642497 )
IEEE Trans Pattern Anal Mach Intell - Feature Selection with Conjunctions of Decision Stumps and Learning from Microarray Data. ( 0,731092823040912 )
IEEE Trans Image Process - Design of non-linear kernel dictionaries for object recognition. ( 0,729017244338525 )
IEEE Trans Pattern Anal Mach Intell - The Effect of Model Misspecification on Semi-Supervised Classification. ( 0,72062350337515 )
Neural Comput - Unsupervised learning of generative and discriminative weights encoding elementary image components in a predictive coding model of cortical function. ( 0,720135178006948 )
J Am Med Inform Assoc - Breast cancer survivability prediction using labeled, unlabeled, and pseudo-labeled patient data. ( 0,719852080008369 )
IEEE Trans Pattern Anal Mach Intell - Facial Age Estimation by Learning from Label Distributions. ( 0,717077538264992 )
J Biomed Inform - Applying active learning to assertion classification of concepts in clinical text. ( 0,716188582460879 )
Comput Methods Programs Biomed - Biomedical system based on the Discrete Hidden Markov Model using the Rocchio-Genetic approach for the classification of internal carotid artery Doppler signals. ( 0,714653815152177 )
J. Comput. Biol. - Locally learning biomedical data using diffusion frames. ( 0,714347917843971 )
IEEE Trans Image Process - Saliency and gist features for target detection in satellite images. ( 0,714102346818814 )
IEEE Trans Image Process - Incremental training of a detector using online sparse eigendecomposition. ( 0,71322930520596 )
IEEE Trans Neural Netw Learn Syst - An efficient topological distance-based tree kernel. ( 0,710710565772367 )
IEEE Trans Pattern Anal Mach Intell - Representation Learning: A Review and New Perspectives. ( 0,709146922054766 )
IEEE Trans Image Process - Self-supervised online metric learning with low rank constraint for scene categorization. ( 0,707839347292904 )
Comput Methods Programs Biomed - Multistage approach for clustering and classification of ECG data. ( 0,707281277514354 )
IEEE Trans Image Process - Multiple-kernel, multiple-instance similarity features for efficient visual object detection. ( 0,706289208318474 )
J Chem Inf Model - Note on naive Bayes based on binary descriptors in cheminformatics. ( 0,703484363593909 )
J Chem Inf Model - Modeling and benchmark data set for the inhibition of c-Jun N-terminal kinase-3. ( 0,702818384229878 )
IEEE Trans Image Process - Artistic image analysis using graph-based learning approaches. ( 0,702504981393853 )
Neural Comput - Representing objects, relations, and sequences. ( 0,702201133378762 )
IEEE Trans Pattern Anal Mach Intell - Learning Categories from Few Examples with Multi Model Knowledge Transfer. ( 0,697507185053518 )
AMIA Annu Symp Proc - Sample-efficient learning with auxiliary class-label information. ( 0,695290766403686 )
AMIA Annu Symp Proc - Outlier Detection with One-Class SVMs: An Application to Melanoma Prognosis. ( 0,694903816551709 )
IEEE Trans Image Process - Learning discriminative dictionary for group sparse representation. ( 0,688347561684733 )
Med Decis Making - The Impact of Oversampling with SMOTE on the Performance of 3 Classifiers in Prediction of Type 2 Diabetes. ( 0,688063353880782 )
Artif Intell Med - A classifier ensemble approach for the missing feature problem. ( 0,687474467825981 )
IEEE Trans Image Process - Supervised ordering in IRp: application to morphological processing of hyperspectral images. ( 0,682275424796781 )
J Chem Inf Model - Classifying large chemical data sets: using a regularized potential function method. ( 0,678177280089736 )
Neural Comput - Mismatched training and test distributions can outperform matched ones. ( 0,673263773709575 )
Artif Intell Med - A fuzzy-based data transformation for feature extraction to increase classification performance with small medical data sets. ( 0,672042199262438 )
Artif Intell Med - Screening nonrandomized studies for medical systematic reviews: a comparative study of classifiers. ( 0,671897692715202 )
J Biomed Inform - Active learning strategies for the deduplication of electronic patient data using classification trees. ( 0,67148978155638 )
IEEE Trans Image Process - Cross-Device Automated Prostate Cancer Localization With Multiparametric MRI. ( 0,669853515862445 )
IEEE Trans Pattern Anal Mach Intell - Feature Selection and Kernel Learning for Local Learning-Based Clustering. ( 0,669845529431405 )
Neural Comput - Extended robust support vector machine based on financial risk minimization. ( 0,669666608863696 )
Comput Methods Programs Biomed - Modified CC-LR algorithm with three diverse feature sets for motor imagery tasks classification in EEG based brain-computer interface. ( 0,66842540606025 )
IEEE Trans Pattern Anal Mach Intell - A Bag-of-Features Framework to Classify Time Series. ( 0,668292597316805 )
Neural Comput - Adaptive multiclass classification for brain computer interfaces. ( 0,668117321713534 )
IEEE Trans Image Process - Contextual kernel and spectral methods for learning the semantics of images. ( 0,666530022832084 )
Comput. Biol. Med. - EEG-based emotion estimation using Bayesian weighted-log-posterior function and perceptron convergence algorithm. ( 0,66612092675666 )
Int J Comput Assist Radiol Surg - Statistical shape model of a liver for autopsy imaging. ( 0,661263456435869 )
Comput Methods Programs Biomed - Comparison of machine learning methods for classifying aphasic and non-aphasic speakers. ( 0,660506106039057 )
J Chem Inf Model - Atom environment kernels on molecules. ( 0,657745466657476 )
AMIA Annu Symp Proc - Learning medical diagnosis models from multiple experts. ( 0,654410717204833 )
Methods Inf Med - Probability machines: consistent probability estimation using nonparametric learning machines. ( 0,651883272866737 )
BMC Med Inform Decis Mak - Decision tree-based learning to predict patient controlled analgesia consumption and readjustment. ( 0,650299134748828 )
IEEE Trans Image Process - Fast bilateral filter with arbitrary range and domain kernels. ( 0,649110488296536 )
IEEE Trans Pattern Anal Mach Intell - Trainable Convolution Filters and Their Application to Face Recognition. ( 0,647596130170843 )
Artif Intell Med - Exploiting the systematic review protocol for classification of medical abstracts. ( 0,645420659491392 )
IEEE Trans Pattern Anal Mach Intell - Good Practice in Large-Scale Learning for Image Classification. ( 0,639213986492253 )
J Biomed Inform - Classifying temporal relations in clinical data: a hybrid, knowledge-rich approach. ( 0,638160495785949 )