J. Comput. Biol. - Imbalanced class learning in epigenetics.

Topics

{ learn(2355) train(1041) set(1003) }
{ can(774) often(719) complex(702) }
{ howev(809) still(633) remain(590) }
{ age(1611) year(1155) adult(843) }
{ use(1733) differ(960) four(931) }
{ method(1219) similar(1157) match(930) }
{ case(1353) use(1143) diagnosi(1136) }
{ spatial(1525) area(1432) region(1030) }
{ model(3404) distribut(989) bayesian(671) }
{ bind(1733) structur(1185) ligand(1036) }
{ sequenc(1873) structur(1644) protein(1328) }
{ problem(2511) optim(1539) algorithm(950) }
{ studi(1410) differ(1259) use(1210) }
{ research(1085) discuss(1038) issu(1018) }
{ signal(2180) analysi(812) frequenc(800) }
{ data(1737) use(1416) pattern(1282) }
{ studi(2440) review(1878) systemat(933) }
{ motion(1329) object(1292) video(1091) }
{ concept(1167) ontolog(924) domain(897) }
{ clinic(1479) use(1117) guidelin(835) }
{ algorithm(1844) comput(1787) effici(935) }
{ general(901) number(790) one(736) }
{ featur(1941) imag(1645) propos(1176) }
{ visual(1396) interact(850) tool(830) }
{ compound(1573) activ(1297) structur(1058) }
{ record(1888) medic(1808) patient(1693) }
{ model(3480) simul(1196) paramet(876) }
{ monitor(1329) mobil(1314) devic(1160) }
{ cost(1906) reduc(1198) effect(832) }
{ sampl(1606) size(1419) use(1276) }
{ drug(1928) target(777) effect(648) }
{ method(1969) cluster(1462) data(1082) }
{ imag(1947) propos(1133) code(1026) }
{ inform(2794) health(2639) internet(1427) }
{ system(1976) rule(880) can(841) }
{ measur(2081) correl(1212) valu(896) }
{ imag(1057) registr(996) error(939) }
{ featur(3375) classif(2383) classifi(1994) }
{ imag(2830) propos(1344) filter(1198) }
{ network(2748) neural(1063) input(814) }
{ imag(2675) segment(2577) method(1081) }
{ patient(2315) diseas(1263) diabet(1191) }
{ take(945) account(800) differ(722) }
{ assess(1506) score(1403) qualiti(1306) }
{ treatment(1704) effect(941) patient(846) }
{ surgeri(1148) surgic(1085) robot(1054) }
{ framework(1458) process(801) describ(734) }
{ error(1145) method(1030) estim(1020) }
{ chang(1828) time(1643) increas(1301) }
{ extract(1171) text(1153) clinic(932) }
{ method(1557) propos(1049) approach(1037) }
{ data(1714) softwar(1251) tool(1186) }
{ design(1359) user(1324) use(1319) }
{ control(1307) perform(991) simul(935) }
{ model(2220) cell(1177) simul(1124) }
{ care(1570) inform(1187) nurs(1089) }
{ method(984) reconstruct(947) comput(926) }
{ search(2224) databas(1162) retriev(909) }
{ data(3963) clinic(1234) research(1004) }
{ risk(3053) factor(974) diseas(938) }
{ perform(999) metric(946) measur(919) }
{ system(1050) medic(1026) inform(1018) }
{ import(1318) role(1303) understand(862) }
{ model(2341) predict(2261) use(1141) }
{ perform(1367) use(1326) method(1137) }
{ studi(1119) effect(1106) posit(819) }
{ blood(1257) pressur(1144) flow(957) }
{ health(3367) inform(1360) care(1135) }
{ ehr(2073) health(1662) electron(1139) }
{ state(1844) use(1261) util(961) }
{ research(1218) medic(880) student(794) }
{ patient(2837) hospit(1953) medic(668) }
{ model(2656) set(1616) predict(1553) }
{ data(2317) use(1299) case(1017) }
{ medic(1828) order(1363) alert(1069) }
{ group(2977) signific(1463) compar(1072) }
{ gene(2352) biolog(1181) express(1162) }
{ data(3008) multipl(1320) sourc(1022) }
{ first(2504) two(1366) second(1323) }
{ intervent(3218) particip(2042) group(1664) }
{ activ(1138) subject(705) human(624) }
{ time(1939) patient(1703) rate(768) }
{ patient(1821) servic(1111) care(1106) }
{ use(2086) technolog(871) perceiv(783) }
{ can(981) present(881) function(850) }
{ analysi(2126) use(1163) compon(1037) }
{ health(1844) social(1437) communiti(874) }
{ structur(1116) can(940) graph(676) }
{ high(1669) rate(1365) level(1280) }
{ cancer(2502) breast(956) screen(824) }
{ use(976) code(926) identifi(902) }
{ result(1111) use(1088) new(759) }
{ implement(1333) system(1263) develop(1122) }
{ survey(1388) particip(1329) question(1065) }
{ estim(2440) model(1874) function(577) }
{ decis(3086) make(1611) patient(1517) }
{ process(1125) use(805) approach(778) }
{ activ(1452) weight(1219) physic(1104) }
{ method(2212) result(1239) propos(1039) }
{ detect(2391) sensit(1101) algorithm(908) }

Abstract

In machine learning, a balanced dataset is one of the important criteria for high classification accuracy. Datasets with a large ratio between majority and minority classes hinder learning with any classifier. An imbalanced class distribution arises when the number of instances differs by an order of magnitude between the classes of the target concept. Such datasets occur in biological data, sensor data, medical diagnostics, and any other domain where labeling instances of the minority class is time-consuming or costly, or where the data are not easily available. The current study investigates a number of imbalanced class algorithms for handling the imbalanced class distribution present in epigenetic datasets. Epigenetic (DNA methylation) datasets inherently contain few differentially DNA-methylated regions (DMRs) and a higher number of non-DMR sites. For this class imbalance problem, a number of algorithms are compared, including the TAN+AdaBoost algorithm. Experiments performed on four epigenetic datasets and several known datasets show that learning on an imbalanced dataset can reach accuracy similar to that of a regular learner on a balanced dataset.
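To make the imbalance problem concrete, the sketch below measures the class ratio of a toy label set and rebalances it by random oversampling. This is only an illustration: random oversampling is a generic rebalancing technique chosen here for simplicity, not the TAN+AdaBoost method or any other algorithm confirmed by the abstract. The function names and the 5 versus 95 split are hypothetical.

```python
import random
from collections import Counter


def imbalance_ratio(labels):
    """Return majority-class count divided by minority-class count."""
    counts = Counter(labels)
    return max(counts.values()) / min(counts.values())


def majority_baseline_accuracy(labels):
    """Accuracy of a trivial classifier that always predicts the majority class."""
    counts = Counter(labels)
    return max(counts.values()) / len(labels)


def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen rows of smaller classes until all classes
    match the size of the largest class (illustrative technique only)."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for cls, n in counts.items():
        pool = [r for r, y in zip(rows, labels) if y == cls]
        for _ in range(target - n):
            out_rows.append(rng.choice(pool))
            out_labels.append(cls)
    return out_rows, out_labels


# Hypothetical toy data: 5 DMR sites versus 95 non-DMR sites.
labels = ["DMR"] * 5 + ["non-DMR"] * 95
rows = [[i] for i in range(len(labels))]

print(imbalance_ratio(labels))             # 19.0
print(majority_baseline_accuracy(labels))  # 0.95

bal_rows, bal_labels = random_oversample(rows, labels)
print(imbalance_ratio(bal_labels))             # 1.0
print(majority_baseline_accuracy(bal_labels))  # 0.5
```

On the toy data, a classifier that never predicts a DMR site still scores 95% accuracy, which is why plain accuracy can be misleading on imbalanced data. After rebalancing, the same trivial strategy drops to 50%.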

Cleaned Abstract

machin learn one import criteria higher classif accuraci balanc dataset dataset larg ratio minor major class face hindranc learn use classifi dataset magnitud differ number instanc target concept result imbalanc class distribut dataset can rang biolog data sensor data medic diagnost domain label instanc minor class can timeconsum cost data may easili avail current studi investig number imbalanc class algorithm solv imbalanc class distribut present epigenet dataset epigenet dna methyl dataset inher come differenti dna methyl region dmr higher number nondmr site class imbal problem number algorithm compar includ tanadaboost algorithm experi perform four epigenet dataset sever known dataset show imbalanc dataset can similar accuraci regular learner balanc dataset

Similar Abstracts

Comput Math Methods Med - On multilabel classification methods of incompletely labeled biomedical text data. ( 0,907881383332196 )
IEEE J Biomed Health Inform - Systematic Poisoning Attacks on and Defenses for Machine Learning in Healthcare. ( 0,860951922033025 )
J Biomed Inform - Semi-supervised clinical text classification with Laplacian SVMs: an application to cancer case management. ( 0,858970446648666 )
J Am Med Inform Assoc - Learning classification models with soft-label information. ( 0,855040132181053 )
IEEE Trans Image Process - Manifold regularized multitask learning for semi-supervised multilabel image classification. ( 0,850545049399179 )
Neural Comput - Adaptive metric learning vector quantization for ordinal classification. ( 0,847756529163824 )
J Med Syst - 3D similarity-dissimilarity plot for high dimensional data visualization in the context of biomedical pattern classification. ( 0,845634485585782 )
Neural Comput - Computing sparse representations of multidimensional signals using Kronecker bases. ( 0,843923945376992 )
IEEE Trans Image Process - A linear support higher-order tensor machine for classification. ( 0,831770690753865 )
Neural Comput - Reduction from cost-sensitive ordinal ranking to weighted binary classification. ( 0,829077262708057 )
IEEE Trans Image Process - Multiview Hessian regularization for image annotation. ( 0,825705605093552 )
IEEE Trans Neural Netw Learn Syst - A Kernel Classification Framework for Metric Learning. ( 0,825026115003387 )
IEEE Trans Image Process - Geodesic propagation for semantic labeling. ( 0,821697723786876 )
IEEE Trans Image Process - Hyperspectral image classification through bilayer graph-based learning. ( 0,812221979434206 )
Int J Neural Syst - Span: spike pattern association neuron for learning spatio-temporal spike patterns. ( 0,8107142653007 )
IEEE Trans Image Process - Task-specific image partitioning. ( 0,808042486758321 )
IEEE Trans Pattern Anal Mach Intell - Distance-Based Image Classification: Generalizing to New Classes at Near Zero Cost. ( 0,807093256712571 )
Neural Comput - Metacognitive learning in a fully complex-valued radial basis function neural network. ( 0,800419522755764 )
IEEE Trans Image Process - Active learning for solving the incomplete data problem in facial age classification by the furthest nearest-neighbor criterion. ( 0,800284030900773 )
Int J Neural Syst - Aggregation of sparse linear discriminant analyses for event-related potential classification in brain-computer interface. ( 0,796079318304588 )
IEEE Trans Image Process - Structured max-margin learning for inter-related classifier training and multilabel image annotation. ( 0,794977665890459 )
Neural Comput - Multiple spectral kernel learning and a gaussian complexity computation. ( 0,794245180367981 )
Int J Neural Syst - Linear time relational prototype based learning. ( 0,791683865238103 )
IEEE Trans Image Process - Joint segmentation of images and scanned point cloud in large-scale street scenes with low-annotation cost. ( 0,791447990827234 )
IEEE Trans Pattern Anal Mach Intell - The Effect of Model Misspecification on Semi-Supervised Classification. ( 0,789987109993528 )
AMIA Annu Symp Proc - Comparison and combination of several MeSH indexing approaches. ( 0,786430491912297 )
Int J Neural Syst - Structurally enhanced incremental neural learning for image classification with subgraph extraction. ( 0,784392959138241 )
IEEE Trans Image Process - Improving Web image search by bag-based reranking. ( 0,783185258846212 )
Int J Neural Syst - Online semi-supervised growing neural gas. ( 0,782767719423583 )
Comput. Biol. Med. - Robust prediction of protein subcellular localization combining PCA and WSVMs. ( 0,782452776699072 )
J Biomed Inform - Class proximity measures--dissimilarity-based classification and display of high-dimensional data. ( 0,780205202160089 )
J Biomed Inform - Learning classification models from multiple experts. ( 0,779094284674428 )
IEEE Trans Image Process - Unsupervised amplitude and texture classification of SAR images with multinomial latent model. ( 0,778271608102691 )
IEEE Trans Pattern Anal Mach Intell - Facial Age Estimation by Learning from Label Distributions. ( 0,777944610924902 )
J Am Med Inform Assoc - Active learning for clinical text classification: is it better than random sampling? ( 0,777025073633358 )
IEEE Trans Neural Netw Learn Syst - Adaptive Batch Mode Active Learning. ( 0,776214743827733 )
Comput. Biol. Med. - Sparse Manifold Clustering and Embedding to discriminate gene expression profiles of glioblastoma and meningioma tumors. ( 0,775115692146073 )
Neural Comput - Representing objects, relations, and sequences. ( 0,773051083610175 )
Neural Comput - Online learning with (multiple) kernels: a review. ( 0,76981450085518 )
BMC Med Inform Decis Mak - Learning to improve medical decision making from imbalanced data without a priori cost. ( 0,768560004667172 )
IEEE Trans Pattern Anal Mach Intell - Weakly Supervised Recognition of Daily Life Activities with Wearable Sensors. ( 0,764681710540773 )
J Biomed Inform - Multi-label classification of chronically ill patients with bag of words and supervised dimensionality reduction algorithms. ( 0,757310756458978 )
IEEE Trans Image Process - Incremental training of a detector using online sparse eigendecomposition. ( 0,757222150272106 )
J Biomed Inform - Incremental Gaussian Discriminant Analysis based on Graybill and Deal weighted combination of estimators for brain tumour diagnosis. ( 0,756793601370138 )
J Biomed Inform - Portable automatic text classification for adverse drug reaction detection via multi-corpus training. ( 0,756494508002641 )
IEEE Trans Image Process - Self-supervised online metric learning with low rank constraint for scene categorization. ( 0,754865421174359 )
IEEE Trans Pattern Anal Mach Intell - Representation Learning: A Review and New Perspectives. ( 0,753803662999457 )
IEEE Trans Pattern Anal Mach Intell - Feature Selection with Conjunctions of Decision Stumps and Learning from Microarray Data. ( 0,753673705072505 )
IEEE Trans Pattern Anal Mach Intell - Label Consistent K-SVD: Learning A Discriminative Dictionary for Recognition. ( 0,752592073461436 )
Neural Comput - Divergence-based vector quantization. ( 0,752311536900241 )
IEEE Trans Image Process - Saliency and gist features for target detection in satellite images. ( 0,745878941636607 )
J Chem Inf Model - Note on naive Bayes based on binary descriptors in cheminformatics. ( 0,743871663160344 )
J. Comput. Biol. - Locally learning biomedical data using diffusion frames. ( 0,739919403404358 )
IEEE Trans Pattern Anal Mach Intell - Learning Categories from Few Examples with Multi Model Knowledge Transfer. ( 0,739555811066259 )
J Chem Inf Model - Training based on ligand efficiency improves prediction of bioactivities of ligands and drug target proteins in a machine learning approach. ( 0,736929202558674 )
IEEE Trans Neural Netw Learn Syst - ML-Tree: a tree-structure-based approach to multilabel learning. ( 0,735921970572994 )
Comput Methods Programs Biomed - Multistage approach for clustering and classification of ECG data. ( 0,731850301254487 )
Comput Methods Programs Biomed - Biomedical system based on the Discrete Hidden Markov Model using the Rocchio-Genetic approach for the classification of internal carotid artery Doppler signals. ( 0,730069702451584 )
Comput Math Methods Med - Correlation kernels for support vector machines classification with applications in cancer data. ( 0,727942416023433 )
Artif Intell Med - A classifier ensemble approach for the missing feature problem. ( 0,722402840854527 )
J Biomed Inform - Applying active learning to assertion classification of concepts in clinical text. ( 0,718688996965358 )
J Chem Inf Model - Atom environment kernels on molecules. ( 0,715675071113709 )
IEEE Trans Image Process - Artistic image analysis using graph-based learning approaches. ( 0,715567370664782 )
IEEE Trans Image Process - Design of non-linear kernel dictionaries for object recognition. ( 0,714731341382546 )
Neural Comput - Mismatched training and test distributions can outperform matched ones. ( 0,712753332120219 )
IEEE Trans Image Process - Cross-Device Automated Prostate Cancer Localization With Multiparametric MRI. ( 0,712332807937898 )
Neural Comput - Unsupervised learning of generative and discriminative weights encoding elementary image components in a predictive coding model of cortical function. ( 0,71123107600177 )
Comput Methods Programs Biomed - Modified CC-LR algorithm with three diverse feature sets for motor imagery tasks classification in EEG based brain-computer interface. ( 0,710715877635276 )
IEEE Trans Neural Netw Learn Syst - An efficient topological distance-based tree kernel. ( 0,709272582604582 )
Int J Comput Assist Radiol Surg - Statistical shape model of a liver for autopsy imaging. ( 0,707236583745488 )
IEEE Trans Pattern Anal Mach Intell - Exemplar-Based Colour Constancy and Multiple Illumination. ( 0,705886337383979 )
IEEE Trans Image Process - Learning discriminative dictionary for group sparse representation. ( 0,7041596148914 )
IEEE Trans Pattern Anal Mach Intell - A Bag-of-Features Framework to Classify Time Series. ( 0,704004620273743 )
IEEE Trans Pattern Anal Mach Intell - Feature Selection and Kernel Learning for Local Learning-Based Clustering. ( 0,702372818098 )
AMIA Annu Symp Proc - Classification of medication status change in clinical narratives. ( 0,701512609840781 )
IEEE Trans Image Process - Multiple-kernel, multiple-instance similarity features for efficient visual object detection. ( 0,69764759393673 )
J Biomed Inform - Supervised methods for symptom name recognition in free-text clinical records of traditional Chinese medicine: an empirical study. ( 0,695197023156916 )
Neural Comput - Adaptive multiclass classification for brain computer interfaces. ( 0,695039312057911 )
AMIA Annu Symp Proc - Outlier Detection with One-Class SVMs: An Application to Melanoma Prognosis. ( 0,692799198243315 )
J Chem Inf Model - Modeling and benchmark data set for the inhibition of c-Jun N-terminal kinase-3. ( 0,692739143843731 )
Med Decis Making - The Impact of Oversampling with SMOTE on the Performance of 3 Classifiers in Prediction of Type 2 Diabetes. ( 0,692515203048151 )
IEEE Trans Image Process - Supervised ordering in IRp: application to morphological processing of hyperspectral images. ( 0,691783686221566 )
IEEE Trans Pattern Anal Mach Intell - Scene-Specific Pedestrian Detection for Static Video Surveillance. ( 0,689913895563387 )
Neural Comput - Enhanced gradient for training restricted Boltzmann machines. ( 0,68979167061907 )
AMIA Annu Symp Proc - Sample-efficient learning with auxiliary class-label information. ( 0,688270807098863 )
Comput. Biol. Med. - EEG-based emotion estimation using Bayesian weighted-log-posterior function and perceptron convergence algorithm. ( 0,685545131519821 )
IEEE Trans Image Process - Contextual kernel and spectral methods for learning the semantics of images. ( 0,684478681020934 )
Comput Methods Programs Biomed - Comparison of machine learning methods for classifying aphasic and non-aphasic speakers. ( 0,68149742190899 )
IEEE Trans Neural Netw Learn Syst - Partially shared latent factor learning with multiview data. ( 0,675679958854109 )
J Chem Inf Model - Classifying large chemical data sets: using a regularized potential function method. ( 0,674457514678008 )
IEEE Trans Neural Netw Learn Syst - An Experimentation Platform for On-Chip Integration of Analog Neural Networks: A Pathway to Trusted and Robust Analog/RF ICs. ( 0,673017014526355 )
BMC Med Inform Decis Mak - Towards case-based medical learning in radiological decision making using content-based image retrieval. ( 0,665128827114354 )
AMIA Annu Symp Proc - Learning medical diagnosis models from multiple experts. ( 0,664064905114933 )
Neural Comput - Incremental learning by message passing in hierarchical temporal memory. ( 0,661443287932583 )
J. Comput. Biol. - The irredundant class method for remote homology detection of protein sequences. ( 0,658160610785939 )
Artif Intell Med - Screening nonrandomized studies for medical systematic reviews: a comparative study of classifiers. ( 0,656933370329838 )
J Chem Inf Model - An unbiased method to build benchmarking sets for ligand-based virtual screening and its application to GPCRs. ( 0,655823118360543 )
IEEE Trans Image Process - Real-time object tracking via online discriminative feature selection. ( 0,65578809020658 )
IEEE Trans Pattern Anal Mach Intell - Trainable Convolution Filters and Their Application to Face Recognition. ( 0,655372369814475 )
IEEE Trans Neural Netw Learn Syst - Online Sequential Extreme Learning Machine With Kernels. ( 0,652868387904873 )