J Chem Inf Model - Training based on ligand efficiency improves prediction of bioactivities of ligands and drug target proteins in a machine learning approach.

Topics

{ learn(2355) train(1041) set(1003) }
{ model(2341) predict(2261) use(1141) }
{ high(1669) rate(1365) level(1280) }
{ featur(3375) classif(2383) classifi(1994) }
{ studi(1119) effect(1106) posit(819) }
{ analysi(2126) use(1163) compon(1037) }
{ general(901) number(790) one(736) }
{ bind(1733) structur(1185) ligand(1036) }
{ search(2224) databas(1162) retriev(909) }
{ data(3963) clinic(1234) research(1004) }
{ compound(1573) activ(1297) structur(1058) }
{ data(1737) use(1416) pattern(1282) }
{ control(1307) perform(991) simul(935) }
{ state(1844) use(1261) util(961) }
{ detect(2391) sensit(1101) algorithm(908) }
{ imag(1057) registr(996) error(939) }
{ method(1219) similar(1157) match(930) }
{ motion(1329) object(1292) video(1091) }
{ assess(1506) score(1403) qualiti(1306) }
{ problem(2511) optim(1539) algorithm(950) }
{ import(1318) role(1303) understand(862) }
{ group(2977) signific(1463) compar(1072) }
{ health(1844) social(1437) communiti(874) }
{ structur(1116) can(940) graph(676) }
{ drug(1928) target(777) effect(648) }
{ method(2212) result(1239) propos(1039) }
{ sequenc(1873) structur(1644) protein(1328) }
{ take(945) account(800) differ(722) }
{ chang(1828) time(1643) increas(1301) }
{ method(1557) propos(1049) approach(1037) }
{ design(1359) user(1324) use(1319) }
{ perform(999) metric(946) measur(919) }
{ spatial(1525) area(1432) region(1030) }
{ signal(2180) analysi(812) frequenc(800) }
{ gene(2352) biolog(1181) express(1162) }
{ use(1733) differ(960) four(931) }
{ result(1111) use(1088) new(759) }
{ survey(1388) particip(1329) question(1065) }
{ process(1125) use(805) approach(778) }
{ model(3404) distribut(989) bayesian(671) }
{ can(774) often(719) complex(702) }
{ imag(1947) propos(1133) code(1026) }
{ inform(2794) health(2639) internet(1427) }
{ system(1976) rule(880) can(841) }
{ measur(2081) correl(1212) valu(896) }
{ imag(2830) propos(1344) filter(1198) }
{ network(2748) neural(1063) input(814) }
{ imag(2675) segment(2577) method(1081) }
{ patient(2315) diseas(1263) diabet(1191) }
{ studi(2440) review(1878) systemat(933) }
{ treatment(1704) effect(941) patient(846) }
{ surgeri(1148) surgic(1085) robot(1054) }
{ framework(1458) process(801) describ(734) }
{ error(1145) method(1030) estim(1020) }
{ concept(1167) ontolog(924) domain(897) }
{ clinic(1479) use(1117) guidelin(835) }
{ algorithm(1844) comput(1787) effici(935) }
{ extract(1171) text(1153) clinic(932) }
{ data(1714) softwar(1251) tool(1186) }
{ model(2220) cell(1177) simul(1124) }
{ care(1570) inform(1187) nurs(1089) }
{ method(984) reconstruct(947) comput(926) }
{ featur(1941) imag(1645) propos(1176) }
{ case(1353) use(1143) diagnosi(1136) }
{ howev(809) still(633) remain(590) }
{ studi(1410) differ(1259) use(1210) }
{ risk(3053) factor(974) diseas(938) }
{ research(1085) discuss(1038) issu(1018) }
{ system(1050) medic(1026) inform(1018) }
{ visual(1396) interact(850) tool(830) }
{ perform(1367) use(1326) method(1137) }
{ blood(1257) pressur(1144) flow(957) }
{ record(1888) medic(1808) patient(1693) }
{ health(3367) inform(1360) care(1135) }
{ model(3480) simul(1196) paramet(876) }
{ monitor(1329) mobil(1314) devic(1160) }
{ ehr(2073) health(1662) electron(1139) }
{ research(1218) medic(880) student(794) }
{ patient(2837) hospit(1953) medic(668) }
{ model(2656) set(1616) predict(1553) }
{ data(2317) use(1299) case(1017) }
{ age(1611) year(1155) adult(843) }
{ medic(1828) order(1363) alert(1069) }
{ cost(1906) reduc(1198) effect(832) }
{ sampl(1606) size(1419) use(1276) }
{ data(3008) multipl(1320) sourc(1022) }
{ first(2504) two(1366) second(1323) }
{ intervent(3218) particip(2042) group(1664) }
{ activ(1138) subject(705) human(624) }
{ time(1939) patient(1703) rate(768) }
{ patient(1821) servic(1111) care(1106) }
{ use(2086) technolog(871) perceiv(783) }
{ can(981) present(881) function(850) }
{ cancer(2502) breast(956) screen(824) }
{ use(976) code(926) identifi(902) }
{ implement(1333) system(1263) develop(1122) }
{ estim(2440) model(1874) function(577) }
{ decis(3086) make(1611) patient(1517) }
{ activ(1452) weight(1219) physic(1104) }
{ method(1969) cluster(1462) data(1082) }

Abstract

Machine learning methods trained on ligand-protein interaction data from bioactivity databases are one of the current strategies for efficiently finding novel lead compounds, the first step in the drug discovery process. Although previous machine learning studies have predicted novel ligand-protein interactions with high performance, all of them have depended heavily on the direct use of raw bioactivity data, that is, ligand potencies measured as IC50, EC50, K(i), and K(d) values deposited in databases. ChEMBL provides a unique opportunity to investigate whether a machine-learning-based classifier built on ligand efficiency, rather than on IC50, EC50, K(i), and K(d) values, can also offer high predictive performance. Here we report that classifiers created from training data based on ligand efficiency show higher performance than those created from data based on IC50 or K(i) values. Using the GPCRSARfari and KinaseSARfari databases in ChEMBL, we created IC50- or K(i)-based training data and binding efficiency index (BEI)-based training data, and then constructed classifiers using support vector machines (SVMs). In the cross-validation tests, the SVM classifiers from the BEI-based training data showed slightly higher area under the curve (AUC), accuracy, sensitivity, and specificity. When the classifiers were applied to the validation data, the AUCs and specificities of the BEI-based classifiers increased dramatically in comparison with those of the IC50- or K(i)-based classifiers. The improvement in predictive power of the BEI-based classifiers can be attributed to (i) more separated distributions of positives and negatives, (ii) a higher diversity of negatives in the BEI-based training data in the SVM feature space, and (iii) a more balanced number of positives and negatives in the BEI-based training data. These results strongly suggest that training data based on ligand efficiency, as well as data based on classical IC50, EC50, K(d), and K(i) values, are important when creating a classifier using a machine learning approach based on bioactivity data.
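The abstract does not spell out how BEI is computed. The sketch below uses the commonly cited definition (the negative log10 of the molar potency divided by the molecular weight in kDa) to show how BEI can reorder ligands that raw potency cannot tell apart. The function names and the example ligands are illustrative assumptions, not taken from the paper, and the thresholds the authors used to label positives and negatives are not given in the abstract, so none are shown.

```python
import math


def p_activity(potency_nm: float) -> float:
    """Convert a potency (IC50 or K(i)) in nM to -log10 of the molar value."""
    if potency_nm <= 0:
        raise ValueError("potency must be positive")
    # 1 nM = 1e-9 M, so -log10(x * 1e-9) = 9 - log10(x)
    return 9.0 - math.log10(potency_nm)


def binding_efficiency_index(potency_nm: float, mol_weight_da: float) -> float:
    """BEI under the assumed definition: p(activity) / molecular weight in kDa."""
    if mol_weight_da <= 0:
        raise ValueError("molecular weight must be positive")
    return p_activity(potency_nm) / (mol_weight_da / 1000.0)


# Two hypothetical ligands with identical potency but different size.
small = binding_efficiency_index(potency_nm=100.0, mol_weight_da=250.0)
large = binding_efficiency_index(potency_nm=100.0, mol_weight_da=500.0)
print(small, large)
```

Both hypothetical ligands have the same pIC50 of 7, so a potency-based labeling scheme would treat them identically, while BEI scores the smaller ligand twice as high. This illustrates why the distribution of positives and negatives can differ between the two kinds of training data.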

Cleaned Abstract

machin learn method base ligandprotein interact data bioactiv databas one current strategi effici find novel lead compound first step drug discoveri process although previous machin learn studi succeed predict novel ligandprotein interact high perform previous studi date heavili depend simpl use raw bioactiv data ligand potenc measur ic ec ki kd deposit databas chembl provid us uniqu opportun investig whether machinelearningbas classifi creat reflect ligand effici ic ec ki kd valu can also offer high predict perform report classifi creat train data base ligand effici show higher perform data base ic ki valu util gpcrsarfari kinasesarfari databas chembl creat ic kibas train data bind effici index bei base train data construct classifi use support vector machin svms svm classifi beibas train data show slight higher area curv auc accuraci sensit specif crossvalid test applic classifi valid data demonstr auc specif beibas classifi dramat increas comparison ic kibas classifi improv predict power beibas classifi can attribut separ distribut posit negat ii higher divers negat beibas train data featur space svms iii balanc number posit negat beibas train data result strong suggest train data base ligand effici well data base classic ic ec kd ki valu import creat classifi use machin learn approach base bioactiv data

Similar Abstracts

AMIA Annu Symp Proc - Outlier Detection with One-Class SVMs: An Application to Melanoma Prognosis. ( 0,804022316122973 )
Med Decis Making - The Impact of Oversampling with SMOTE on the Performance of 3 Classifiers in Prediction of Type 2 Diabetes. ( 0,800277386932 )
J Am Med Inform Assoc - Supervised embedding of textual predictors with applications in clinical diagnostics for pediatric cardiology. ( 0,797396610770383 )
J Am Med Inform Assoc - Learning classification models with soft-label information. ( 0,778423556350088 )
Comput Math Methods Med - Correlation kernels for support vector machines classification with applications in cancer data. ( 0,777909502793258 )
J Med Syst - 3D similarity-dissimilarity plot for high dimensional data visualization in the context of biomedical pattern classification. ( 0,773027490133549 )
Comput Math Methods Med - On multilabel classification methods of incompletely labeled biomedical text data. ( 0,771690207025848 )
Comput. Biol. Med. - Robust prediction of protein subcellular localization combining PCA and WSVMs. ( 0,759895526143269 )
J Biomed Inform - Semi-supervised clinical text classification with Laplacian SVMs: an application to cancer case management. ( 0,752046331009272 )
IEEE J Biomed Health Inform - Systematic Poisoning Attacks on and Defenses for Machine Learning in Healthcare. ( 0,747268560121959 )
IEEE Trans Neural Netw Learn Syst - A Kernel Classification Framework for Metric Learning. ( 0,744808873696383 )
IEEE Trans Image Process - Manifold regularized multitask learning for semi-supervised multilabel image classification. ( 0,743799991340637 )
Neural Comput - Adaptive metric learning vector quantization for ordinal classification. ( 0,743049827253287 )
J Chem Inf Model - Pragmatic approaches to using computational methods to predict xenobiotic metabolism. ( 0,742708080203627 )
IEEE Trans Image Process - Task-specific image partitioning. ( 0,740966619375625 )
IEEE Trans Image Process - Unsupervised amplitude and texture classification of SAR images with multinomial latent model. ( 0,73904798790976 )
J. Comput. Biol. - Imbalanced class learning in epigenetics. ( 0,736929202558674 )
AMIA Annu Symp Proc - Learning medical diagnosis models from multiple experts. ( 0,736667305917682 )
J Am Med Inform Assoc - Active learning for clinical text classification: is it better than random sampling? ( 0,732726895439945 )
Artif Intell Med - Prediction of human major histocompatibility complex class II binding peptides by continuous kernel discrimination method. ( 0,727328462480172 )
IEEE Trans Image Process - A linear support higher-order tensor machine for classification. ( 0,726220794463554 )
Neural Comput - Reduction from cost-sensitive ordinal ranking to weighted binary classification. ( 0,72436661820143 )
IEEE Trans Image Process - Design of non-linear kernel dictionaries for object recognition. ( 0,723802455892727 )
Neural Comput - Online learning with (multiple) kernels: a review. ( 0,723578620663296 )
Comput Methods Programs Biomed - Modified CC-LR algorithm with three diverse feature sets for motor imagery tasks classification in EEG based brain-computer interface. ( 0,7225210105869 )
IEEE Trans Pattern Anal Mach Intell - Feature Selection with Conjunctions of Decision Stumps and Learning from Microarray Data. ( 0,721008654560828 )
Neural Comput - Computing sparse representations of multidimensional signals using Kronecker bases. ( 0,718616322470414 )
IEEE Trans Neural Netw Learn Syst - ML-Tree: a tree-structure-based approach to multilabel learning. ( 0,71787774425024 )
Neural Comput - Divergence-based vector quantization. ( 0,717065391004755 )
Int J Neural Syst - Structurally enhanced incremental neural learning for image classification with subgraph extraction. ( 0,715869384572314 )
IEEE Trans Pattern Anal Mach Intell - Weakly Supervised Recognition of Daily Life Activities with Wearable Sensors. ( 0,715257890472354 )
IEEE Trans Pattern Anal Mach Intell - Representation Learning: A Review and New Perspectives. ( 0,71410421998434 )
J Biomed Inform - Applying active learning to assertion classification of concepts in clinical text. ( 0,712043935199871 )
J Chem Inf Model - Classifying large chemical data sets: using a regularized potential function method. ( 0,709194891918418 )
IEEE Trans Image Process - Improving Web image search by bag-based reranking. ( 0,708605371689256 )
Int J Neural Syst - Online semi-supervised growing neural gas. ( 0,70655467078283 )
Int J Neural Syst - Aggregation of sparse linear discriminant analyses for event-related potential classification in brain-computer interface. ( 0,706104200526443 )
Comput Methods Programs Biomed - Machine learning algorithms and forced oscillation measurements applied to the automatic identification of chronic obstructive pulmonary disease. ( 0,704911196193904 )
IEEE Trans Image Process - Saliency and gist features for target detection in satellite images. ( 0,701332420832848 )
Neural Comput - Metacognitive learning in a fully complex-valued radial basis function neural network. ( 0,701310202674683 )
IEEE Trans Image Process - Geodesic propagation for semantic labeling. ( 0,699397789161602 )
J Am Med Inform Assoc - Applying active learning to supervised word sense disambiguation in MEDLINE. ( 0,699074272118114 )
Int J Neural Syst - Span: spike pattern association neuron for learning spatio-temporal spike patterns. ( 0,697555742603861 )
Neural Comput - Extended robust support vector machine based on financial risk minimization. ( 0,694583030124248 )
IEEE Trans Pattern Anal Mach Intell - Distance-Based Image Classification: Generalizing to New Classes at Near Zero Cost. ( 0,694388496401754 )
IEEE Trans Image Process - Multiview Hessian regularization for image annotation. ( 0,694202688369191 )
Artif Intell Med - Machine learning of clinical performance in a pancreatic cancer database. ( 0,691012225266561 )
Neural Comput - Multiple spectral kernel learning and a gaussian complexity computation. ( 0,689847973093304 )
IEEE Trans Image Process - Structured max-margin learning for inter-related classifier training and multilabel image annotation. ( 0,689550690256438 )
IEEE J Biomed Health Inform - Supervised hierarchical Bayesian model-based electomyographic control and analysis. ( 0,688334162979956 )
Comput. Biol. Med. - Sparse Manifold Clustering and Embedding to discriminate gene expression profiles of glioblastoma and meningioma tumors. ( 0,686320154359931 )
J Biomed Inform - Class proximity measures--dissimilarity-based classification and display of high-dimensional data. ( 0,685970774200007 )
AMIA Annu Symp Proc - Comparison and combination of several MeSH indexing approaches. ( 0,680850359524656 )
IEEE Trans Image Process - Multiple-kernel, multiple-instance similarity features for efficient visual object detection. ( 0,680799118839153 )
J Biomed Inform - Portable automatic text classification for adverse drug reaction detection via multi-corpus training. ( 0,676447451692442 )
J Biomed Inform - Learning classification models from multiple experts. ( 0,675929092331001 )
BMC Med Inform Decis Mak - Learning to improve medical decision making from imbalanced data without a priori cost. ( 0,673865884667208 )
IEEE Trans Image Process - Hyperspectral image classification through bilayer graph-based learning. ( 0,670938285369701 )
Int J Med Inform - An exploratory study of a text classification framework for Internet-based surveillance of emerging epidemics. ( 0,668629080897507 )
IEEE Trans Image Process - Active learning for solving the incomplete data problem in facial age classification by the furthest nearest-neighbor criterion. ( 0,668190682711503 )
Neural Comput - Unsupervised learning of generative and discriminative weights encoding elementary image components in a predictive coding model of cortical function. ( 0,667267908374374 )
J Biomed Inform - Incremental Gaussian Discriminant Analysis based on Graybill and Deal weighted combination of estimators for brain tumour diagnosis. ( 0,666633928859865 )
J Chem Inf Model - Atom environment kernels on molecules. ( 0,664950771294858 )
Neural Comput - Representing objects, relations, and sequences. ( 0,664448647346081 )
Int J Neural Syst - Linear time relational prototype based learning. ( 0,664209935480005 )
IEEE Trans Image Process - Incremental training of a detector using online sparse eigendecomposition. ( 0,663769140872182 )
IEEE Trans Image Process - Joint segmentation of images and scanned point cloud in large-scale street scenes with low-annotation cost. ( 0,661375622176906 )
IEEE Trans Pattern Anal Mach Intell - Learning Categories from Few Examples with Multi Model Knowledge Transfer. ( 0,661282436036973 )
IEEE Trans Image Process - Artistic image analysis using graph-based learning approaches. ( 0,659302620648175 )
J Med Syst - A new approach: role of data mining in prediction of survival of burn patients. ( 0,657993811482032 )
IEEE Trans Neural Netw Learn Syst - Adaptive Batch Mode Active Learning. ( 0,656461213420179 )
Neural Comput - Blocked 3×2 cross-validated t-test for comparing supervised classification learning algorithms. ( 0,652697817362895 )
J Biomed Inform - Multi-label classification of chronically ill patients with bag of words and supervised dimensionality reduction algorithms. ( 0,650690006006158 )
J Chem Inf Model - Enhancing the accuracy of chemogenomic models with a three-dimensional binding site kernel. ( 0,649410458596096 )
IEEE Trans Pattern Anal Mach Intell - Label Consistent K-SVD: Learning A Discriminative Dictionary for Recognition. ( 0,648571095869646 )
J Chem Inf Model - Note on naive Bayes based on binary descriptors in cheminformatics. ( 0,64767199832776 )
Artif Intell Med - A classifier ensemble approach for the missing feature problem. ( 0,64714622948449 )
J Biomed Inform - Classification of CT pulmonary angiography reports by presence, chronicity, and location of pulmonary embolism with natural language processing. ( 0,646111788145657 )
IEEE Trans Image Process - Learning discriminative dictionary for group sparse representation. ( 0,645202260294197 )
AMIA Annu Symp Proc - Improving predictions in imbalanced data using Pairwise Expanded Logistic Regression. ( 0,644273298694428 )
J Am Med Inform Assoc - Machine learning for predicting the response of breast cancer to neoadjuvant chemotherapy. ( 0,641520509552677 )
IEEE Trans Neural Netw Learn Syst - Partially shared latent factor learning with multiview data. ( 0,641073941992324 )
J Chem Inf Model - Are bigger data sets better for machine learning? Fusing single-point and dual-event dose response data for Mycobacterium tuberculosis. ( 0,6399622587225 )
Int J Med Inform - Where should electronic records for patients be stored? ( 0,639694462180342 )
Methods Inf Med - Investigating recurrent neural networks for OCT A-scan based tissue analysis. ( 0,639153473816523 )
IEEE Trans Neural Netw Learn Syst - An efficient topological distance-based tree kernel. ( 0,638511864685033 )
IEEE Trans Neural Netw Learn Syst - Hyperparameter Selection for Gaussian Process One-Class Classification. ( 0,637679069375715 )
Neural Comput - Feature selection for ordinal text classification. ( 0,637529135236796 )
J Chem Inf Model - Modeling and benchmark data set for the inhibition of c-Jun N-terminal kinase-3. ( 0,637365000916472 )
Int J Comput Assist Radiol Surg - Statistical shape model of a liver for autopsy imaging. ( 0,6335817260814 )
Artif Intell Med - A fuzzy-based data transformation for feature extraction to increase classification performance with small medical data sets. ( 0,633005821670728 )
J Am Med Inform Assoc - Predicting complications of percutaneous coronary intervention using a novel support vector method. ( 0,632281365161507 )
Comput Methods Programs Biomed - Biomedical system based on the Discrete Hidden Markov Model using the Rocchio-Genetic approach for the classification of internal carotid artery Doppler signals. ( 0,630436548930175 )
Artif Intell Med - Screening nonrandomized studies for medical systematic reviews: a comparative study of classifiers. ( 0,630325256061614 )
IEEE J Biomed Health Inform - Service-oriented medical system for supporting decisions with missing and imbalanced data. ( 0,629550857314064 )
IEEE Trans Pattern Anal Mach Intell - The Effect of Model Misspecification on Semi-Supervised Classification. ( 0,625107702729577 )
IEEE Trans Pattern Anal Mach Intell - Facial Age Estimation by Learning from Label Distributions. ( 0,625038192138687 )
IEEE Trans Pattern Anal Mach Intell - A Bag-of-Features Framework to Classify Time Series. ( 0,62150183237232 )
IEEE Trans Image Process - Contextual kernel and spectral methods for learning the semantics of images. ( 0,614733718866442 )
Comput Methods Programs Biomed - Multistage approach for clustering and classification of ECG data. ( 0,614297693283292 )