J Chem Inf Model - Beyond the scope of Free-Wilson analysis: building interpretable QSAR models with machine learning algorithms.

Topics

{ model(2656) set(1616) predict(1553) }
{ analysi(2126) use(1163) compon(1037) }
{ featur(3375) classif(2383) classifi(1994) }
{ can(981) present(881) function(850) }
{ perform(1367) use(1326) method(1137) }
{ method(2212) result(1239) propos(1039) }
{ measur(2081) correl(1212) valu(896) }
{ method(1557) propos(1049) approach(1037) }
{ case(1353) use(1143) diagnosi(1136) }
{ data(3008) multipl(1320) sourc(1022) }
{ method(1219) similar(1157) match(930) }
{ imag(2830) propos(1344) filter(1198) }
{ imag(2675) segment(2577) method(1081) }
{ general(901) number(790) one(736) }
{ compound(1573) activ(1297) structur(1058) }
{ studi(1119) effect(1106) posit(819) }
{ drug(1928) target(777) effect(648) }
{ model(3404) distribut(989) bayesian(671) }
{ treatment(1704) effect(941) patient(846) }
{ framework(1458) process(801) describ(734) }
{ error(1145) method(1030) estim(1020) }
{ method(984) reconstruct(947) comput(926) }
{ research(1085) discuss(1038) issu(1018) }
{ research(1218) medic(880) student(794) }
{ age(1611) year(1155) adult(843) }
{ cost(1906) reduc(1198) effect(832) }
{ use(2086) technolog(871) perceiv(783) }
{ structur(1116) can(940) graph(676) }
{ high(1669) rate(1365) level(1280) }
{ use(1733) differ(960) four(931) }
{ decis(3086) make(1611) patient(1517) }
{ can(774) often(719) complex(702) }
{ imag(1947) propos(1133) code(1026) }
{ data(1737) use(1416) pattern(1282) }
{ inform(2794) health(2639) internet(1427) }
{ system(1976) rule(880) can(841) }
{ imag(1057) registr(996) error(939) }
{ bind(1733) structur(1185) ligand(1036) }
{ sequenc(1873) structur(1644) protein(1328) }
{ network(2748) neural(1063) input(814) }
{ patient(2315) diseas(1263) diabet(1191) }
{ take(945) account(800) differ(722) }
{ studi(2440) review(1878) systemat(933) }
{ motion(1329) object(1292) video(1091) }
{ assess(1506) score(1403) qualiti(1306) }
{ surgeri(1148) surgic(1085) robot(1054) }
{ problem(2511) optim(1539) algorithm(950) }
{ chang(1828) time(1643) increas(1301) }
{ learn(2355) train(1041) set(1003) }
{ concept(1167) ontolog(924) domain(897) }
{ clinic(1479) use(1117) guidelin(835) }
{ algorithm(1844) comput(1787) effici(935) }
{ extract(1171) text(1153) clinic(932) }
{ data(1714) softwar(1251) tool(1186) }
{ design(1359) user(1324) use(1319) }
{ control(1307) perform(991) simul(935) }
{ model(2220) cell(1177) simul(1124) }
{ care(1570) inform(1187) nurs(1089) }
{ search(2224) databas(1162) retriev(909) }
{ featur(1941) imag(1645) propos(1176) }
{ howev(809) still(633) remain(590) }
{ data(3963) clinic(1234) research(1004) }
{ studi(1410) differ(1259) use(1210) }
{ risk(3053) factor(974) diseas(938) }
{ perform(999) metric(946) measur(919) }
{ system(1050) medic(1026) inform(1018) }
{ import(1318) role(1303) understand(862) }
{ model(2341) predict(2261) use(1141) }
{ visual(1396) interact(850) tool(830) }
{ blood(1257) pressur(1144) flow(957) }
{ spatial(1525) area(1432) region(1030) }
{ record(1888) medic(1808) patient(1693) }
{ health(3367) inform(1360) care(1135) }
{ model(3480) simul(1196) paramet(876) }
{ monitor(1329) mobil(1314) devic(1160) }
{ ehr(2073) health(1662) electron(1139) }
{ state(1844) use(1261) util(961) }
{ patient(2837) hospit(1953) medic(668) }
{ data(2317) use(1299) case(1017) }
{ medic(1828) order(1363) alert(1069) }
{ signal(2180) analysi(812) frequenc(800) }
{ group(2977) signific(1463) compar(1072) }
{ sampl(1606) size(1419) use(1276) }
{ gene(2352) biolog(1181) express(1162) }
{ first(2504) two(1366) second(1323) }
{ intervent(3218) particip(2042) group(1664) }
{ activ(1138) subject(705) human(624) }
{ time(1939) patient(1703) rate(768) }
{ patient(1821) servic(1111) care(1106) }
{ health(1844) social(1437) communiti(874) }
{ cancer(2502) breast(956) screen(824) }
{ use(976) code(926) identifi(902) }
{ result(1111) use(1088) new(759) }
{ implement(1333) system(1263) develop(1122) }
{ survey(1388) particip(1329) question(1065) }
{ estim(2440) model(1874) function(577) }
{ process(1125) use(805) approach(778) }
{ activ(1452) weight(1219) physic(1104) }
{ method(1969) cluster(1462) data(1082) }
{ detect(2391) sensit(1101) algorithm(908) }

Abstract

A novel methodology was developed to build Free-Wilson-like local QSAR models by combining R-group signatures with the SVM algorithm. Unlike Free-Wilson analysis, this method can make predictions for compounds with R-groups not present in the training set. Eleven public data sets were chosen as test cases for comparing the performance of the new method with several traditional modeling strategies, including Free-Wilson analysis. The results show that the R-group signature SVM models generally achieve better prediction accuracy than Free-Wilson analysis. Moreover, the predictions of the R-group signature models are comparable to those of models using ECFP6 fingerprints and signatures for the whole compound. Most importantly, R-group contributions to the SVM model can be obtained by calculating the gradient for R-group signatures. For most of the studied data sets, these contributions correlate significantly with those from the corresponding Free-Wilson analysis. These results suggest that the R-group contributions can be used to interpret bioactivity data, and they highlight that the R-group-signature-based SVM modeling method is as interpretable as Free-Wilson analysis. Hence, the signature SVM model can be a useful modeling tool for drug discovery projects.
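
The gradient step mentioned in the abstract can be illustrated with a small sketch. It is not the authors' implementation: it assumes an RBF-kernel SVM regression model whose decision function is f(x) = b + sum_i alpha_i * exp(-gamma * ||x - s_i||^2), and it assumes that a per-site contribution can be summarized as the sum of gradient times feature value over the features belonging to that R-group site. The support vectors, dual coefficients, bias, gamma, and the feature-to-site layout below are all made-up placeholders.

```python
import math

# Hypothetical layout: each feature counts one signature at one R-group site.
FEATURE_SITES = ["R1", "R1", "R2", "R2"]

# Made-up stand-ins for the parameters of a trained RBF-kernel SVM regressor.
SUPPORT_VECTORS = [
    [1.0, 0.0, 2.0, 0.0],
    [0.0, 1.0, 0.0, 1.0],
    [1.0, 1.0, 1.0, 0.0],
]
DUAL_COEFS = [0.8, -0.5, 0.3]
BIAS = 6.0
GAMMA = 0.1


def rbf_kernel(x, z, gamma=GAMMA):
    """RBF kernel K(x, z) = exp(-gamma * ||x - z||^2)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * sq_dist)


def predict(x):
    """Decision function f(x) = b + sum_i alpha_i * K(s_i, x)."""
    return BIAS + sum(
        alpha * rbf_kernel(sv, x) for sv, alpha in zip(SUPPORT_VECTORS, DUAL_COEFS)
    )


def gradient(x):
    """Analytic gradient: df/dx_j = sum_i alpha_i * K(s_i, x) * 2 * gamma * (s_ij - x_j)."""
    grad = [0.0] * len(x)
    for sv, alpha in zip(SUPPORT_VECTORS, DUAL_COEFS):
        k = rbf_kernel(sv, x)
        for j in range(len(x)):
            grad[j] += alpha * k * 2.0 * GAMMA * (sv[j] - x[j])
    return grad


def site_contributions(x):
    """Assumed aggregation: sum gradient * feature value over each site's features."""
    grad = gradient(x)
    totals = {}
    for j, site in enumerate(FEATURE_SITES):
        totals[site] = totals.get(site, 0.0) + grad[j] * x[j]
    return totals


# A hypothetical query compound: one signature at R1 and one at R2.
query = [1.0, 0.0, 0.0, 1.0]
grad = gradient(query)
contribs = site_contributions(query)
print("predicted activity:", predict(query))
print("per-site contributions:", contribs)
```

Because the gradient is defined for any point in descriptor space, a compound whose R-group was never seen in training still receives a prediction and a contribution estimate, as long as its signature features overlap with those of the training compounds. A Free-Wilson indicator variable for an unseen R-group has no fitted coefficient at all.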

Cleaned Abstract

novel methodolog develop build freewilson like local qsar model combin rgroup signatur svm algorithm unlik freewilson analysi method abl make predict compound rgroup present train set eleven public data set chosen test case compar perform new method sever tradit model strategi includ freewilson analysi result show rgroup signatur svm model achiev better predict accuraci compar freewilson analysi general moreov predict rgroup signatur model also compar model use ecfp fingerprint signatur whole compound import rgroup contribut svm model can obtain calcul gradient rgroup signatur studi data set signific correl correspond freewilson analysi shown result suggest rgroup contribut can use interpret bioactiv data highlight rgroup signatur base svm model method interpret freewilson analysi henc signatur svm model can use model tool drug discoveri project

Similar Abstracts

J Chem Inf Model - Time-split cross-validation as a method for estimating the goodness of prospective prediction. ( 0,860248241288304 )
Artif Intell Med - Training artificial neural networks directly on the concordance index for censored data using genetic algorithms. ( 0,854288570566926 )
J Chem Inf Model - Predicting pK(a) values of substituted phenols from atomic charges: comparison of different quantum mechanical methods and charge distribution schemes. ( 0,853315831389181 )
AMIA Annu Symp Proc - Effect of data combination on predictive modeling: a study using gene expression data. ( 0,846454538634447 )
J Chem Inf Model - Study of chromatographic retention of natural terpenoids by chemoinformatic tools. ( 0,799847791381407 )
J Chem Inf Model - Does rational selection of training and test sets improve the outcome of QSAR modeling? ( 0,798620009510963 )
J Chem Inf Model - iLOGP: a simple, robust, and efficient description of n-octanol/water partition coefficient for drug design using the GB/SA approach. ( 0,797929178854057 )
J Chem Inf Model - RS-Predictor models augmented with SMARTCyp reactivities: robust metabolic regioselectivity predictions for nine CYP isozymes. ( 0,794781150088403 )
J Chem Inf Model - Three useful dimensions for domain applicability in QSAR models using random forest. ( 0,774884998553253 )
Artif Intell Med - Fuzzy model identification of dengue epidemic in Colombia based on multiresolution analysis. ( 0,767032516119485 )
AMIA Annu Symp Proc - Motivating the additional use of external validity: examining transportability in a model of glioblastoma multiforme. ( 0,760482367497503 )
Comput. Biol. Med. - A prediction model of substrates and non-substrates of breast cancer resistance protein (BCRP) developed by GA-CG-SVM method. ( 0,759125277171481 )
J Chem Inf Model - GRID-based three-dimensional pharmacophores II: PharmBench, a benchmark data set for evaluating pharmacophore elucidation methods. ( 0,754086293623331 )
BMC Med Inform Decis Mak - Regression tree construction by bootstrap: model search for DRG-systems applied to Austrian health-data. ( 0,753954983842047 )
Int J Health Geogr - Incorporating geographical factors with artificial neural networks to predict reference values of erythrocyte sedimentation rate. ( 0,746349015808644 )
BMC Med Inform Decis Mak - Concordance and predictive value of two adverse drug event data sets. ( 0,745648795734535 )
J Am Med Inform Assoc - Choosing blindly but wisely: differentially private solicitation of DNA datasets for disease marker discovery. ( 0,734035035405618 )
AMIA Annu Symp Proc - Predicting the dengue incidence in Singapore using univariate time series models. ( 0,728855016248792 )
AMIA Annu Symp Proc - Advanced proficiency EHR training: effect on physicians' EHR efficiency, EHR satisfaction and job satisfaction. ( 0,728719676607727 )
J Chem Inf Model - Applicability domain based on ensemble learning in classification and regression analyses. ( 0,720306161143389 )
J Chem Inf Model - Pharmacophore assessment through 3-D QSAR: evaluation of the predictive ability on new derivatives by the application on a series of antitubercular agents. ( 0,719193357594366 )
J Chem Inf Model - Comparative studies on some metrics for external validation of QSPR models. ( 0,713476878540358 )
J. Med. Internet Res. - A case study of the New York City 2012-2013 influenza season with daily geocoded Twitter data from temporal and spatiotemporal perspectives. ( 0,711698366664964 )
J Chem Inf Model - In silico prediction of aqueous solubility using simple QSPR models: the importance of phenol and phenol-like moieties. ( 0,701047342984968 )
J. Comput. Biol. - The complexity of the Dirichlet model for multiple alignment data. ( 0,699244019281554 )
J Chem Inf Model - In silico prediction of chemical Ames mutagenicity. ( 0,696358349153748 )
Med Biol Eng Comput - Application of the RIMARC algorithm to a large data set of action potentials and clinical parameters for risk prediction of atrial fibrillation. ( 0,690272589768337 )
J Chem Inf Model - Prediction of linear cationic antimicrobial peptides based on characteristics responsible for their interaction with the membranes. ( 0,689670510319455 )
Comput Methods Programs Biomed - A 5-component mathematical model for salt-induced hypertension in Dahl-S and Dahl-R rats. ( 0,687457883555237 )
J Chem Inf Model - A new approach to radial basis function approximation and its application to QSAR. ( 0,685658728373065 )
J Am Med Inform Assoc - Harvest: an open platform for developing web-based biomedical data discovery and reporting applications. ( 0,685060607329224 )
BMC Med Inform Decis Mak - Measuring preferences for analgesic treatment for cancer pain: how do African-Americans and Whites perform on choice-based conjoint (CBC) analysis experiments? ( 0,683534493983714 )
J Chem Inf Model - Rank order entropy: why one metric is not enough. ( 0,68300170914024 )
Comput. Aided Surg. - Evaluation of a computational model to predict elbow range of motion. ( 0,681466275636382 )
J Chem Inf Model - In silico prediction of total human plasma clearance. ( 0,678721596355696 )
J Chem Inf Model - Coping with unbalanced class data sets in oral absorption models. ( 0,676976318029155 )
J Chem Inf Model - Classification of compounds with distinct or overlapping multi-target activities and diverse molecular mechanisms using emerging chemical patterns. ( 0,676410367578085 )
J Chem Inf Model - Statistical analysis and compound selection of combinatorial libraries for soluble epoxide hydrolase. ( 0,675734782713476 )
IEEE Trans Image Process - Neighborhood Supported Model Level Fuzzy Aggregation for Moving Object Segmentation. ( 0,674699812550421 )
J Chem Inf Model - Oversampling to overcome overfitting: exploring the relationship between data set composition, molecular descriptors, and predictive modeling methods. ( 0,669090166911116 )
J Chem Inf Model - Impact of template choice on homology model efficiency in virtual screening. ( 0,668087372774139 )
J Chem Inf Model - Best of both worlds: combining pharma data and state of the art modeling technology to improve in Silico pKa prediction. ( 0,666071915788368 )
J Biomed Inform - Selection of interdependent genes via dynamic relevance analysis for cancer diagnosis. ( 0,665512264646162 )
Comput Methods Programs Biomed - A predictive model of longitudinal, patient-specific colonoscopy results. ( 0,665167057492442 )
Comput. Biol. Med. - Artificial neural network modelling of the results of tympanoplasty in chronic suppurative otitis media patients. ( 0,665113721427831 )
Comput Math Methods Med - Multiscale autoregressive identification of neuroelectrophysiological systems. ( 0,660378809689172 )
J Chem Inf Model - Binary classification of a large collection of environmental chemicals from estrogen receptor assays by quantitative structure-activity relationship and machine learning methods. ( 0,660136486644762 )
J Chem Inf Model - Hsp90 inhibitors, part 1: definition of 3-D QSAutogrid/R models as a tool for virtual screening. ( 0,659188382437403 )
Int J Health Geogr - Comparative analysis of remotely-sensed data products via ecological niche modeling of avian influenza case occurrences in Middle Eastern poultry. ( 0,658480124047504 )
J Chem Inf Model - Criterion for evaluating the predictive ability of nonlinear regression models without cross-validation. ( 0,65242363311269 )
J Biomed Inform - MysiRNA: improving siRNA efficacy prediction using a machine-learning model combining multi-tools and whole stacking energy (ΔG). ( 0,649528164512983 )
Neural Comput - Kernels for longitudinal data with variable sequence length and sampling intervals. ( 0,648722909637445 )
Brief. Bioinformatics - An empirical assessment of validation practices for molecular classifiers. ( 0,648712549138651 )
Med Biol Eng Comput - Validating motor unit firing patterns extracted by EMG signal decomposition. ( 0,645591553935025 )
IEEE Trans Pattern Anal Mach Intell - Specificity: A Graph-Based Estimator of Divergence. ( 0,645026410844625 )
J Chem Inf Model - Development of novel 3D-QSAR combination approach for screening and optimizing B-Raf inhibitors in silico. ( 0,644995332257065 )
J Chem Inf Model - Estimation of carcinogenicity using molecular fragments tree. ( 0,644149060768856 )
Spat Spatiotemporal Epidemiol - Spatial modelling of disease using data- and knowledge-driven approaches. ( 0,641365973648403 )
J Chem Inf Model - Real external predictivity of QSAR models. Part 2. New intercomparable thresholds for different validation criteria and the need for scatter plot inspection. ( 0,639529771014122 )
Int J Comput Assist Radiol Surg - Assessing performance in brain tumor resection using a novel virtual reality simulator. ( 0,639142645019961 )
IEEE Trans Image Process - Incremental N-mode SVD for large-scale multilinear generative models. ( 0,638923473886592 )
Comput Methods Programs Biomed - Kinetic modelling of haemodialysis removal of myoglobin in rhabdomyolysis patients. ( 0,637767098093367 )
J Chem Inf Model - Combined 3D-QSAR, molecular docking, and molecular dynamics study on piperazinyl-glutamate-pyridines/pyrimidines as potent P2Y12 antagonists for inhibition of platelet aggregation. ( 0,6360373023912 )
Med Decis Making - Developing a tuberculosis transmission model that accounts for changes in population health. ( 0,631512350234143 )
J Chem Inf Model - Ligand and structure-based classification models for prediction of P-glycoprotein inhibitors. ( 0,630816488407753 )
J Chem Inf Model - Real external predictivity of QSAR models: how to evaluate it? Comparison of different validation criteria and proposal of using the concordance correlation coefficient. ( 0,630025291444815 )
J Chem Inf Model - Leave-cluster-out cross-validation is appropriate for scoring functions derived from diverse protein data sets. ( 0,621992352554885 )
J Chem Inf Model - Applicability Domain ANalysis (ADAN): a robust method for assessing the reliability of drug property predictions. ( 0,619546403124548 )
J Chem Inf Model - Classifier ensemble based on feature selection and diversity measures for predicting the affinity of A(2B) adenosine receptor antagonists. ( 0,612622536762254 )
Comput Methods Programs Biomed - Predicting body fat percentage based on gender, age and BMI by using artificial neural networks. ( 0,612383156407899 )
J Chem Inf Model - Predicting myelosuppression of drugs from in silico models. ( 0,610964308830266 )
Comput. Biol. Med. - Extracting predictive SNPs in Crohn's disease using a vacillating genetic algorithm and a neural classifier in case-control association studies. ( 0,60944390678972 )
J Chem Inf Model - CSAR data set release 2012: ligands, affinities, complexes, and docking decoys. ( 0,60814096570916 )
J Chem Inf Model - How accurately can we predict the melting points of drug-like compounds? ( 0,6062339322801 )
J Chem Inf Model - Binary classification of aqueous solubility using support vector machines with reduction and recombination feature selection. ( 0,605189119176015 )
Int J Comput Assist Radiol Surg - Hybrid image visualization tool for 3D integration of CT coronary anatomy and quantitative myocardial perfusion PET. ( 0,60173948185959 )
J Chem Inf Model - Robust scoring functions for protein-ligand interactions with quantum chemical charge models. ( 0,600379468924441 )
IEEE Trans Image Process - The segmentation of the left ventricle of the heart from ultrasound data using deep learning architectures and derivative-based search methods. ( 0,599156527411173 )
Med Decis Making - Prediction of health preference values from CD4 counts in individuals with HIV. ( 0,599028953434305 )
Med Biol Eng Comput - Optimal design of clinical tests for the identification of physiological models of type 1 diabetes in the presence of model mismatch. ( 0,598762756923697 )
J Chem Inf Model - Four-dimensional structure-activity relationship model to predict HIV-1 integrase strand transfer inhibition using LQTA-QSAR methodology. ( 0,597546407238545 )
J. Comput. Biol. - An almost optimal algorithm for generalized threshold group testing with inhibitors. ( 0,595273807306393 )
Neural Comput - Molecular diffusion model of neurotransmitter homeostasis around synapses supporting gradients. ( 0,595147902798014 )
J Am Med Inform Assoc - Use of a support vector machine for categorizing free-text notes: assessment of accuracy across two institutions. ( 0,594895223225313 )
J Chem Inf Model - A multiscale simulation system for the prediction of drug-induced cardiotoxicity. ( 0,594073434948997 )
Int J Neural Syst - Multichannel decoding for phase-coded SSVEP brain-computer interface. ( 0,593709521903228 )
J Chem Inf Model - Introducing conformal prediction in predictive modeling. A transparent and flexible alternative to applicability domain determination. ( 0,593281049503298 )
Artif Intell Med - Cancer survival classification using integrated data sets and intermediate information. ( 0,592750852514479 )
J Chem Inf Model - Analysis and study of molecule data sets using snowflake diagrams of weighted maximum common subgraph trees. ( 0,59210023298809 )
J Biomed Inform - Transfer learning based clinical concept extraction on data from multiple sources. ( 0,591934979547388 )
Med Biol Eng Comput - Cardiogoniometric parameters for detection of coronary artery disease at rest as a function of stenosis localization and distribution. ( 0,591111778616486 )
AMIA Annu Symp Proc - Identifying Deviations from Usual Medical Care using a Statistical Approach. ( 0,589428935995032 )
J Chem Inf Model - Design of novel FLT-3 inhibitors based on dual-layer 3D-QSAR model and fragment-based compounds in silico. ( 0,587749373309788 )
J Chem Inf Model - A comparison of different QSAR approaches to modeling CYP450 1A2 inhibition. ( 0,584583728462785 )
Comput Methods Programs Biomed - Bayesian bivariate generalized Lindley model for survival data with a cure fraction. ( 0,582596165099359 )
Lifetime Data Anal - Analysis of cure rate survival data under proportional odds model. ( 0,581864016071324 )
J Chem Inf Model - Optimizing predictive performance of CASE Ultra expert system models using the applicability domains of individual toxicity alerts. ( 0,581335187994902 )
J Chem Inf Model - Automated building of organometallic complexes from 3D fragments. ( 0,580939671456994 )
Int J Med Inform - Design and implementation of I2Vote--an interactive image-based voting system using windows mobile devices. ( 0,580702062984438 )
J Chem Inf Model - Predictions of BuChE inhibitors using support vector machine and naive Bayesian classification techniques in drug discovery. ( 0,580640927828538 )