J Chem Inf Model - Three useful dimensions for domain applicability in QSAR models using random forest.

Topics

{ model(2656) set(1616) predict(1553) }
{ error(1145) method(1030) estim(1020) }
{ model(3480) simul(1196) paramet(876) }
{ method(1219) similar(1157) match(930) }
{ structur(1116) can(940) graph(676) }
{ featur(3375) classif(2383) classifi(1994) }
{ compound(1573) activ(1297) structur(1058) }
{ can(981) present(881) function(850) }
{ imag(2675) segment(2577) method(1081) }
{ general(901) number(790) one(736) }
{ method(984) reconstruct(947) comput(926) }
{ perform(999) metric(946) measur(919) }
{ studi(1119) effect(1106) posit(819) }
{ time(1939) patient(1703) rate(768) }
{ health(1844) social(1437) communiti(874) }
{ cancer(2502) breast(956) screen(824) }
{ drug(1928) target(777) effect(648) }
{ method(1969) cluster(1462) data(1082) }
{ model(3404) distribut(989) bayesian(671) }
{ data(1737) use(1416) pattern(1282) }
{ imag(2830) propos(1344) filter(1198) }
{ network(2748) neural(1063) input(814) }
{ learn(2355) train(1041) set(1003) }
{ algorithm(1844) comput(1787) effici(935) }
{ data(1714) softwar(1251) tool(1186) }
{ featur(1941) imag(1645) propos(1176) }
{ studi(1410) differ(1259) use(1210) }
{ import(1318) role(1303) understand(862) }
{ perform(1367) use(1326) method(1137) }
{ age(1611) year(1155) adult(843) }
{ sampl(1606) size(1419) use(1276) }
{ analysi(2126) use(1163) compon(1037) }
{ decis(3086) make(1611) patient(1517) }
{ process(1125) use(805) approach(778) }
{ can(774) often(719) complex(702) }
{ imag(1947) propos(1133) code(1026) }
{ inform(2794) health(2639) internet(1427) }
{ system(1976) rule(880) can(841) }
{ measur(2081) correl(1212) valu(896) }
{ imag(1057) registr(996) error(939) }
{ bind(1733) structur(1185) ligand(1036) }
{ sequenc(1873) structur(1644) protein(1328) }
{ patient(2315) diseas(1263) diabet(1191) }
{ take(945) account(800) differ(722) }
{ studi(2440) review(1878) systemat(933) }
{ motion(1329) object(1292) video(1091) }
{ assess(1506) score(1403) qualiti(1306) }
{ treatment(1704) effect(941) patient(846) }
{ surgeri(1148) surgic(1085) robot(1054) }
{ framework(1458) process(801) describ(734) }
{ problem(2511) optim(1539) algorithm(950) }
{ chang(1828) time(1643) increas(1301) }
{ concept(1167) ontolog(924) domain(897) }
{ clinic(1479) use(1117) guidelin(835) }
{ extract(1171) text(1153) clinic(932) }
{ method(1557) propos(1049) approach(1037) }
{ design(1359) user(1324) use(1319) }
{ control(1307) perform(991) simul(935) }
{ model(2220) cell(1177) simul(1124) }
{ care(1570) inform(1187) nurs(1089) }
{ search(2224) databas(1162) retriev(909) }
{ case(1353) use(1143) diagnosi(1136) }
{ howev(809) still(633) remain(590) }
{ data(3963) clinic(1234) research(1004) }
{ risk(3053) factor(974) diseas(938) }
{ research(1085) discuss(1038) issu(1018) }
{ system(1050) medic(1026) inform(1018) }
{ model(2341) predict(2261) use(1141) }
{ visual(1396) interact(850) tool(830) }
{ blood(1257) pressur(1144) flow(957) }
{ spatial(1525) area(1432) region(1030) }
{ record(1888) medic(1808) patient(1693) }
{ health(3367) inform(1360) care(1135) }
{ monitor(1329) mobil(1314) devic(1160) }
{ ehr(2073) health(1662) electron(1139) }
{ state(1844) use(1261) util(961) }
{ research(1218) medic(880) student(794) }
{ patient(2837) hospit(1953) medic(668) }
{ data(2317) use(1299) case(1017) }
{ medic(1828) order(1363) alert(1069) }
{ signal(2180) analysi(812) frequenc(800) }
{ cost(1906) reduc(1198) effect(832) }
{ group(2977) signific(1463) compar(1072) }
{ gene(2352) biolog(1181) express(1162) }
{ data(3008) multipl(1320) sourc(1022) }
{ first(2504) two(1366) second(1323) }
{ intervent(3218) particip(2042) group(1664) }
{ activ(1138) subject(705) human(624) }
{ patient(1821) servic(1111) care(1106) }
{ use(2086) technolog(871) perceiv(783) }
{ high(1669) rate(1365) level(1280) }
{ use(976) code(926) identifi(902) }
{ use(1733) differ(960) four(931) }
{ result(1111) use(1088) new(759) }
{ implement(1333) system(1263) develop(1122) }
{ survey(1388) particip(1329) question(1065) }
{ estim(2440) model(1874) function(577) }
{ activ(1452) weight(1219) physic(1104) }
{ method(2212) result(1239) propos(1039) }
{ detect(2391) sensit(1101) algorithm(908) }

Abstract

One popular metric for estimating the accuracy of prospective quantitative structure-activity relationship (QSAR) predictions is based on the similarity of the compound being predicted to compounds in the training set from which the QSAR model was built. More recent work in the field has indicated that other parameters might be equally or more important than similarity. Here we make use of two additional parameters: the variation of prediction among random forest trees (less variation among trees indicates more accurate prediction) and the prediction itself (certain ranges of activity are intrinsically easier to predict than others). The accuracy of prediction for a QSAR model, as measured by the root-mean-square error, can be estimated by cross-validation on the training set at the time of model-building and stored as a three-dimensional array of bins. This is an obvious extension of the one-dimensional array of bins we previously proposed for similarity to the training set [Sheridan et al. J. Chem. Inf. Comput. Sci. 2004, 44, 1912-1928]. We show that using these three parameters simultaneously adds much more discrimination in prediction accuracy than any single parameter. This approach can be applied to any QSAR method that produces an ensemble of models. We also show that the root-mean-square errors produced by cross-validation are predictive of root-mean-square errors of compounds tested after the model was built.
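
The three-dimensional array of bins described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the bin edges, the toy cross-validation records, and the use of the standard deviation as the measure of variation among trees are all assumptions made for illustration. Each cross-validated compound is assigned to a bin by (similarity to the training set, variation among trees, predicted value), and the root-mean-square error is stored per bin so it can be looked up later for a new prediction.

```python
import math
import statistics
from bisect import bisect_right
from collections import defaultdict


def bin_index(value, edges):
    """Index of the bin containing value; out-of-range values are clamped."""
    i = bisect_right(edges, value) - 1
    return min(max(i, 0), len(edges) - 2)


def bin_key(sim, sd, pred, edges):
    """Map (similarity, tree spread, prediction) to a 3-D bin coordinate."""
    sim_edges, sd_edges, pred_edges = edges
    return (bin_index(sim, sim_edges),
            bin_index(sd, sd_edges),
            bin_index(pred, pred_edges))


def build_rmse_bins(records, edges):
    """records: (similarity, tree_sd, predicted, observed) from cross-validation."""
    sq_sum = defaultdict(float)
    count = defaultdict(int)
    for sim, sd, pred, obs in records:
        key = bin_key(sim, sd, pred, edges)
        sq_sum[key] += (pred - obs) ** 2
        count[key] += 1
    return {k: math.sqrt(sq_sum[k] / count[k]) for k in count}


def estimate_rmse(bins, sim, sd, pred, edges):
    """Expected RMSE for a new prediction, or None if its bin is empty."""
    return bins.get(bin_key(sim, sd, pred, edges))


# Hypothetical bin edges: similarity, tree standard deviation, predicted activity.
EDGES = ([0.0, 0.5, 1.0], [0.0, 0.5, 1.0], [4.0, 6.0, 8.0])

# Hypothetical cross-validation results (not data from the paper).
cv_records = [
    (0.8, 0.2, 5.0, 5.5),
    (0.9, 0.1, 5.5, 5.0),
    (0.3, 0.7, 7.0, 9.0),
]
rmse_bins = build_rmse_bins(cv_records, EDGES)

# A new compound: per-tree predictions give both the prediction and its spread.
tree_predictions = [5.1, 5.3, 5.2]
new_pred = statistics.mean(tree_predictions)
new_sd = statistics.pstdev(tree_predictions)  # assumed measure of tree variation
expected_error = estimate_rmse(rmse_bins, 0.85, new_sd, new_pred, EDGES)
```

In this sketch an empty bin simply returns `None`. How sparsely populated bins are handled in practice is a design choice the abstract does not specify.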

Cleaned Abstract

one popular metric estim accuraci prospect quantit structureact relationship qsar predict base similar compound predict compound train set qsar model built recent work field indic paramet might equal import similar make use two addit paramet variat predict among random forest tree less variat among tree indic accur predict predict certain rang activ intrins easier predict other accuraci predict qsar model measur rootmeansquar error can estim crossvalid train set time modelbuild store threedimension array bin obvious extens onedimension array bin previous propos similar train set sheridan et al j chem inf comput sci show use three paramet simultan add much discrimin predict accuraci singl paramet approach can appli qsar method produc ensembl model also show rootmeansquar error produc crossvalid predict rootmeansquar error compound test model built

Similar Abstracts

J Chem Inf Model - Time-split cross-validation as a method for estimating the goodness of prospective prediction. ( 0,792826930602374 )
J Chem Inf Model - iLOGP: a simple, robust, and efficient description of n-octanol/water partition coefficient for drug design using the GB/SA approach. ( 0,780503253957659 )
Artif Intell Med - Training artificial neural networks directly on the concordance index for censored data using genetic algorithms. ( 0,778407514752623 )
J Chem Inf Model - Beyond the scope of Free-Wilson analysis: building interpretable QSAR models with machine learning algorithms. ( 0,774884998553253 )
BMC Med Inform Decis Mak - Regression tree construction by bootstrap: model search for DRG-systems applied to Austrian health-data. ( 0,755860165359373 )
Comput. Biol. Med. - A prediction model of substrates and non-substrates of breast cancer resistance protein (BCRP) developed by GA-CG-SVM method. ( 0,747548378072258 )
AMIA Annu Symp Proc - Effect of data combination on predictive modeling: a study using gene expression data. ( 0,746459934303089 )
J Chem Inf Model - Comparative studies on some metrics for external validation of QSPR models. ( 0,736913166587864 )
J Chem Inf Model - Study of chromatographic retention of natural terpenoids by chemoinformatic tools. ( 0,73302293054949 )
J Chem Inf Model - Does rational selection of training and test sets improve the outcome of QSAR modeling? ( 0,721851501135653 )
J Chem Inf Model - Predicting pK(a) values of substituted phenols from atomic charges: comparison of different quantum mechanical methods and charge distribution schemes. ( 0,719630565596777 )
J Chem Inf Model - In silico prediction of chemical Ames mutagenicity. ( 0,718593760197781 )
J Chem Inf Model - Design of novel FLT-3 inhibitors based on dual-layer 3D-QSAR model and fragment-based compounds in silico. ( 0,715911373560462 )
J Chem Inf Model - Development of novel 3D-QSAR combination approach for screening and optimizing B-Raf inhibitors in silico. ( 0,715898679736536 )
J Chem Inf Model - Coping with unbalanced class data sets in oral absorption models. ( 0,699834769577295 )
J Chem Inf Model - GRID-based three-dimensional pharmacophores II: PharmBench, a benchmark data set for evaluating pharmacophore elucidation methods. ( 0,69957674588437 )
J Chem Inf Model - Analysis and study of molecule data sets using snowflake diagrams of weighted maximum common subgraph trees. ( 0,698850945408088 )
Comput Methods Programs Biomed - Predicting body fat percentage based on gender, age and BMI by using artificial neural networks. ( 0,692134837232885 )
J Chem Inf Model - Statistical analysis and compound selection of combinatorial libraries for soluble epoxide hydrolase. ( 0,689417512977146 )
J Chem Inf Model - Quantitative structure-activity relationship models for ready biodegradability of chemicals. ( 0,688259907923569 )
J Chem Inf Model - In silico prediction of aqueous solubility using simple QSPR models: the importance of phenol and phenol-like moieties. ( 0,686345189123646 )
J Chem Inf Model - A new approach to radial basis function approximation and its application to QSAR. ( 0,681227055447877 )
J Chem Inf Model - Best of both worlds: combining pharma data and state of the art modeling technology to improve in Silico pKa prediction. ( 0,679317023932821 )
J Chem Inf Model - In silico prediction of total human plasma clearance. ( 0,675613575041845 )
J Chem Inf Model - Real external predictivity of QSAR models. Part 2. New intercomparable thresholds for different validation criteria and the need for scatter plot inspection. ( 0,672929079427563 )
J Chem Inf Model - RS-Predictor models augmented with SMARTCyp reactivities: robust metabolic regioselectivity predictions for nine CYP isozymes. ( 0,671351736218986 )
J Chem Inf Model - Classification of compounds with distinct or overlapping multi-target activities and diverse molecular mechanisms using emerging chemical patterns. ( 0,669060427193831 )
AMIA Annu Symp Proc - Motivating the additional use of external validity: examining transportability in a model of glioblastoma multiforme. ( 0,668201443636557 )
J Chem Inf Model - Design and synthesis of new antioxidants predicted by the model developed on a set of pulvinic acid derivatives. ( 0,668159990607083 )
Artif Intell Med - Fuzzy model identification of dengue epidemic in Colombia based on multiresolution analysis. ( 0,667312934048377 )
Med Biol Eng Comput - Application of the RIMARC algorithm to a large data set of action potentials and clinical parameters for risk prediction of atrial fibrillation. ( 0,664269966849588 )
J Chem Inf Model - Pharmacophore assessment through 3-D QSAR: evaluation of the predictive ability on new derivatives by the application on a series of antitubercular agents. ( 0,663396955971066 )
BMC Med Inform Decis Mak - Concordance and predictive value of two adverse drug event data sets. ( 0,662231081799929 )
J Chem Inf Model - Criterion for evaluating the predictive ability of nonlinear regression models without cross-validation. ( 0,661728902051535 )
AMIA Annu Symp Proc - Advanced proficiency EHR training: effect on physicians' EHR efficiency, EHR satisfaction and job satisfaction. ( 0,659440964242517 )
J. Comput. Biol. - Rich parameterization improves RNA structure prediction. ( 0,657407554010882 )
J Chem Inf Model - Template CoMFA: the 3D-QSAR Grail? ( 0,654802639040804 )
J Chem Inf Model - Combined receptor and ligand-based approach to the universal pharmacophore model development for studies of drug blockade to the hERG1 pore domain. ( 0,654741281725001 )
J Chem Inf Model - Binary classification of a large collection of environmental chemicals from estrogen receptor assays by quantitative structure-activity relationship and machine learning methods. ( 0,654641827232459 )
J Chem Inf Model - How accurately can we predict the melting points of drug-like compounds? ( 0,65261501185079 )
J. Comput. Biol. - The complexity of the dirichlet model for multiple alignment data. ( 0,651070856626955 )
J Chem Inf Model - Estimation of carcinogenicity using molecular fragments tree. ( 0,650888461341078 )
IEEE Trans Image Process - Neighborhood Supported Model Level Fuzzy Aggregation for Moving Object Segmentation. ( 0,648072466972598 )
J Chem Inf Model - Rank order entropy: why one metric is not enough. ( 0,647429727019382 )
J Chem Inf Model - Experimental and computational prediction of glass transition temperature of drugs. ( 0,644142645276373 )
J Chem Inf Model - Applicability Domain ANalysis (ADAN): a robust method for assessing the reliability of drug property predictions. ( 0,639806607626051 )
AMIA Annu Symp Proc - Predicting the dengue incidence in Singapore using univariate time series models. ( 0,638718848152515 )
J Chem Inf Model - Oversampling to overcome overfitting: exploring the relationship between data set composition, molecular descriptors, and predictive modeling methods. ( 0,637736254935941 )
J Chem Inf Model - Applicability domains for classification problems: Benchmarking of distance to models for Ames mutagenicity set. ( 0,636462618900402 )
Int J Health Geogr - Incorporating geographical factors with artificial neural networks to predict reference values of erythrocyte sedimentation rate. ( 0,635473906446696 )
J Chem Inf Model - Automated building of organometallic complexes from 3D fragments. ( 0,632993092504159 )
Int J Comput Assist Radiol Surg - Optimized order estimation for autoregressive models to predict respiratory motion. ( 0,631459321484957 )
J Chem Inf Model - Predictions of BuChE inhibitors using support vector machine and naive Bayesian classification techniques in drug discovery. ( 0,631034704752308 )
J Chem Inf Model - Applicability domain based on ensemble learning in classification and regression analyses. ( 0,628373143173977 )
Comput Methods Programs Biomed - Kinetic modelling of haemodialysis removal of myoglobin in rhabdomyolysis patients. ( 0,62768592339787 )
J Chem Inf Model - Prediction of linear cationic antimicrobial peptides based on characteristics responsible for their interaction with the membranes. ( 0,626735703746555 )
J Chem Inf Model - Hsp90 inhibitors, part 1: definition of 3-D QSAutogrid/R models as a tool for virtual screening. ( 0,621553255522122 )
Neural Comput - Molecular diffusion model of neurotransmitter homeostasis around synapses supporting gradients. ( 0,619567362781112 )
Comput Methods Programs Biomed - Modeling the glucose regulatory system in extreme preterm infants. ( 0,618879925918606 )
J. Comput. Biol. - An almost optimal algorithm for generalized threshold group testing with inhibitors. ( 0,617480349666786 )
Comput. Biol. Med. - Quantification of contributions of molecular fragments for eye irritation of organic chemicals using QSAR study. ( 0,61498003519664 )
Comput. Aided Surg. - Evaluation of a computational model to predict elbow range of motion. ( 0,612714910510698 )
J Chem Inf Model - Four-dimensional structure-activity relationship model to predict HIV-1 integrase strand transfer inhibition using LQTA-QSAR methodology. ( 0,61145908998161 )
IEEE Trans Vis Comput Graph - Model Synthesis: A General Procedural Modeling Algorithm. ( 0,611122539910374 )
Comput Methods Programs Biomed - Interstitial insulin kinetic parameters for a 2-compartment insulin model with saturable clearance. ( 0,609863530470233 )
J Am Med Inform Assoc - Harvest: an open platform for developing web-based biomedical data discovery and reporting applications. ( 0,60863917212632 )
J Chem Inf Model - Binary classification of aqueous solubility using support vector machines with reduction and recombination feature selection. ( 0,60776948275687 )
Med Biol Eng Comput - Accelerometry-based prediction of movement dynamics for balance monitoring. ( 0,607372989986582 )
J Chem Inf Model - A comparison of different QSAR approaches to modeling CYP450 1A2 inhibition. ( 0,606150575562624 )
IEEE Trans Image Process - Incremental N-mode SVD for large-scale multilinear generative models. ( 0,60548140877274 )
J Biomed Inform - MysiRNA: improving siRNA efficacy prediction using a machine-learning model combining multi-tools and whole stacking energy (G). ( 0,60474400310434 )
BMC Med Inform Decis Mak - Measuring preferences for analgesic treatment for cancer pain: how do African-Americans and Whites perform on choice-based conjoint (CBC) analysis experiments? ( 0,60313148217248 )
J Clin Monit Comput - Evaluation of a computer program for non-invasive determination of pulmonary shunt and ventilation-perfusion mismatch. ( 0,601565278422902 )
J Chem Inf Model - Optimizing predictive performance of CASE Ultra expert system models using the applicability domains of individual toxicity alerts. ( 0,599789688566164 )
J Chem Inf Model - Impact of template choice on homology model efficiency in virtual screening. ( 0,599649960536752 )
J. Med. Internet Res. - A case study of the New York City 2012-2013 influenza season with daily geocoded Twitter data from temporal and spatiotemporal perspectives. ( 0,597868076940874 )
Comput Math Methods Med - Multiscale autoregressive identification of neuroelectrophysiological systems. ( 0,596056279909413 )
Med Biol Eng Comput - Optimal design of clinical tests for the identification of physiological models of type 1 diabetes in the presence of model mismatch. ( 0,595930880746488 )
Brief. Bioinformatics - An empirical assessment of validation practices for molecular classifiers. ( 0,595419991380653 )
J Chem Inf Model - Building a three-dimensional model of CYP2C9 inhibition using the Autocorrelator: an autonomous model generator. ( 0,593642615291419 )
Med Biol Eng Comput - Validating motor unit firing patterns extracted by EMG signal decomposition. ( 0,593612424801518 )
Int J Comput Assist Radiol Surg - Assessing performance in brain tumor resection using a novel virtual reality simulator. ( 0,592936361301893 )
J Chem Inf Model - How experimental errors influence drug metabolism and pharmacokinetic QSAR/QSPR models. ( 0,591136752466919 )
Comput Methods Programs Biomed - A therapy parameter-based model for predicting blood glucose concentrations in patients with type 1 diabetes. ( 0,590606250630406 )
J Chem Inf Model - Leave-cluster-out cross-validation is appropriate for scoring functions derived from diverse protein data sets. ( 0,586810553806608 )
Int J Health Geogr - Comparative analysis of remotely-sensed data products via ecological niche modeling of avian influenza case occurrences in Middle Eastern poultry. ( 0,584874240407669 )
Neural Comput - A compartmental model of linear resonance and signal transfer in dendrites. ( 0,582745710658722 )
J Chem Inf Model - Ligand and structure-based classification models for prediction of P-glycoprotein inhibitors. ( 0,581445391503159 )
Med Decis Making - Developing a tuberculosis transmission model that accounts for changes in population health. ( 0,577704207788768 )
J Am Med Inform Assoc - Choosing blindly but wisely: differentially private solicitation of DNA datasets for disease marker discovery. ( 0,577041990077336 )
J Chem Inf Model - Predicting myelosuppression of drugs from in silico models. ( 0,575523119628795 )
Int J Comput Assist Radiol Surg - Hybrid image visualization tool for 3D integration of CT coronary anatomy and quantitative myocardial perfusion PET. ( 0,575153621012433 )
Comput. Biol. Med. - Artificial neural network modelling of the results of tympanoplasty in chronic suppurative otitis media patients. ( 0,573502300863164 )
J. Comput. Biol. - Boolean models can explain bistability in the lac operon. ( 0,573417104370261 )
J Chem Inf Model - Classifier ensemble based on feature selection and diversity measures for predicting the affinity of A(2B) adenosine receptor antagonists. ( 0,572776105192999 )
Lifetime Data Anal - Analysis of cure rate survival data under proportional odds model. ( 0,572170148202405 )
J Chem Inf Model - A multiscale simulation system for the prediction of drug-induced cardiotoxicity. ( 0,57029258107726 )
Spat Spatiotemporal Epidemiol - Spatial modelling of disease using data- and knowledge-driven approaches. ( 0,56547500412594 )
J Chem Inf Model - CSAR data set release 2012: ligands, affinities, complexes, and docking decoys. ( 0,563171253908007 )
Int J Health Geogr - A linear programming model for preserving privacy when disclosing patient spatial information for secondary purposes. ( 0,562535459169485 )