J Chem Inf Model - Leave-cluster-out cross-validation is appropriate for scoring functions derived from diverse protein data sets.

Topics

{ model(2656) set(1616) predict(1553) }
{ bind(1733) structur(1185) ligand(1036) }
{ assess(1506) score(1403) qualiti(1306) }
{ learn(2355) train(1041) set(1003) }
{ patient(1821) servic(1111) care(1106) }
{ algorithm(1844) comput(1787) effici(935) }
{ data(1714) softwar(1251) tool(1186) }
{ data(2317) use(1299) case(1017) }
{ sampl(1606) size(1419) use(1276) }
{ estim(2440) model(1874) function(577) }
{ model(3404) distribut(989) bayesian(671) }
{ data(1737) use(1416) pattern(1282) }
{ inform(2794) health(2639) internet(1427) }
{ measur(2081) correl(1212) valu(896) }
{ sequenc(1873) structur(1644) protein(1328) }
{ featur(3375) classif(2383) classifi(1994) }
{ network(2748) neural(1063) input(814) }
{ imag(2675) segment(2577) method(1081) }
{ take(945) account(800) differ(722) }
{ treatment(1704) effect(941) patient(846) }
{ chang(1828) time(1643) increas(1301) }
{ method(984) reconstruct(947) comput(926) }
{ search(2224) databas(1162) retriev(909) }
{ featur(1941) imag(1645) propos(1176) }
{ data(3963) clinic(1234) research(1004) }
{ risk(3053) factor(974) diseas(938) }
{ perform(999) metric(946) measur(919) }
{ compound(1573) activ(1297) structur(1058) }
{ patient(2837) hospit(1953) medic(668) }
{ analysi(2126) use(1163) compon(1037) }
{ structur(1116) can(940) graph(676) }
{ cancer(2502) breast(956) screen(824) }
{ can(774) often(719) complex(702) }
{ imag(1947) propos(1133) code(1026) }
{ system(1976) rule(880) can(841) }
{ imag(1057) registr(996) error(939) }
{ method(1219) similar(1157) match(930) }
{ imag(2830) propos(1344) filter(1198) }
{ patient(2315) diseas(1263) diabet(1191) }
{ studi(2440) review(1878) systemat(933) }
{ motion(1329) object(1292) video(1091) }
{ surgeri(1148) surgic(1085) robot(1054) }
{ framework(1458) process(801) describ(734) }
{ problem(2511) optim(1539) algorithm(950) }
{ error(1145) method(1030) estim(1020) }
{ concept(1167) ontolog(924) domain(897) }
{ clinic(1479) use(1117) guidelin(835) }
{ extract(1171) text(1153) clinic(932) }
{ method(1557) propos(1049) approach(1037) }
{ design(1359) user(1324) use(1319) }
{ control(1307) perform(991) simul(935) }
{ model(2220) cell(1177) simul(1124) }
{ care(1570) inform(1187) nurs(1089) }
{ general(901) number(790) one(736) }
{ case(1353) use(1143) diagnosi(1136) }
{ howev(809) still(633) remain(590) }
{ studi(1410) differ(1259) use(1210) }
{ research(1085) discuss(1038) issu(1018) }
{ system(1050) medic(1026) inform(1018) }
{ import(1318) role(1303) understand(862) }
{ model(2341) predict(2261) use(1141) }
{ visual(1396) interact(850) tool(830) }
{ perform(1367) use(1326) method(1137) }
{ studi(1119) effect(1106) posit(819) }
{ blood(1257) pressur(1144) flow(957) }
{ spatial(1525) area(1432) region(1030) }
{ record(1888) medic(1808) patient(1693) }
{ health(3367) inform(1360) care(1135) }
{ model(3480) simul(1196) paramet(876) }
{ monitor(1329) mobil(1314) devic(1160) }
{ ehr(2073) health(1662) electron(1139) }
{ state(1844) use(1261) util(961) }
{ research(1218) medic(880) student(794) }
{ age(1611) year(1155) adult(843) }
{ medic(1828) order(1363) alert(1069) }
{ signal(2180) analysi(812) frequenc(800) }
{ cost(1906) reduc(1198) effect(832) }
{ group(2977) signific(1463) compar(1072) }
{ gene(2352) biolog(1181) express(1162) }
{ data(3008) multipl(1320) sourc(1022) }
{ first(2504) two(1366) second(1323) }
{ intervent(3218) particip(2042) group(1664) }
{ activ(1138) subject(705) human(624) }
{ time(1939) patient(1703) rate(768) }
{ use(2086) technolog(871) perceiv(783) }
{ can(981) present(881) function(850) }
{ health(1844) social(1437) communiti(874) }
{ high(1669) rate(1365) level(1280) }
{ use(976) code(926) identifi(902) }
{ use(1733) differ(960) four(931) }
{ drug(1928) target(777) effect(648) }
{ result(1111) use(1088) new(759) }
{ implement(1333) system(1263) develop(1122) }
{ survey(1388) particip(1329) question(1065) }
{ decis(3086) make(1611) patient(1517) }
{ process(1125) use(805) approach(778) }
{ activ(1452) weight(1219) physic(1104) }
{ method(1969) cluster(1462) data(1082) }
{ method(2212) result(1239) propos(1039) }
{ detect(2391) sensit(1101) algorithm(908) }

Abstract

With the emergence of large collections of protein-ligand complexes complemented by binding data, as found in PDBbind or BindingMOAD, new opportunities for parametrizing and evaluating scoring functions have arisen. With such large data collections available, it becomes feasible to fit scoring functions in a QSAR style, i.e., by defining protein-ligand interaction descriptors and analyzing them with modern machine-learning methods. As with any data modeling approach, the model must be validated carefully. Here, we show that, for a relatively simple scoring function, the measured R (0.77 vs 0.46) and R² (0.59 vs 0.21) differ substantially depending on whether the function is validated against the PDBbind core set or in a leave-cluster-out cross-validation. If proteins from the same family are present in both the training and validation sets, standard validation techniques yield an overly optimistic estimate of prediction quality.
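The idea of leave-cluster-out cross-validation can be illustrated with a minimal sketch. It assumes each complex already carries a cluster label (for example a protein family); the abstract does not say how clusters are built, so the labels, the toy data, and the function names leave_cluster_out_splits and pearson_r are hypothetical. Each fold holds out every complex of one cluster, so no protein family appears in both the training and the validation set. The helper pearson_r computes the correlation R; the reported values are consistent with R² being its square (0.77² ≈ 0.59, 0.46² ≈ 0.21).

```python
from collections import defaultdict
from math import sqrt


def leave_cluster_out_splits(cluster_ids):
    """Yield (cluster, train_indices, test_indices), holding out one cluster per fold."""
    members = defaultdict(list)
    for index, cluster in enumerate(cluster_ids):
        members[cluster].append(index)
    for cluster in sorted(members):
        test = members[cluster]
        held_out = set(test)
        # Training fold: every complex that is not in the held-out cluster.
        train = [i for i in range(len(cluster_ids)) if i not in held_out]
        yield cluster, train, test


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical toy data: one family label per complex (not taken from the paper).
families = ["kinase", "kinase", "protease", "protease", "protease", "nuclear_receptor"]

for family, train, test in leave_cluster_out_splits(families):
    # No complex from the held-out family may appear in the training fold.
    assert all(families[i] != family for i in train)
    print(family, "train:", train, "test:", test)
```

In practice, a model would be fitted on each training fold and pearson_r would be computed between the predicted and measured affinities of the held-out cluster. A random split, by contrast, would typically place members of the same family on both sides.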

Cleaned Abstract

emerg larg collect proteinligand complex complement bind data found pdbbind bindingmoad new opportun parametr evalu score function arisen huge data collect avail becom feasibl fit score function qsar style ie defin proteinligand interact descriptor analyz modern machinelearn method data model ansatz care taken valid model care show larg differ measur r vs r vs relat simpl score function depend whether valid pdbbind core set valid leaveclusterout crossvalid protein famili present train valid set estim predict qualiti standard valid techniqu look optimist

Similar Abstracts

J Chem Inf Model - Robust scoring functions for protein-ligand interactions with quantum chemical charge models. ( 0,767616229696982 )
J Chem Inf Model - CSAR data set release 2012: ligands, affinities, complexes, and docking decoys. ( 0,761091749810148 )
J Chem Inf Model - Impact of template choice on homology model efficiency in virtual screening. ( 0,754790869099078 )
J Chem Inf Model - Building a three-dimensional model of CYP2C9 inhibition using the Autocorrelator: an autonomous model generator. ( 0,733260595442709 )
J Chem Inf Model - GRID-based three-dimensional pharmacophores II: PharmBench, a benchmark data set for evaluating pharmacophore elucidation methods. ( 0,719793488002839 )
J Chem Inf Model - iLOGP: a simple, robust, and efficient description of n-octanol/water partition coefficient for drug design using the GB/SA approach. ( 0,711811310813237 )
Artif Intell Med - Training artificial neural networks directly on the concordance index for censored data using genetic algorithms. ( 0,707191435232893 )
J Chem Inf Model - Four-dimensional structure-activity relationship model to predict HIV-1 integrase strand transfer inhibition using LQTA-QSAR methodology. ( 0,70439765933298 )
J Chem Inf Model - DrugPred: a structure-based approach to predict protein druggability developed using an extensive nonredundant data set. ( 0,696860307555156 )
J Chem Inf Model - Assessing the performance of the MM/PBSA and MM/GBSA methods. 1. The accuracy of binding free energy calculations based on molecular dynamics simulations. ( 0,692357022726289 )
J Chem Inf Model - Time-split cross-validation as a method for estimating the goodness of prospective prediction. ( 0,683157948091937 )
J Chem Inf Model - Pharmacophore and 3D-QSAR characterization of 6-arylquinazolin-4-amines as Cdc2-like kinase 4 (Clk4) and dual specificity tyrosine-phosphorylation-regulated kinase 1A (Dyrk1A) inhibitors. ( 0,679484015087702 )
J Chem Inf Model - Comparative studies on some metrics for external validation of QSPR models. ( 0,674881562927068 )
AMIA Annu Symp Proc - Effect of data combination on predictive modeling: a study using gene expression data. ( 0,673742646921695 )
J Chem Inf Model - Predicting ligand binding modes from neural networks trained on protein-ligand interaction fingerprints. ( 0,673449414267381 )
J Chem Inf Model - Does rational selection of training and test sets improve the outcome of QSAR modeling? ( 0,66848761914136 )
J Biomed Inform - MysiRNA: improving siRNA efficacy prediction using a machine-learning model combining multi-tools and whole stacking energy (ΔG). ( 0,661042265027468 )
J Chem Inf Model - In silico prediction of chemical Ames mutagenicity. ( 0,659963115270053 )
J Chem Inf Model - Study of chromatographic retention of natural terpenoids by chemoinformatic tools. ( 0,659817232195549 )
BMC Med Inform Decis Mak - Concordance and predictive value of two adverse drug event data sets. ( 0,659767097884184 )
J Chem Inf Model - Kinase-kernel models: accurate in silico screening of 4 million compounds across the entire human kinome. ( 0,657131857619524 )
J Chem Inf Model - Combined 3D-QSAR, molecular docking, and molecular dynamics study on piperazinyl-glutamate-pyridines/pyrimidines as potent P2Y12 antagonists for inhibition of platelet aggregation. ( 0,653132548468232 )
J Chem Inf Model - Rank order entropy: why one metric is not enough. ( 0,652178682077587 )
J Chem Inf Model - Docking-based comparative intermolecular contacts analysis as new 3-D QSAR concept for validating docking studies and in silico screening: NMT and GP inhibitors as case studies. ( 0,65173612518046 )
J. Comput. Biol. - The complexity of the dirichlet model for multiple alignment data. ( 0,650262167128598 )
J Chem Inf Model - RS-Predictor models augmented with SMARTCyp reactivities: robust metabolic regioselectivity predictions for nine CYP isozymes. ( 0,649271199371114 )
Comput Biol Chem - Homology modeling, binding site identification and docking in flavone hydroxylase CYP105P2 in Streptomyces peucetius ATCC 27952. ( 0,648950054910047 )
J Chem Inf Model - Development of novel 3D-QSAR combination approach for screening and optimizing B-Raf inhibitors in silico. ( 0,647319752224156 )
J Chem Inf Model - Uniting cheminformatics and chemical theory to predict the intrinsic aqueous solubility of crystalline druglike molecules. ( 0,640240198787291 )
J Chem Inf Model - Pharmacophore assessment through 3-D QSAR: evaluation of the predictive ability on new derivatives by the application on a series of antitubercular agents. ( 0,639345502722223 )
J Chem Inf Model - Extensive consensus docking evaluation for ligand pose prediction and virtual screening studies. ( 0,634527918809399 )
J Chem Inf Model - Introducing conformal prediction in predictive modeling. A transparent and flexible alternative to applicability domain determination. ( 0,632224107414662 )
Curr Comput Aided Drug Des - QSAR Models for the Reactivation of Sarin Inhibited AChE by Quaternary Pyridinium Oximes Based on Monte Carlo Method. ( 0,632195422817643 )
J Chem Inf Model - Beyond the scope of Free-Wilson analysis: building interpretable QSAR models with machine learning algorithms. ( 0,621992352554885 )
J Chem Inf Model - Ligand and structure-based classification models for prediction of P-glycoprotein inhibitors. ( 0,621383392381746 )
J Chem Inf Model - Ligand-steered modeling and docking: A benchmarking study in class A G-protein-coupled receptors. ( 0,61949328472919 )
J Chem Inf Model - Best of both worlds: combining pharma data and state of the art modeling technology to improve in Silico pKa prediction. ( 0,619140594692297 )
J Chem Inf Model - Applicability Domain ANalysis (ADAN): a robust method for assessing the reliability of drug property predictions. ( 0,618589574131742 )
J. Med. Internet Res. - A case study of the New York City 2012-2013 influenza season with daily geocoded Twitter data from temporal and spatiotemporal perspectives. ( 0,617843351864812 )
J Chem Inf Model - Combined application of cheminformatics- and physical force field-based scoring functions improves binding affinity prediction for CSAR data sets. ( 0,617499336647062 )
AMIA Annu Symp Proc - Predicting the dengue incidence in Singapore using univariate time series models. ( 0,615000957470085 )
J Chem Inf Model - PHOENIX: a scoring function for affinity prediction derived using high-resolution crystal structures and calorimetry measurements. ( 0,614546208938322 )
J Chem Inf Model - Modeling, molecular dynamics simulation, and mutation validation for structure of cannabinoid receptor 2 based on known crystal structures of GPCRs. ( 0,61366252789934 )
Med Decis Making - Predicting EQ-5D utility scores from the Seattle Angina Questionnaire in coronary artery disease: a mapping algorithm using a Bayesian framework. ( 0,61359334868415 )
J Chem Inf Model - Modeling of open, closed, and open-inactivated states of the hERG1 channel: structural mechanisms of the state-dependent drug binding. ( 0,610686862205297 )
J Am Med Inform Assoc - Harvest: an open platform for developing web-based biomedical data discovery and reporting applications. ( 0,609884669457927 )
Comput Methods Programs Biomed - Kinetic modelling of haemodialysis removal of myoglobin in rhabdomyolysis patients. ( 0,608467860119852 )
Spat Spatiotemporal Epidemiol - Spatial modelling of disease using data- and knowledge-driven approaches. ( 0,608414784428815 )
J Chem Inf Model - Oversampling to overcome overfitting: exploring the relationship between data set composition, molecular descriptors, and predictive modeling methods. ( 0,608262262822189 )
J Chem Inf Model - Estimation of carcinogenicity using molecular fragments tree. ( 0,60489940513769 )
AMIA Annu Symp Proc - Motivating the additional use of external validity: examining transportability in a model of glioblastoma multiforme. ( 0,603893983771417 )
J Chem Inf Model - Predicting pK(a) values of substituted phenols from atomic charges: comparison of different quantum mechanical methods and charge distribution schemes. ( 0,602199467909395 )
Int J Comput Assist Radiol Surg - Assessing performance in brain tumor resection using a novel virtual reality simulator. ( 0,601446721464967 )
J Biomed Inform - Markov blanket-based approach for learning multi-dimensional Bayesian network classifiers: an application to predict the European Quality of Life-5 Dimensions (EQ-5D) from the 39-item Parkinson's Disease Questionnaire (PDQ-39). ( 0,598903309787689 )
Methods Inf Med - Quantifying changes in EEG complexity induced by photic stimulation. ( 0,596073653201215 )
Comput Biol Chem - A new protein graph model for function prediction. ( 0,595353282402581 )
J Chem Inf Model - Global free energy scoring functions based on distance-dependent atom-type pair descriptors. ( 0,593588279986187 )
J Chem Inf Model - CSAR benchmark exercise of 2010: combined evaluation across all submitted scoring functions. ( 0,593440255994228 )
J Chem Inf Model - Applicability domain based on ensemble learning in classification and regression analyses. ( 0,592470797875184 )
Comput. Aided Surg. - Evaluation of a computational model to predict elbow range of motion. ( 0,59242045157538 )
J Chem Inf Model - Hsp90 inhibitors, part 1: definition of 3-D QSAutogrid/R models as a tool for virtual screening. ( 0,590547102650715 )
J Chem Inf Model - Three-dimensional pharmacophore modeling of liver-X receptor agonists. ( 0,589976283336571 )
Int J Health Geogr - Incorporating geographical factors with artificial neural networks to predict reference values of erythrocyte sedimentation rate. ( 0,588638845036544 )
AMIA Annu Symp Proc - Advanced proficiency EHR training: effect on physicians' EHR efficiency, EHR satisfaction and job satisfaction. ( 0,587403313322981 )
J Chem Inf Model - Three useful dimensions for domain applicability in QSAR models using random forest. ( 0,586810553806608 )
J Chem Inf Model - Acetylcholinesterase inhibitors: structure based design, synthesis, pharmacophore modeling, and virtual screening. ( 0,586254189308123 )
J Chem Inf Model - New strategy for receptor-based pharmacophore query construction: a case study for 5-HT7 receptor ligands. ( 0,586060313807297 )
J Chem Inf Model - A machine learning-based method to improve docking scoring functions and its application to drug repurposing. ( 0,585728386931947 )
Comput Methods Programs Biomed - Bayesian bivariate generalized Lindley model for survival data with a cure fraction. ( 0,58525635564678 )
J Chem Inf Model - Classification of compounds with distinct or overlapping multi-target activities and diverse molecular mechanisms using emerging chemical patterns. ( 0,584593571963654 )
Int J Comput Assist Radiol Surg - MIDG-Emerging grid technologies for multi-site preclinical molecular imaging research communities. ( 0,584348175407599 )
BMC Med Inform Decis Mak - Regression tree construction by bootstrap: model search for DRG-systems applied to Austrian health-data. ( 0,584283869664304 )
J Chem Inf Model - Improving the scoring of protein-ligand binding affinity by including the effects of structural water and electronic polarization. ( 0,583690470618072 )
J Chem Inf Model - Real external predictivity of QSAR models. Part 2. New intercomparable thresholds for different validation criteria and the need for scatter plot inspection. ( 0,582632865310674 )
Neural Comput - Molecular diffusion model of neurotransmitter homeostasis around synapses supporting gradients. ( 0,581775712693135 )
J Chem Inf Model - Revisiting a receptor-based pharmacophore hypothesis for human A(2A) adenosine receptor antagonists. ( 0,579717810654036 )
Artif Intell Med - Improving predictive models of glaucoma severity by incorporating quality indicators. ( 0,575317755486223 )
J Chem Inf Model - Analyzing the topology of active sites: on the prediction of pockets and subpockets. ( 0,575005947144832 )
J Chem Inf Model - Subangstrom accuracy in pHLA-I modeling by Rosetta FlexPepDock refinement protocol. ( 0,574579126660041 )
Comput Methods Programs Biomed - A predictive model of longitudinal, patient-specific colonoscopy results. ( 0,573943030343803 )
J. Comput. Biol. - Boosting prediction performance of protein-protein interaction hot spots by using structural neighborhood properties. ( 0,568378319026088 )
J Chem Inf Model - A multiscale simulation system for the prediction of drug-induced cardiotoxicity. ( 0,564249711262917 )
J Chem Inf Model - Prediction of linear cationic antimicrobial peptides based on characteristics responsible for their interaction with the membranes. ( 0,562505831415512 )
J Chem Inf Model - Criterion for evaluating the predictive ability of nonlinear regression models without cross-validation. ( 0,5597638696614 )
Artif Intell Med - Fuzzy model identification of dengue epidemic in Colombia based on multiresolution analysis. ( 0,558377080145284 )
J Chem Inf Model - In silico prediction of aqueous solubility using simple QSPR models: the importance of phenol and phenol-like moieties. ( 0,558304402765636 )
Med Decis Making - Developing a tuberculosis transmission model that accounts for changes in population health. ( 0,557527488886962 )
J Chem Inf Model - Potency prediction of β-secretase (BACE-1) inhibitors using density functional methods. ( 0,555261588294861 )
J Chem Inf Model - Binary classification of a large collection of environmental chemicals from estrogen receptor assays by quantitative structure-activity relationship and machine learning methods. ( 0,552462544473474 )
Comput Biol Chem - Targeting the Akt1 allosteric site to identify novel scaffolds through virtual screening. ( 0,551825405536469 )
J Chem Inf Model - Docking covalent inhibitors: a parameter free approach to pose prediction and scoring. ( 0,550754466530187 )
J Chem Inf Model - In silico prediction of total human plasma clearance. ( 0,549513915911792 )
J Chem Inf Model - Toward an optimal docking and free energy calculation scheme in ligand design with application to COX-1 inhibitors. ( 0,546301351241269 )
J Chem Inf Model - SiteBinder: an improved approach for comparing multiple protein structural motifs. ( 0,544693350877355 )
J Chem Inf Model - Kernel-based partial least squares: application to fingerprint-based QSAR with model visualization. ( 0,544566313197356 )
J Chem Inf Model - Improved docking of polypeptides with Glide. ( 0,544512262191992 )
J Chem Inf Model - Structure-based prediction of subtype selectivity of histamine H3 receptor selective antagonists in clinical trials. ( 0,543934792630434 )
Comput. Biol. Med. - Quantification of contributions of molecular fragments for eye irritation of organic chemicals using QSAR study. ( 0,54389048690153 )
Med Decis Making - Prediction of health preference values from CD4 counts in individuals with HIV. ( 0,543537512055952 )
J Am Med Inform Assoc - Choosing blindly but wisely: differentially private solicitation of DNA datasets for disease marker discovery. ( 0,54278394097479 )