J Chem Inf Model - Best of both worlds: combining pharma data and state of the art modeling technology to improve in silico pKa prediction.

Topics

{ model(2656) set(1616) predict(1553) }
{ method(1557) propos(1049) approach(1037) }
{ data(1714) softwar(1251) tool(1186) }
{ measur(2081) correl(1212) valu(896) }
{ error(1145) method(1030) estim(1020) }
{ compound(1573) activ(1297) structur(1058) }
{ network(2748) neural(1063) input(814) }
{ studi(2440) review(1878) systemat(933) }
{ algorithm(1844) comput(1787) effici(935) }
{ control(1307) perform(991) simul(935) }
{ sampl(1606) size(1419) use(1276) }
{ sequenc(1873) structur(1644) protein(1328) }
{ take(945) account(800) differ(722) }
{ assess(1506) score(1403) qualiti(1306) }
{ model(2220) cell(1177) simul(1124) }
{ featur(1941) imag(1645) propos(1176) }
{ group(2977) signific(1463) compar(1072) }
{ time(1939) patient(1703) rate(768) }
{ drug(1928) target(777) effect(648) }
{ implement(1333) system(1263) develop(1122) }
{ can(774) often(719) complex(702) }
{ bind(1733) structur(1185) ligand(1036) }
{ design(1359) user(1324) use(1319) }
{ howev(809) still(633) remain(590) }
{ model(2341) predict(2261) use(1141) }
{ studi(1119) effect(1106) posit(819) }
{ spatial(1525) area(1432) region(1030) }
{ health(3367) inform(1360) care(1135) }
{ monitor(1329) mobil(1314) devic(1160) }
{ research(1218) medic(880) student(794) }
{ data(2317) use(1299) case(1017) }
{ first(2504) two(1366) second(1323) }
{ can(981) present(881) function(850) }
{ use(976) code(926) identifi(902) }
{ activ(1452) weight(1219) physic(1104) }
{ method(1969) cluster(1462) data(1082) }
{ model(3404) distribut(989) bayesian(671) }
{ imag(1947) propos(1133) code(1026) }
{ data(1737) use(1416) pattern(1282) }
{ inform(2794) health(2639) internet(1427) }
{ system(1976) rule(880) can(841) }
{ imag(1057) registr(996) error(939) }
{ method(1219) similar(1157) match(930) }
{ featur(3375) classif(2383) classifi(1994) }
{ imag(2830) propos(1344) filter(1198) }
{ imag(2675) segment(2577) method(1081) }
{ patient(2315) diseas(1263) diabet(1191) }
{ motion(1329) object(1292) video(1091) }
{ treatment(1704) effect(941) patient(846) }
{ surgeri(1148) surgic(1085) robot(1054) }
{ framework(1458) process(801) describ(734) }
{ problem(2511) optim(1539) algorithm(950) }
{ chang(1828) time(1643) increas(1301) }
{ learn(2355) train(1041) set(1003) }
{ concept(1167) ontolog(924) domain(897) }
{ clinic(1479) use(1117) guidelin(835) }
{ extract(1171) text(1153) clinic(932) }
{ care(1570) inform(1187) nurs(1089) }
{ general(901) number(790) one(736) }
{ method(984) reconstruct(947) comput(926) }
{ search(2224) databas(1162) retriev(909) }
{ case(1353) use(1143) diagnosi(1136) }
{ data(3963) clinic(1234) research(1004) }
{ studi(1410) differ(1259) use(1210) }
{ risk(3053) factor(974) diseas(938) }
{ perform(999) metric(946) measur(919) }
{ research(1085) discuss(1038) issu(1018) }
{ system(1050) medic(1026) inform(1018) }
{ import(1318) role(1303) understand(862) }
{ visual(1396) interact(850) tool(830) }
{ perform(1367) use(1326) method(1137) }
{ blood(1257) pressur(1144) flow(957) }
{ record(1888) medic(1808) patient(1693) }
{ model(3480) simul(1196) paramet(876) }
{ ehr(2073) health(1662) electron(1139) }
{ state(1844) use(1261) util(961) }
{ patient(2837) hospit(1953) medic(668) }
{ age(1611) year(1155) adult(843) }
{ medic(1828) order(1363) alert(1069) }
{ signal(2180) analysi(812) frequenc(800) }
{ cost(1906) reduc(1198) effect(832) }
{ gene(2352) biolog(1181) express(1162) }
{ data(3008) multipl(1320) sourc(1022) }
{ intervent(3218) particip(2042) group(1664) }
{ activ(1138) subject(705) human(624) }
{ patient(1821) servic(1111) care(1106) }
{ use(2086) technolog(871) perceiv(783) }
{ analysi(2126) use(1163) compon(1037) }
{ health(1844) social(1437) communiti(874) }
{ structur(1116) can(940) graph(676) }
{ high(1669) rate(1365) level(1280) }
{ cancer(2502) breast(956) screen(824) }
{ use(1733) differ(960) four(931) }
{ result(1111) use(1088) new(759) }
{ survey(1388) particip(1329) question(1065) }
{ estim(2440) model(1874) function(577) }
{ decis(3086) make(1611) patient(1517) }
{ process(1125) use(805) approach(778) }
{ method(2212) result(1239) propos(1039) }
{ detect(2391) sensit(1101) algorithm(908) }

Abstract

In a unique collaboration between a software company and a pharmaceutical company, we were able to develop a new in silico pKa prediction tool with outstanding prediction quality. An existing pKa prediction method from Simulations Plus based on artificial neural network ensembles (ANNE), microstates analysis, and literature data was retrained with a large homogeneous data set of drug-like molecules from Bayer. The new model was thus built with curated sets of ~14,000 literature pKa values (~11,000 compounds, representing literature chemical space) and ~19,500 pKa values experimentally determined at Bayer Pharma (~16,000 compounds, representing industry chemical space). Model validation was performed with several test sets consisting of a total of ~31,000 new pKa values measured at Bayer. For the largest and most difficult test set with >16,000 pKa values that were not used for training, the original model achieved a mean absolute error (MAE) of 0.72, root-mean-square error (RMSE) of 0.94, and squared correlation coefficient (R²) of 0.87. The new model achieves significantly improved prediction statistics, with MAE = 0.50, RMSE = 0.67, and R² = 0.93. It is commercially available as part of the Simulations Plus ADMET Predictor release 7.0. Good predictions are only of value when delivered effectively to those who can use them. The new pKa prediction model has been integrated into Pipeline Pilot and the PharmacophorInformatics (PIx) platform used by scientists at Bayer Pharma. Different output formats allow customized application by medicinal chemists, physical chemists, and computational chemists.
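
The three validation statistics quoted in the abstract (MAE, RMSE, and the squared correlation coefficient) can be computed from paired measured and predicted pKa values. The following is a minimal sketch, not the authors' evaluation code: the function name `regression_metrics` is hypothetical, and the squared correlation coefficient is taken here to be the squared Pearson correlation, which is an assumption about the abstract's definition.

```python
import math


def regression_metrics(observed, predicted):
    """Return (MAE, RMSE, squared Pearson correlation) for paired values.

    Hypothetical helper for illustration; not taken from the paper.
    """
    if len(observed) != len(predicted) or len(observed) < 2:
        raise ValueError("need two equally long sequences with at least 2 values")
    n = len(observed)

    # Signed prediction errors (predicted minus observed).
    errors = [p - o for o, p in zip(observed, predicted)]
    mae = sum(abs(e) for e in errors) / n
    rmse = math.sqrt(sum(e * e for e in errors) / n)

    # Squared Pearson correlation between observed and predicted values.
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    if var_o == 0 or var_p == 0:
        raise ValueError("correlation is undefined for constant input")
    r_squared = cov * cov / (var_o * var_p)

    return mae, rmse, r_squared


# Toy values invented for illustration only; they are not data from the study.
measured_pka = [4.0, 7.0, 9.0]
predicted_pka = [4.5, 6.5, 10.0]
mae, rmse, r_squared = regression_metrics(measured_pka, predicted_pka)
print(f"MAE={mae:.2f} RMSE={rmse:.2f} R2={r_squared:.2f}")
```

Because RMSE squares each error before averaging, it is never smaller than MAE and penalizes large outliers more heavily, which is consistent with the RMSE values in the abstract exceeding the corresponding MAE values.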

Cleaned Abstract

uniqu collabor softwar compani pharmaceut compani abl develop new silico pka predict tool outstand predict qualiti exist pka predict method simul plus base artifici neural network ensembl ann microst analysi literatur data retrain larg homogen data set druglik molecul bayer new model thus built curat set literatur pka valu compound repres literatur chemic space pka valu experiment determin bayer pharma compound repres industri chemic space model valid perform sever test set consist total new pka valu measur bayer largest difficult test set pka valu use train origin model achiev mean absolut error mae rootmeansquar error rmse squar correl coeffici r new model achiev signific improv predict statist mae rmse r commerci avail part simul plus admet predictor releas good predict valu deliv effect can use new pka predict model integr pipelin pilot pharmacophorinformat pix platform use scientist bayer pharma differ output format allow custom applic medicin chemist physic chemist comput chemist
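
The cleaned text above looks like the result of lowercasing, stripping punctuation and digits, removing stopwords, and stemming. The page does not say which tools were used, so the following is only a rough sketch of the first three steps, with the stemming step left out; the stopword list and the function name `clean_text` are hypothetical.

```python
import re

# Hypothetical, deliberately tiny stopword list; the list actually used
# to produce the cleaned abstract is not stated on the page.
STOPWORDS = {"a", "an", "and", "in", "is", "of", "the", "to", "with"}


def clean_text(text):
    """Lowercase, keep only letters and whitespace, then drop stopwords.

    Stemming (e.g. "prediction" -> "predict") is intentionally omitted.
    """
    letters_only = re.sub(r"[^a-z\s]", "", text.lower())
    return [token for token in letters_only.split() if token not in STOPWORDS]


tokens = clean_text("A new in silico pKa prediction tool with MAE = 0.50.")
print(" ".join(tokens))
```

Removing every non-letter character also joins hyphenated words, which matches tokens such as "druglik" and "rootmeansquar" in the cleaned abstract.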

Similar Abstracts

J Chem Inf Model - Real external predictivity of QSAR models: how to evaluate it? Comparison of different validation criteria and proposal of using the concordance correlation coefficient. ( 0,784092746229628 )
J Chem Inf Model - Time-split cross-validation as a method for estimating the goodness of prospective prediction. ( 0,730425178435515 )
Int J Health Geogr - Incorporating geographical factors with artificial neural networks to predict reference values of erythrocyte sedimentation rate. ( 0,703115527332537 )
J Chem Inf Model - GRID-based three-dimensional pharmacophores II: PharmBench, a benchmark data set for evaluating pharmacophore elucidation methods. ( 0,693481017837667 )
Comput. Aided Surg. - Evaluation of a computational model to predict elbow range of motion. ( 0,690648441294965 )
Artif Intell Med - Training artificial neural networks directly on the concordance index for censored data using genetic algorithms. ( 0,686395869700409 )
J Chem Inf Model - Pharmacophore assessment through 3-D QSAR: evaluation of the predictive ability on new derivatives by the application on a series of antitubercular agents. ( 0,683067709419748 )
BMC Med Inform Decis Mak - Concordance and predictive value of two adverse drug event data sets. ( 0,682942044304539 )
J Chem Inf Model - Three useful dimensions for domain applicability in QSAR models using random forest. ( 0,679317023932821 )
J Chem Inf Model - Development of novel 3D-QSAR combination approach for screening and optimizing B-Raf inhibitors in silico. ( 0,678823544199408 )
J. Comput. Biol. - The complexity of the dirichlet model for multiple alignment data. ( 0,678386130834518 )
J Chem Inf Model - iLOGP: a simple, robust, and efficient description of n-octanol/water partition coefficient for drug design using the GB/SA approach. ( 0,674679951570702 )
J Chem Inf Model - Predicting pK(a) values of substituted phenols from atomic charges: comparison of different quantum mechanical methods and charge distribution schemes. ( 0,668176308506979 )
J Chem Inf Model - Beyond the scope of Free-Wilson analysis: building interpretable QSAR models with machine learning algorithms. ( 0,666071915788368 )
AMIA Annu Symp Proc - Effect of data combination on predictive modeling: a study using gene expression data. ( 0,66408421901121 )
J Chem Inf Model - Does rational selection of training and test sets improve the outcome of QSAR modeling? ( 0,66316586819265 )
J Chem Inf Model - Study of chromatographic retention of natural terpenoids by chemoinformatic tools. ( 0,662749787668662 )
J Chem Inf Model - Molecular modeling of the 3D structure of 5-HT(1A)R: discovery of novel 5-HT(1A)R agonists via dynamic pharmacophore-based virtual screening. ( 0,6612993827701 )
AMIA Annu Symp Proc - Predicting the dengue incidence in Singapore using univariate time series models. ( 0,661030814651583 )
J Am Med Inform Assoc - Choosing blindly but wisely: differentially private solicitation of DNA datasets for disease marker discovery. ( 0,653643986598878 )
Med Biol Eng Comput - Development of a comprehensive musculoskeletal model of the shoulder and elbow. ( 0,651570070579759 )
Int J Health Geogr - Comparative analysis of remotely-sensed data products via ecological niche modeling of avian influenza case occurrences in Middle Eastern poultry. ( 0,651280145165388 )
J Chem Inf Model - QSAR modeling of imbalanced high-throughput screening data in PubChem. ( 0,641948900914901 )
J Chem Inf Model - Statistical analysis and compound selection of combinatorial libraries for soluble epoxide hydrolase. ( 0,639496721254433 )
BMC Med Inform Decis Mak - Regression tree construction by bootstrap: model search for DRG-systems applied to Austrian health-data. ( 0,637789742726082 )
Int J Comput Assist Radiol Surg - Hybrid image visualization tool for 3D integration of CT coronary anatomy and quantitative myocardial perfusion PET. ( 0,634213717464149 )
J Chem Inf Model - RS-Predictor models augmented with SMARTCyp reactivities: robust metabolic regioselectivity predictions for nine CYP isozymes. ( 0,634125317288223 )
J Chem Inf Model - Molecular dynamics simulation and binding energy calculation for estimation of oligonucleotide duplex thermostability in RNA-based therapeutics. ( 0,625352829191382 )
Artif Intell Med - Fuzzy model identification of dengue epidemic in Colombia based on multiresolution analysis. ( 0,625273739016424 )
AMIA Annu Symp Proc - Motivating the additional use of external validity: examining transportability in a model of glioblastoma multiforme. ( 0,624862548575682 )
J Chem Inf Model - Hsp90 inhibitors, part 1: definition of 3-D QSAutogrid/R models as a tool for virtual screening. ( 0,622981824744825 )
J Chem Inf Model - Design and synthesis of new antioxidants predicted by the model developed on a set of pulvinic acid derivatives. ( 0,621309948962406 )
J Chem Inf Model - Leave-cluster-out cross-validation is appropriate for scoring functions derived from diverse protein data sets. ( 0,619140594692297 )
Comput. Biol. Med. - Artificial neural network modelling of the results of tympanoplasty in chronic suppurative otitis media patients. ( 0,616486366610158 )
J Chem Inf Model - Analysis and study of molecule data sets using snowflake diagrams of weighted maximum common subgraph trees. ( 0,615381404361172 )
Int J Med Inform - Design and implementation of I2Vote--an interactive image-based voting system using windows mobile devices. ( 0,615027955747062 )
J Chem Inf Model - Rank order entropy: why one metric is not enough. ( 0,613639497673491 )
J Chem Inf Model - In silico prediction of aqueous solubility using simple QSPR models: the importance of phenol and phenol-like moieties. ( 0,612969155572692 )
J Chem Inf Model - Classification of compounds with distinct or overlapping multi-target activities and diverse molecular mechanisms using emerging chemical patterns. ( 0,612396846133119 )
J Chem Inf Model - Impact of template choice on homology model efficiency in virtual screening. ( 0,611021202922961 )
J Chem Inf Model - In silico prediction of total human plasma clearance. ( 0,610507308209872 )
J Chem Inf Model - Applicability Domain ANalysis (ADAN): a robust method for assessing the reliability of drug property predictions. ( 0,610452745319486 )
J Chem Inf Model - CSAR data set release 2012: ligands, affinities, complexes, and docking decoys. ( 0,60900363215893 )
J Chem Inf Model - How accurately can we predict the melting points of drug-like compounds? ( 0,607864038027539 )
J Chem Inf Model - Experimental and computational prediction of glass transition temperature of drugs. ( 0,607415648144127 )
Comput Math Methods Med - Multiscale autoregressive identification of neuroelectrophysiological systems. ( 0,602579148109363 )
J Chem Inf Model - Building a three-dimensional model of CYP2C9 inhibition using the Autocorrelator: an autonomous model generator. ( 0,600905882540644 )
J Chem Inf Model - Comparative studies on some metrics for external validation of QSPR models. ( 0,600627026334123 )
J Chem Inf Model - Four-dimensional structure-activity relationship model to predict HIV-1 integrase strand transfer inhibition using LQTA-QSAR methodology. ( 0,597895692991132 )
J Chem Inf Model - Coping with unbalanced class data sets in oral absorption models. ( 0,597417594396654 )
Comput Methods Programs Biomed - Kinetic modelling of haemodialysis removal of myoglobin in rhabdomyolysis patients. ( 0,597130904113 )
J Chem Inf Model - Design of novel FLT-3 inhibitors based on dual-layer 3D-QSAR model and fragment-based compounds in silico. ( 0,596846494964278 )
J Chem Inf Model - Binary classification of a large collection of environmental chemicals from estrogen receptor assays by quantitative structure-activity relationship and machine learning methods. ( 0,596445249966904 )
J Chem Inf Model - Applicability domains for classification problems: Benchmarking of distance to models for Ames mutagenicity set. ( 0,596017647213229 )
J Am Med Inform Assoc - Harvest: an open platform for developing web-based biomedical data discovery and reporting applications. ( 0,594534300466736 )
Med Decis Making - Prediction of health preference values from CD4 counts in individuals with HIV. ( 0,594437092741645 )
AMIA Annu Symp Proc - Advanced proficiency EHR training: effect on physicians' EHR efficiency, EHR satisfaction and job satisfaction. ( 0,59367773151533 )
J Chem Inf Model - In silico prediction of chemical Ames mutagenicity. ( 0,5925388755612 )
IEEE Trans Image Process - Incremental N-mode SVD for large-scale multilinear generative models. ( 0,591944239683225 )
Comput. Biol. Med. - A prediction model of substrates and non-substrates of breast cancer resistance protein (BCRP) developed by GA-CG-SVM method. ( 0,591362025463241 )
Int J Comput Assist Radiol Surg - Assessing performance in brain tumor resection using a novel virtual reality simulator. ( 0,590147748799177 )
J Chem Inf Model - Oversampling to overcome overfitting: exploring the relationship between data set composition, molecular descriptors, and predictive modeling methods. ( 0,588983606639588 )
J Chem Inf Model - Automated building of organometallic complexes from 3D fragments. ( 0,588421772512358 )
J Biomed Inform - MysiRNA: improving siRNA efficacy prediction using a machine-learning model combining multi-tools and whole stacking energy (G). ( 0,587094141958576 )
J Chem Inf Model - Benchmarking study of parameter variation when using signature fingerprints together with support vector machines. ( 0,586854039572514 )
J Chem Inf Model - Real external predictivity of QSAR models. Part 2. New intercomparable thresholds for different validation criteria and the need for scatter plot inspection. ( 0,586015738790755 )
J Chem Inf Model - Predictions of BuChE inhibitors using support vector machine and naive Bayesian classification techniques in drug discovery. ( 0,585306268760344 )
Comput Methods Programs Biomed - Predicting body fat percentage based on gender, age and BMI by using artificial neural networks. ( 0,580389598431201 )
Curr Comput Aided Drug Des - QSAR Models for the Reactivation of Sarin Inhibited AChE by Quaternary Pyridinium Oximes Based on Monte Carlo Method. ( 0,579980387612418 )
J Clin Monit Comput - Evaluation of a computer program for non-invasive determination of pulmonary shunt and ventilation-perfusion mismatch. ( 0,57982300191323 )
J Chem Inf Model - A comparison of different QSAR approaches to modeling CYP450 1A2 inhibition. ( 0,579429591102412 )
J Chem Inf Model - Binary classification of aqueous solubility using support vector machines with reduction and recombination feature selection. ( 0,57455355179869 )
J Chem Inf Model - Predicting myelosuppression of drugs from in silico models. ( 0,572894803598918 )
J Chem Inf Model - Combined 3D-QSAR, molecular docking, and molecular dynamics study on piperazinyl-glutamate-pyridines/pyrimidines as potent P2Y12 antagonists for inhibition of platelet aggregation. ( 0,572878009000665 )
Spat Spatiotemporal Epidemiol - Spatial modelling of disease using data- and knowledge-driven approaches. ( 0,570172165751455 )
J Am Med Inform Assoc - Reconciliation of the cloud computing model with US federal electronic health record regulations. ( 0,569236288678556 )
J Chem Inf Model - Kinase-kernel models: accurate in silico screening of 4 million compounds across the entire human kinome. ( 0,564248876463055 )
J Chem Inf Model - A new approach to radial basis function approximation and its application to QSAR. ( 0,560189036634604 )
J Chem Inf Model - Robust scoring functions for protein-ligand interactions with quantum chemical charge models. ( 0,559013156534433 )
BMC Med Inform Decis Mak - Measuring preferences for analgesic treatment for cancer pain: how do African-Americans and Whites perform on choice-based conjoint (CBC) analysis experiments? ( 0,555289476396903 )
Int J Comput Assist Radiol Surg - Optimized order estimation for autoregressive models to predict respiratory motion. ( 0,554763849020574 )
J. Med. Internet Res. - A case study of the New York City 2012-2013 influenza season with daily geocoded Twitter data from temporal and spatiotemporal perspectives. ( 0,552963051028356 )
J Chem Inf Model - Using random forest to model the domain applicability of another random forest model. ( 0,551551711511461 )
J Chem Inf Model - Prediction of linear cationic antimicrobial peptides based on characteristics responsible for their interaction with the membranes. ( 0,54921941763887 )
Brief. Bioinformatics - Rediscovery rate estimation for assessing the validation of significant findings in high-throughput studies. ( 0,547696161566949 )
J Chem Inf Model - Calculation of aqueous solubility of crystalline un-ionized organic chemicals and drugs based on structural similarity and physicochemical descriptors. ( 0,547069894824052 )
Lifetime Data Anal - Analysis of cure rate survival data under proportional odds model. ( 0,546634405736363 )
Comput Biol Chem - Monte Carlo-based rigid body modelling of large protein complexes against small angle scattering data. ( 0,542702066942431 )
J Chem Inf Model - Profile-QSAR and Surrogate AutoShim protein-family modeling of proteases. ( 0,539896377605043 )
J. Comput. Biol. - Boolean models can explain bistability in the lac operon. ( 0,53957711414424 )
J. Comput. Biol. - An almost optimal algorithm for generalized threshold group testing with inhibitors. ( 0,539414256949683 )
J Chem Inf Model - Profile-QSAR: a novel meta-QSAR method that combines activities across the kinase family to accurately predict affinity, selectivity, and cellular activity. ( 0,537607840910696 )
Int J Health Geogr - A linear programming model for preserving privacy when disclosing patient spatial information for secondary purposes. ( 0,537468563403209 )
Med Biol Eng Comput - Share and enjoy: anatomical models database--generating and sharing cardiovascular model data using web services. ( 0,53735008566576 )
Artif Intell Med - Image partitioning and illumination in image-based pose detection for teleoperated flexible endoscopes. ( 0,534324782861036 )
J Chem Inf Model - Viscosity of ionic liquids: an extensive database and a new group contribution model based on a feed-forward artificial neural network. ( 0,532982379203663 )
Comput. Biol. Med. - Comprehension of drug toxicity: software and databases. ( 0,532834268966843 )
Comput Biol Chem - Homology modeling, binding site identification and docking in flavone hydroxylase CYP105P2 in Streptomyces peucetius ATCC 27952. ( 0,532686343765622 )
J Chem Inf Model - Predictive models for cytochrome p450 isozymes based on quantitative high throughput screening data. ( 0,532296045259832 )
Comput. Biol. Med. - Quantification of contributions of molecular fragments for eye irritation of organic chemicals using QSAR study. ( 0,532056150362085 )