J Chem Inf Model - Comparison of random forest and Pipeline Pilot Naïve Bayes in prospective QSAR predictions.

Topics

{ model(2656) set(1616) predict(1553) }
{ model(2341) predict(2261) use(1141) }
{ howev(809) still(633) remain(590) }
{ imag(2830) propos(1344) filter(1198) }
{ compound(1573) activ(1297) structur(1058) }
{ control(1307) perform(991) simul(935) }
{ group(2977) signific(1463) compar(1072) }
{ implement(1333) system(1263) develop(1122) }
{ model(3480) simul(1196) paramet(876) }
{ activ(1452) weight(1219) physic(1104) }
{ can(774) often(719) complex(702) }
{ featur(3375) classif(2383) classifi(1994) }
{ featur(1941) imag(1645) propos(1176) }
{ method(2212) result(1239) propos(1039) }
{ method(1219) similar(1157) match(930) }
{ patient(2315) diseas(1263) diabet(1191) }
{ take(945) account(800) differ(722) }
{ problem(2511) optim(1539) algorithm(950) }
{ learn(2355) train(1041) set(1003) }
{ general(901) number(790) one(736) }
{ case(1353) use(1143) diagnosi(1136) }
{ monitor(1329) mobil(1314) devic(1160) }
{ state(1844) use(1261) util(961) }
{ sampl(1606) size(1419) use(1276) }
{ data(3008) multipl(1320) sourc(1022) }
{ activ(1138) subject(705) human(624) }
{ decis(3086) make(1611) patient(1517) }
{ process(1125) use(805) approach(778) }
{ model(3404) distribut(989) bayesian(671) }
{ imag(1947) propos(1133) code(1026) }
{ data(1737) use(1416) pattern(1282) }
{ inform(2794) health(2639) internet(1427) }
{ system(1976) rule(880) can(841) }
{ measur(2081) correl(1212) valu(896) }
{ imag(1057) registr(996) error(939) }
{ bind(1733) structur(1185) ligand(1036) }
{ sequenc(1873) structur(1644) protein(1328) }
{ network(2748) neural(1063) input(814) }
{ imag(2675) segment(2577) method(1081) }
{ studi(2440) review(1878) systemat(933) }
{ motion(1329) object(1292) video(1091) }
{ assess(1506) score(1403) qualiti(1306) }
{ treatment(1704) effect(941) patient(846) }
{ surgeri(1148) surgic(1085) robot(1054) }
{ framework(1458) process(801) describ(734) }
{ error(1145) method(1030) estim(1020) }
{ chang(1828) time(1643) increas(1301) }
{ concept(1167) ontolog(924) domain(897) }
{ clinic(1479) use(1117) guidelin(835) }
{ algorithm(1844) comput(1787) effici(935) }
{ extract(1171) text(1153) clinic(932) }
{ method(1557) propos(1049) approach(1037) }
{ data(1714) softwar(1251) tool(1186) }
{ design(1359) user(1324) use(1319) }
{ model(2220) cell(1177) simul(1124) }
{ care(1570) inform(1187) nurs(1089) }
{ method(984) reconstruct(947) comput(926) }
{ search(2224) databas(1162) retriev(909) }
{ data(3963) clinic(1234) research(1004) }
{ studi(1410) differ(1259) use(1210) }
{ risk(3053) factor(974) diseas(938) }
{ perform(999) metric(946) measur(919) }
{ research(1085) discuss(1038) issu(1018) }
{ system(1050) medic(1026) inform(1018) }
{ import(1318) role(1303) understand(862) }
{ visual(1396) interact(850) tool(830) }
{ perform(1367) use(1326) method(1137) }
{ studi(1119) effect(1106) posit(819) }
{ blood(1257) pressur(1144) flow(957) }
{ spatial(1525) area(1432) region(1030) }
{ record(1888) medic(1808) patient(1693) }
{ health(3367) inform(1360) care(1135) }
{ ehr(2073) health(1662) electron(1139) }
{ research(1218) medic(880) student(794) }
{ patient(2837) hospit(1953) medic(668) }
{ data(2317) use(1299) case(1017) }
{ age(1611) year(1155) adult(843) }
{ medic(1828) order(1363) alert(1069) }
{ signal(2180) analysi(812) frequenc(800) }
{ cost(1906) reduc(1198) effect(832) }
{ gene(2352) biolog(1181) express(1162) }
{ first(2504) two(1366) second(1323) }
{ intervent(3218) particip(2042) group(1664) }
{ time(1939) patient(1703) rate(768) }
{ patient(1821) servic(1111) care(1106) }
{ use(2086) technolog(871) perceiv(783) }
{ can(981) present(881) function(850) }
{ analysi(2126) use(1163) compon(1037) }
{ health(1844) social(1437) communiti(874) }
{ structur(1116) can(940) graph(676) }
{ high(1669) rate(1365) level(1280) }
{ cancer(2502) breast(956) screen(824) }
{ use(976) code(926) identifi(902) }
{ use(1733) differ(960) four(931) }
{ drug(1928) target(777) effect(648) }
{ result(1111) use(1088) new(759) }
{ survey(1388) particip(1329) question(1065) }
{ estim(2440) model(1874) function(577) }
{ method(1969) cluster(1462) data(1082) }
{ detect(2391) sensit(1101) algorithm(908) }

Abstract

Random forest is currently considered one of the best QSAR methods available in terms of accuracy of prediction. However, it is computationally intensive. Naïve Bayes is a simple, robust classification method. The Laplacian-modified Naïve Bayes implementation is the preferred QSAR method in the widely used commercial chemoinformatics platform Pipeline Pilot. We made a comparison of the ability of Pipeline Pilot Naïve Bayes (PLPNB) and random forest to make accurate predictions on 18 large, diverse in-house QSAR data sets. These include on-target and ADME-related activities. These data sets were set up as classification problems with either binary or multicategory activities. We used a time-split method of dividing training and test sets, as we feel this is a realistic way of simulating prospective prediction. PLPNB is computationally efficient. However, random forest predictions are at least as good and in many cases significantly better than those of PLPNB on our data sets. PLPNB performs better with ECFP4 and ECFP6 descriptors, which are native to Pipeline Pilot, and more poorly with other descriptors we tried.
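
The time-split idea mentioned in the abstract can be illustrated with a minimal sketch. It assumes each compound record carries a registration or assay date; the field names, the toy records, and the 25% test fraction are hypothetical choices for illustration, since the abstract does not give the authors' actual split proportions or data layout.

```python
from datetime import date


def time_split(records, test_fraction=0.25, key=lambda r: r["date"]):
    """Split records chronologically: oldest go to training, newest to test.

    This mimics prospective prediction, where a model built on past
    compounds is asked to predict compounds made later.
    """
    if not 0.0 < test_fraction < 1.0:
        raise ValueError("test_fraction must be strictly between 0 and 1")
    ordered = sorted(records, key=key)
    # Hold out at least one record so the test set is never empty.
    n_test = max(1, round(len(ordered) * test_fraction))
    return ordered[:-n_test], ordered[-n_test:]


# Hypothetical toy data: IDs, dates, and activity labels are made up.
compounds = [
    {"id": "cpd-5", "date": date(2011, 9, 30), "active": 0},
    {"id": "cpd-1", "date": date(2009, 3, 14), "active": 0},
    {"id": "cpd-8", "date": date(2013, 5, 27), "active": 1},
    {"id": "cpd-3", "date": date(2010, 7, 22), "active": 0},
    {"id": "cpd-6", "date": date(2012, 4, 18), "active": 1},
    {"id": "cpd-2", "date": date(2010, 1, 5), "active": 1},
    {"id": "cpd-7", "date": date(2012, 11, 2), "active": 0},
    {"id": "cpd-4", "date": date(2011, 2, 11), "active": 1},
]

train, test = time_split(compounds, test_fraction=0.25)
print("train:", [r["id"] for r in train])
print("test: ", [r["id"] for r in test])
```

Unlike a random split, every test compound here is newer than every training compound, so the test set cannot contain close analogues that were synthesized before the model's training data ended.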

Cleaned Abstract

random forest current consid one best qsar method avail term accuraci predict howev comput intens nave bay simpl robust classif method laplacianmodifi nave bay implement prefer qsar method wide use commerci chemoinformat platform pipelin pilot made comparison abil pipelin pilot nave bay plpnb random forest make accur predict larg divers inhous qsar data set includ ontarget admerel activ data set set classif problem either binari multicategori activ use timesplit method divid train test set feel realist way simul prospect predict plpnb comput effici howev random forest predict least good mani case signific better plpnb data set plpnb perform better ecfp ecfp descriptor nativ pipelin pilot poor descriptor tri

Similar Abstracts

J Chem Inf Model - Binary classification of aqueous solubility using support vector machines with reduction and recombination feature selection. ( 0,736340997385449 )
J Chem Inf Model - Experimental and computational prediction of glass transition temperature of drugs. ( 0,73010828615943 )
BMC Med Inform Decis Mak - Regression tree construction by bootstrap: model search for DRG-systems applied to Austrian health-data. ( 0,685045639083756 )
Comput Math Methods Med - Screening for prediabetes using machine learning models. ( 0,670014857890975 )
Med Decis Making - Constructing proper ROCs from ordinal response data using weighted power functions. ( 0,660222735349228 )
Int J Health Geogr - Prediction of high-risk areas for visceral leishmaniasis using socioeconomic indicators and remote sensing data. ( 0,651171160681633 )
J Chem Inf Model - Are bigger data sets better for machine learning? Fusing single-point and dual-event dose response data for Mycobacterium tuberculosis. ( 0,643769319857833 )
J Chem Inf Model - Analysis and study of molecule data sets using snowflake diagrams of weighted maximum common subgraph trees. ( 0,638837428736078 )
J Med Syst - Utilization of electronic medical records to build a detection model for surveillance of healthcare-associated urinary tract infections. ( 0,631468630285088 )
J Chem Inf Model - Quantitative structure-activity relationship models for ready biodegradability of chemicals. ( 0,629345834928179 )
Med Decis Making - Performance of a mathematical model to forecast lives saved from HIV treatment expansion in resource-limited settings. ( 0,616220212542233 )
J Chem Inf Model - Does rational selection of training and test sets improve the outcome of QSAR modeling? ( 0,610357676475417 )
J Chem Inf Model - FAst MEtabolizer (FAME): A rapid and accurate predictor of sites of metabolism in multiple species by endogenous enzymes. ( 0,609514299797993 )
Comput. Biol. Med. - Artificial neural network modelling of the results of tympanoplasty in chronic suppurative otitis media patients. ( 0,608456494205225 )
Med Biol Eng Comput - Application of the RIMARC algorithm to a large data set of action potentials and clinical parameters for risk prediction of atrial fibrillation. ( 0,605386449053437 )
J Chem Inf Model - Binary classification of a large collection of environmental chemicals from estrogen receptor assays by quantitative structure-activity relationship and machine learning methods. ( 0,600843527324144 )
J Chem Inf Model - A new approach to radial basis function approximation and its application to QSAR. ( 0,598086706264218 )
BMC Med Inform Decis Mak - Bayesian predictors of very poor health related quality of life and mortality in patients with COPD. ( 0,59565749353554 )
J Chem Inf Model - Ligand efficiency-based support vector regression models for predicting bioactivities of ligands to drug target proteins. ( 0,592995656911118 )
J Chem Inf Model - Impact of template choice on homology model efficiency in virtual screening. ( 0,590882052138332 )
J Chem Inf Model - Statistical analysis and compound selection of combinatorial libraries for soluble epoxide hydrolase. ( 0,588350528599701 )
IEEE Trans Pattern Anal Mach Intell - Understanding Blind Deconvolution Algorithms. ( 0,588029986536069 )
J Chem Inf Model - Pharmacophore assessment through 3-D QSAR: evaluation of the predictive ability on new derivatives by the application on a series of antitubercular agents. ( 0,587787016268554 )
J Chem Inf Model - iLOGP: a simple, robust, and efficient description of n-octanol/water partition coefficient for drug design using the GB/SA approach. ( 0,586671921673575 )
J Chem Inf Model - In silico prediction of total human plasma clearance. ( 0,583410420303591 )
Comput Math Methods Med - Variable selection in ROC regression. ( 0,583045106765219 )
J Chem Inf Model - Classification of compounds with distinct or overlapping multi-target activities and diverse molecular mechanisms using emerging chemical patterns. ( 0,575546442807194 )
Med Decis Making - Adaptation of clinical prediction models for application in local settings. ( 0,575139576285739 )
Comput. Biol. Med. - A leave-one-out cross-validation SAS macro for the identification of markers associated with survival. ( 0,57443434009739 )
J Chem Inf Model - Time-split cross-validation as a method for estimating the goodness of prospective prediction. ( 0,568608996919628 )
J Chem Inf Model - Prediction of compound potency changes in matched molecular pairs using support vector regression. ( 0,567461825156132 )
Artif Intell Med - Training artificial neural networks directly on the concordance index for censored data using genetic algorithms. ( 0,567425396780971 )
J Chem Inf Model - Hsp90 inhibitors, part 1: definition of 3-D QSAutogrid/R models as a tool for virtual screening. ( 0,565672335162354 )
BMC Med Inform Decis Mak - Filtering data from the collaborative initial glaucoma treatment study for improved identification of glaucoma progression. ( 0,564690251005247 )
J Chem Inf Model - GRID-based three-dimensional pharmacophores II: PharmBench, a benchmark data set for evaluating pharmacophore elucidation methods. ( 0,563917547189571 )
J Chem Inf Model - Coping with unbalanced class data sets in oral absorption models. ( 0,559469754343063 )
AMIA Annu Symp Proc - Effect of data combination on predictive modeling: a study using gene expression data. ( 0,559303999116093 )
J Chem Inf Model - Predictive toxicology modeling: protocols for exploring hERG classification and Tetrahymena pyriformis end point predictions. ( 0,558246585630382 )
J Chem Inf Model - Predictions of BuChE inhibitors using support vector machine and naive Bayesian classification techniques in drug discovery. ( 0,557897914927119 )
J Chem Inf Model - Three useful dimensions for domain applicability in QSAR models using random forest. ( 0,557116210002228 )
J Chem Inf Model - Using random forest to model the domain applicability of another random forest model. ( 0,55648466642585 )
J Chem Inf Model - Study of chromatographic retention of natural terpenoids by chemoinformatic tools. ( 0,556309121580432 )
J Chem Inf Model - Design and synthesis of new antioxidants predicted by the model developed on a set of pulvinic acid derivatives. ( 0,555114264648027 )
AMIA Annu Symp Proc - Motivating the additional use of external validity: examining transportability in a model of glioblastoma multiforme. ( 0,554895985533271 )
BMC Med Inform Decis Mak - Measuring preferences for analgesic treatment for cancer pain: how do African-Americans and Whites perform on choice-based conjoint (CBC) analysis experiments? ( 0,554363428540884 )
J Chem Inf Model - Oversampling to overcome overfitting: exploring the relationship between data set composition, molecular descriptors, and predictive modeling methods. ( 0,552180929858917 )
J Chem Inf Model - Beyond the scope of Free-Wilson analysis: building interpretable QSAR models with machine learning algorithms. ( 0,551424388979893 )
J Chem Inf Model - In silico prediction of aqueous solubility using simple QSPR models: the importance of phenol and phenol-like moieties. ( 0,550929383450959 )
J Chem Inf Model - Applicability Domain ANalysis (ADAN): a robust method for assessing the reliability of drug property predictions. ( 0,550763301715119 )
J Chem Inf Model - Design of novel FLT-3 inhibitors based on dual-layer 3D-QSAR model and fragment-based compounds in silico. ( 0,548586621172435 )
Artif Intell Med - NICeSim: an open-source simulator based on machine learning techniques to support medical research on prenatal and perinatal care decision making. ( 0,548112955295552 )
J. Comput. Biol. - The complexity of the dirichlet model for multiple alignment data. ( 0,547927602797375 )
BMC Med Inform Decis Mak - Concordance and predictive value of two adverse drug event data sets. ( 0,547509146047406 )
J Chem Inf Model - RS-Predictor models augmented with SMARTCyp reactivities: robust metabolic regioselectivity predictions for nine CYP isozymes. ( 0,545447196671908 )
J Biomed Inform - MysiRNA: improving siRNA efficacy prediction using a machine-learning model combining multi-tools and whole stacking energy (G). ( 0,544346414058434 )
Comput Math Methods Med - SNP selection in genome-wide association studies via penalized support vector machine with MAX test. ( 0,544340386929133 )
Comput. Biol. Med. - Three dimensional quantitative structure-toxicity relationship modeling and prediction of acute toxicity for organic contaminants to algae. ( 0,543951753270258 )
Int J Health Geogr - Comparative analysis of remotely-sensed data products via ecological niche modeling of avian influenza case occurrences in Middle Eastern poultry. ( 0,543505395788907 )
J Chem Inf Model - A Bayesian approach to in silico blood-brain barrier penetration modeling. ( 0,542819635191584 )
Artif Intell Med - A machine learning-based approach to prognostic analysis of thoracic transplantations. ( 0,533980716650031 )
J Chem Inf Model - Predictive models for cytochrome p450 isozymes based on quantitative high throughput screening data. ( 0,532939133763097 )
J Chem Inf Model - Best of both worlds: combining pharma data and state of the art modeling technology to improve in Silico pKa prediction. ( 0,531780941933047 )
J Chem Inf Model - How accurately can we predict the melting points of drug-like compounds? ( 0,528946564077855 )
J Chem Inf Model - Automated building of organometallic complexes from 3D fragments. ( 0,525514729739112 )
J Chem Inf Model - Predicting pK(a) values of substituted phenols from atomic charges: comparison of different quantum mechanical methods and charge distribution schemes. ( 0,522213790372963 )
J Chem Inf Model - Applicability domain based on ensemble learning in classification and regression analyses. ( 0,5199927855672 )
J Chem Inf Model - Comparative studies on some metrics for external validation of QSPR models. ( 0,519762105242685 )
J Chem Inf Model - Profile-QSAR and Surrogate AutoShim protein-family modeling of proteases. ( 0,518347779357135 )
J. Comput. Biol. - Prediction of siRNA potency using sparse logistic regression. ( 0,51811007055549 )
J Chem Inf Model - Rank order entropy: why one metric is not enough. ( 0,517299731823057 )
Med Biol Eng Comput - Validating motor unit firing patterns extracted by EMG signal decomposition. ( 0,516952421205086 )
J Biomed Inform - An empirical approach to model selection through validation for censored survival data. ( 0,516094649964495 )
Artif Intell Med - Predicting the need for CT imaging in children with minor head injury using an ensemble of Naive Bayes classifiers. ( 0,515071197394947 )
Comput. Biol. Med. - A knowledge-driven probabilistic framework for the prediction of protein-protein interaction networks. ( 0,512712721992025 )
J Chem Inf Model - Using information from historical high-throughput screens to predict active compounds. ( 0,511356063872082 )
J Chem Inf Model - Robust scoring functions for protein-ligand interactions with quantum chemical charge models. ( 0,509607104677389 )
Comput. Biol. Med. - Cholesteryl ester transfer protein inhibitors in coronary heart disease: Validated comparative QSAR modeling of N, N-disubstituted trifluoro-3-amino-2-propanols. ( 0,508670173227914 )
J Chem Inf Model - Two new parameters based on distances in a receiver operating characteristic chart for the selection of classification models. ( 0,507548937998939 )
J Chem Inf Model - CSAR data set release 2012: ligands, affinities, complexes, and docking decoys. ( 0,507086639324776 )
J Chem Inf Model - In silico prediction of chemical Ames mutagenicity. ( 0,506840999115359 )
J Chem Inf Model - Fusing dual-event data sets for Mycobacterium tuberculosis machine learning models and their evaluation. ( 0,506662358707704 )
AMIA Annu Symp Proc - Predicting the dengue incidence in Singapore using univariate time series models. ( 0,506615871431649 )
Int J Comput Assist Radiol Surg - Assessing performance in brain tumor resection using a novel virtual reality simulator. ( 0,504239188389852 )
J Chem Inf Model - Revisiting the general solubility equation: in silico prediction of aqueous solubility incorporating the effect of topographical polar surface area. ( 0,50400256808582 )
J Chem Inf Model - Optimizing predictive performance of CASE Ultra expert system models using the applicability domains of individual toxicity alerts. ( 0,503653859269469 )
BMC Med Inform Decis Mak - Diabetic retinopathy risk prediction for fundus examination using sparse learning: a cross-sectional study. ( 0,502997053505086 )
Comput Methods Programs Biomed - Monitoring of anticoagulant therapy applying a dynamic statistical model. ( 0,501896916831232 )
J Chem Inf Model - QSAR modeling of imbalanced high-throughput screening data in PubChem. ( 0,500846152782526 )
J Chem Inf Model - Predicting myelosuppression of drugs from in silico models. ( 0,50075345385012 )
Comput Methods Programs Biomed - Modeling the glucose regulatory system in extreme preterm infants. ( 0,500676637236327 )
BMC Med Inform Decis Mak - Mining geriatric assessment data for in-patient fall prediction models and high-risk subgroups. ( 0,50066984915456 )
J Integr Bioinform - Classification of breast cancer subtypes by combining gene expression and DNA methylation data. ( 0,499159857777625 )
BMC Med Inform Decis Mak - Prediction of axillary lymph node metastasis in primary breast cancer patients using a decision tree-based model. ( 0,498840778575071 )
J Am Med Inform Assoc - Harvest: an open platform for developing web-based biomedical data discovery and reporting applications. ( 0,497069684103689 )
AMIA Annu Symp Proc - Predicting Surgical Risk: How Much Data is Enough? ( 0,497000720044831 )
Comput Math Methods Med - Multiscale autoregressive identification of neuroelectrophysiological systems. ( 0,496703507683903 )
AMIA Annu Symp Proc - Advanced proficiency EHR training: effect on physicians' EHR efficiency, EHR satisfaction and job satisfaction. ( 0,496519596324298 )
J Chem Inf Model - Profile-QSAR: a novel meta-QSAR method that combines activities across the kinase family to accurately predict affinity, selectivity, and cellular activity. ( 0,495798057200114 )
Artif Intell Med - Fuzzy model identification of dengue epidemic in Colombia based on multiresolution analysis. ( 0,495745252691434 )
J Chem Inf Model - Structure based model for the prediction of phospholipidosis induction potential of small molecules. ( 0,493985081360683 )