J Chem Inf Model - Fusing dual-event data sets for Mycobacterium tuberculosis machine learning models and their evaluation.

Topics

{ model(3404) distribut(989) bayesian(671) }
{ data(3963) clinic(1234) research(1004) }
{ perform(999) metric(946) measur(919) }
{ compound(1573) activ(1297) structur(1058) }
{ learn(2355) train(1041) set(1003) }
{ howev(809) still(633) remain(590) }
{ model(2341) predict(2261) use(1141) }
{ model(2656) set(1616) predict(1553) }
{ cancer(2502) breast(956) screen(824) }
{ drug(1928) target(777) effect(648) }
{ search(2224) databas(1162) retriev(909) }
{ data(1737) use(1416) pattern(1282) }
{ concept(1167) ontolog(924) domain(897) }
{ can(774) often(719) complex(702) }
{ sequenc(1873) structur(1644) protein(1328) }
{ chang(1828) time(1643) increas(1301) }
{ studi(1410) differ(1259) use(1210) }
{ risk(3053) factor(974) diseas(938) }
{ studi(1119) effect(1106) posit(819) }
{ data(3008) multipl(1320) sourc(1022) }
{ structur(1116) can(940) graph(676) }
{ system(1976) rule(880) can(841) }
{ featur(3375) classif(2383) classifi(1994) }
{ take(945) account(800) differ(722) }
{ assess(1506) score(1403) qualiti(1306) }
{ treatment(1704) effect(941) patient(846) }
{ surgeri(1148) surgic(1085) robot(1054) }
{ general(901) number(790) one(736) }
{ spatial(1525) area(1432) region(1030) }
{ signal(2180) analysi(812) frequenc(800) }
{ first(2504) two(1366) second(1323) }
{ activ(1138) subject(705) human(624) }
{ patient(1821) servic(1111) care(1106) }
{ can(981) present(881) function(850) }
{ health(1844) social(1437) communiti(874) }
{ high(1669) rate(1365) level(1280) }
{ use(1733) differ(960) four(931) }
{ implement(1333) system(1263) develop(1122) }
{ activ(1452) weight(1219) physic(1104) }
{ imag(1947) propos(1133) code(1026) }
{ inform(2794) health(2639) internet(1427) }
{ measur(2081) correl(1212) valu(896) }
{ imag(1057) registr(996) error(939) }
{ bind(1733) structur(1185) ligand(1036) }
{ method(1219) similar(1157) match(930) }
{ imag(2830) propos(1344) filter(1198) }
{ network(2748) neural(1063) input(814) }
{ imag(2675) segment(2577) method(1081) }
{ patient(2315) diseas(1263) diabet(1191) }
{ studi(2440) review(1878) systemat(933) }
{ motion(1329) object(1292) video(1091) }
{ framework(1458) process(801) describ(734) }
{ problem(2511) optim(1539) algorithm(950) }
{ error(1145) method(1030) estim(1020) }
{ clinic(1479) use(1117) guidelin(835) }
{ algorithm(1844) comput(1787) effici(935) }
{ extract(1171) text(1153) clinic(932) }
{ method(1557) propos(1049) approach(1037) }
{ data(1714) softwar(1251) tool(1186) }
{ design(1359) user(1324) use(1319) }
{ control(1307) perform(991) simul(935) }
{ model(2220) cell(1177) simul(1124) }
{ care(1570) inform(1187) nurs(1089) }
{ method(984) reconstruct(947) comput(926) }
{ featur(1941) imag(1645) propos(1176) }
{ case(1353) use(1143) diagnosi(1136) }
{ research(1085) discuss(1038) issu(1018) }
{ system(1050) medic(1026) inform(1018) }
{ import(1318) role(1303) understand(862) }
{ visual(1396) interact(850) tool(830) }
{ perform(1367) use(1326) method(1137) }
{ blood(1257) pressur(1144) flow(957) }
{ record(1888) medic(1808) patient(1693) }
{ health(3367) inform(1360) care(1135) }
{ model(3480) simul(1196) paramet(876) }
{ monitor(1329) mobil(1314) devic(1160) }
{ ehr(2073) health(1662) electron(1139) }
{ state(1844) use(1261) util(961) }
{ research(1218) medic(880) student(794) }
{ patient(2837) hospit(1953) medic(668) }
{ data(2317) use(1299) case(1017) }
{ age(1611) year(1155) adult(843) }
{ medic(1828) order(1363) alert(1069) }
{ cost(1906) reduc(1198) effect(832) }
{ group(2977) signific(1463) compar(1072) }
{ sampl(1606) size(1419) use(1276) }
{ gene(2352) biolog(1181) express(1162) }
{ intervent(3218) particip(2042) group(1664) }
{ time(1939) patient(1703) rate(768) }
{ use(2086) technolog(871) perceiv(783) }
{ analysi(2126) use(1163) compon(1037) }
{ use(976) code(926) identifi(902) }
{ result(1111) use(1088) new(759) }
{ survey(1388) particip(1329) question(1065) }
{ estim(2440) model(1874) function(577) }
{ decis(3086) make(1611) patient(1517) }
{ process(1125) use(805) approach(778) }
{ method(1969) cluster(1462) data(1082) }
{ method(2212) result(1239) propos(1039) }
{ detect(2391) sensit(1101) algorithm(908) }

Abstract

The search for new tuberculosis treatments continues because we need molecules that act more quickly, can be accommodated in multidrug regimens, and overcome ever-increasing levels of drug resistance. Multiple large-scale phenotypic high-throughput screens against Mycobacterium tuberculosis (Mtb) have generated dose-response data, enabling the generation of machine learning models. These models also incorporated cytotoxicity data and were recently validated with a large external data set. A cheminformatics data-fusion approach followed by Bayesian machine learning, Support Vector Machine, or Recursive Partitioning (RP) model development (based on publicly available Mtb screening data) was used to compare individual data sets and subsequent combined models. A set of 1924 commercially available molecules with promising antitubercular activity (and lack of relative cytotoxicity to Vero cells) was used to evaluate the predictive nature of the models. We demonstrate that combining three data sets incorporating antitubercular and cytotoxicity data in Vero cells from our previous screens results in an external validation receiver operating characteristic (ROC) value of 0.83 (Bayesian or RP Forest). Models that do not have the highest 5-fold cross-validation ROC scores can outperform other models in a test-set-dependent manner. We demonstrate with predictions for a recently published set of Mtb leads from GlaxoSmithKline that no single machine learning model may be enough to identify compounds of interest. Data set fusion represents a further useful strategy for machine learning model construction, as illustrated with Mtb. Coverage of chemistry and Mtb target spaces may also be limiting factors for the whole-cell screening data generated to date.
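
The ROC value quoted in the abstract is a ranking metric. As a minimal sketch, and assuming the reported figure refers to the area under the ROC curve, the snippet below computes that area by pairwise comparison of active and inactive compounds. The function name `roc_auc`, the labels, and the scores for the two models are all hypothetical illustrations, not data or code from the paper.

```python
from typing import Sequence


def roc_auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Area under the ROC curve via pairwise comparison (Mann-Whitney U).

    labels: 1 = active, 0 = inactive. A tie in score counts as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one active and one inactive compound")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    # Fraction of (active, inactive) pairs ranked in the right order.
    return wins / (len(pos) * len(neg))


# Hypothetical scores from two models on the same external test set.
labels = [1, 1, 1, 0, 0, 0]
model_a = [0.9, 0.8, 0.3, 0.4, 0.2, 0.1]
model_b = [0.6, 0.5, 0.7, 0.65, 0.1, 0.2]

print(roc_auc(model_a, labels), roc_auc(model_b, labels))
```

Scoring several models on the same external set in this way is one route to the kind of test-set-dependent comparison the abstract describes; the pairwise loop is quadratic, so a rank-based formulation would be preferable for large screening sets.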

Cleaned Abstract

search new tuberculosi treatment continu need find molecul can act quick accommod multidrug regimen overcom ever increas level drug resist multipl larg scale phenotyp highthroughput screen mycobacterium tuberculosi mtb generat dose respons data enabl generat machin learn model model also incorpor cytotox data recent valid larg extern data set cheminformat datafus approach follow bayesian machin learn support vector machin recurs partit model develop base public avail mtb screen data use compar individu data set subsequ combin model set commerci avail molecul promis antitubercular activ lack relat cytotox vero cell use evalu predict natur model demonstr combin three data set incorpor antitubercular cytotox data vero cell previous screen result extern valid receiv oper curv roc bayesian rp forest model highest fold crossvalid roc score can outperform model test set depend manner demonstr predict recent publish set mtb lead glaxosmithklin singl machin learn model may enough identifi compound interest data set fusion repres use strategi machin learn construct illustr mtb coverag chemistri mtb target space may also limit factor wholecel screen data generat date
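
The cleaned abstract above looks like the result of lowercasing, stripping punctuation and digits, removing stop words, and stemming. The page does not say how it was produced, so the following is only a rough sketch of the first three steps under that assumption. The stop-word set is a small hypothetical sample, and stemming (for example "tuberculosis" to "tuberculosi") is deliberately left out.

```python
import re

# Hypothetical, deliberately tiny stop-word list; the real one is unknown.
STOP_WORDS = {
    "the", "for", "as", "we", "to", "that", "be", "in", "and", "of",
    "a", "an", "with", "were", "was", "from", "our", "do", "not", "no",
}


def clean(text: str) -> list[str]:
    """Lowercase, drop every non-letter character, and filter stop words."""
    text = text.lower()
    # Removing hyphens joins compounds ("high-throughput" -> "highthroughput");
    # removing digits and dots drops numbers such as "1924" or "0.83".
    text = re.sub(r"[^a-z\s]", "", text)
    return [word for word in text.split() if word not in STOP_WORDS]


print(clean("The search for new tuberculosis treatments"))
print(clean("5-fold cross-validation ROC of 0.83"))
```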

Similar Abstracts

J Chem Inf Model - Introduction of a methodology for visualization and graphical interpretation of Bayesian classification models. ( 0,635122954593987 )
J Chem Inf Model - Quantitative structure-activity relationship models of clinical pharmacokinetics: clearance and volume of distribution. ( 0,608141724415785 )
Spat Spatiotemporal Epidemiol - The detection of spatially localised outbreaks in campylobacteriosis notification data. ( 0,596604186127826 )
J Chem Inf Model - Kinome-wide activity modeling from diverse public high-quality data sets. ( 0,592091737824944 )
IEEE J Biomed Health Inform - Identifying mammalian MicroRNA targets based on supervised distance metric learning. ( 0,582369578956698 )
Neural Comput - Exploitation of pairwise class distances for ordinal classification. ( 0,581824492232109 )
J Chem Inf Model - Pragmatic approaches to using computational methods to predict xenobiotic metabolism. ( 0,568909215393132 )
J Chem Inf Model - 13C NMR-distance matrix descriptors: optimal abstract 3D space granularity for predicting estrogen binding. ( 0,566324018718125 )
Med Biol Eng Comput - Fundamental principles of data assimilation underlying the Verdandi library: applications to biophysical model personalization within euHeart. ( 0,56259989007353 )
J Chem Inf Model - In silico deconstruction of ATP-competitive inhibitors of glycogen synthase kinase-3β. ( 0,561636354078145 )
J Chem Inf Model - A comparison of different QSAR approaches to modeling CYP450 1A2 inhibition. ( 0,559736910539419 )
J Chem Inf Model - Kernel-based partial least squares: application to fingerprint-based QSAR with model visualization. ( 0,558962618522223 )
J Chem Inf Model - Binary classification of aqueous solubility using support vector machines with reduction and recombination feature selection. ( 0,557707877750547 )
Brief. Bioinformatics - Similarity-based machine learning methods for predicting drug-target interactions: a brief review. ( 0,55737577682041 )
J Chem Inf Model - Exploring uncharted territories: predicting activity cliffs in structure-activity landscapes. ( 0,556210458318262 )
J Chem Inf Model - Development of dimethyl sulfoxide solubility models using 163,000 molecules: using a domain applicability metric to select more reliable predictions. ( 0,553933182596648 )
J Chem Inf Model - FAst MEtabolizer (FAME): A rapid and accurate predictor of sites of metabolism in multiple species by endogenous enzymes. ( 0,55321940699852 )
J Am Med Inform Assoc - Breast cancer survivability prediction using labeled, unlabeled, and pseudo-labeled patient data. ( 0,552393736250736 )
IEEE Trans Image Process - Saliency and gist features for target detection in satellite images. ( 0,549705410798656 )
J Chem Inf Model - QSPR prediction of the stability constants of gadolinium(III) complexes for magnetic resonance imaging. ( 0,548454463737034 )
J Chem Inf Model - Assessing relative bioactivity of chemical substances using quantitative molecular network topology analysis. ( 0,54098616163078 )
IEEE Trans Image Process - Decomposition-based transfer distance metric learning for image classification. ( 0,540527340058259 )
J Chem Inf Model - Generalized workflow for generating highly predictive in silico off-target activity models. ( 0,539714000322864 )
J Chem Inf Model - Are bigger data sets better for machine learning? Fusing single-point and dual-event dose response data for Mycobacterium tuberculosis. ( 0,539387584010939 )
J Chem Inf Model - Using random forest to model the domain applicability of another random forest model. ( 0,539051110550882 )
Brief. Bioinformatics - Added predictive value of high-throughput molecular data to clinical data and its validation. ( 0,535090180701024 )
J Chem Inf Model - Application of support vector machine to three-dimensional shape-based virtual screening using comprehensive three-dimensional molecular shape overlay with known inhibitors. ( 0,532797911144597 )
J Chem Inf Model - Prediction of compounds in different local structure-activity relationship environments using emerging chemical patterns. ( 0,531811787638647 )
Neural Comput - Parameter learning for alpha integration. ( 0,530936916739627 )
J Chem Inf Model - Experimentally validated HERG pharmacophore models as cardiotoxicity prediction tools. ( 0,530766909717486 )
Artif Intell Med - An evaluation of heuristics for rule ranking. ( 0,530180338771993 )
J Chem Inf Model - In silico assessment of chemical biodegradability. ( 0,530180338771993 )
J Chem Inf Model - Experimental and computational prediction of glass transition temperature of drugs. ( 0,529172976590144 )
Lifetime Data Anal - Bayesian local influence for survival models. ( 0,524862267593363 )
J Chem Inf Model - Predictive models for cytochrome p450 isozymes based on quantitative high throughput screening data. ( 0,524083824711258 )
Comput Math Methods Med - In silico modelling of tumour margin diffusion and infiltration: review of current status. ( 0,52407538745521 )
IEEE Trans Image Process - Shape-based normalized cuts using spectral relaxation for biomedical segmentation. ( 0,523960432353689 )
J Chem Inf Model - In silico prediction of aqueous solubility using simple QSPR models: the importance of phenol and phenol-like moieties. ( 0,523500986906097 )
J Chem Inf Model - Coping with unbalanced class data sets in oral absorption models. ( 0,522040698418971 )
Comput. Biol. Med. - Towards automatic detection of atrial fibrillation: A hybrid computational approach. ( 0,521073452545341 )
J Chem Inf Model - Classification of compounds with distinct or overlapping multi-target activities and diverse molecular mechanisms using emerging chemical patterns. ( 0,518357527247388 )
AMIA Annu Symp Proc - Pediatric readmission classification using stacked regularized logistic regression models. ( 0,517995429759031 )
J Chem Inf Model - Global quantitative structure-activity relationship models vs selected local models as predictors of off-target activities for project compounds. ( 0,517883398045519 )
J Chem Inf Model - Predicting potent compounds via model-based global optimization. ( 0,517232885700208 )
Methods Inf Med - The evolution of boosting algorithms. From machine learning to statistical modelling. ( 0,516343489743288 )
J Chem Inf Model - Discovery and design of tricyclic scaffolds as protein kinase CK2 (CK2) inhibitors through a combination of shape-based virtual screening and structure-based molecular modification. ( 0,515416375560112 )
J Chem Inf Model - Prediction of compound potency changes in matched molecular pairs using support vector regression. ( 0,515075449663203 )
J Chem Inf Model - Binary classification of a large collection of environmental chemicals from estrogen receptor assays by quantitative structure-activity relationship and machine learning methods. ( 0,512346661916281 )
Neural Comput - Causal discovery via reproducing kernel Hilbert space embeddings. ( 0,51211195035418 )
J Chem Inf Model - Discovering new agents active against methicillin-resistant Staphylococcus aureus with ligand-based approaches. ( 0,511473502360215 )
Comput Biol Chem - Ranking of microRNA target prediction scores by Pareto front analysis. ( 0,511327621880968 )
J Chem Inf Model - A searchable map of PubChem. ( 0,508779803041226 )
J Chem Inf Model - Prediction of activity cliffs using support vector machines. ( 0,507817376041827 )
J Chem Inf Model - Comparison of random forest and Pipeline Pilot Naïve Bayes in prospective QSAR predictions. ( 0,506662358707704 )
J Chem Inf Model - Searching for closely related ligands with different mechanisms of action using machine learning and mapping algorithms. ( 0,506008859203747 )
J Chem Inf Model - Introduction of the conditional correlated Bernoulli model of similarity value distributions and its application to the prospective prediction of fingerprint search performance. ( 0,504882813573498 )
Neural Comput - Bayesian sparse partial least squares. ( 0,503914240505113 )
J Med Syst - Performance evaluation of a web-based system to exchange Electronic Health Records using Queueing model (M/M/1). ( 0,502114743760381 )
J Chem Inf Model - How experimental errors influence drug metabolism and pharmacokinetic QSAR/QSPR models. ( 0,501721244116991 )
J Chem Inf Model - An unbiased method to build benchmarking sets for ligand-based virtual screening and its application to GPCRs. ( 0,501231913886085 )
BMC Med Inform Decis Mak - A simulation model of colorectal cancer surveillance and recurrence. ( 0,500935144335076 )
J Chem Inf Model - Jointly handling potency and toxicity of antimicrobial peptidomimetics by simple rules from desirability theory and chemoinformatics. ( 0,499829429897287 )
Comput Biol Chem - Functional characterization of plant small RNAs based on next-generation sequencing data. ( 0,49817605852661 )
Comput. Biol. Med. - Expectation-maximization technique for fibro-glandular discs detection in mammography images. ( 0,496247921453896 )
Comput. Biol. Med. - A prediction model of drug-induced ototoxicity developed by an optimal support vector machine (SVM) method. ( 0,495805666534818 )
Med Decis Making - The role of value-of-information analysis in a health care research priority setting: a theoretical case study. ( 0,495119080104646 )
J Chem Inf Model - Analysis and study of molecule data sets using snowflake diagrams of weighted maximum common subgraph trees. ( 0,493679627924363 )
J Chem Inf Model - A Bayesian approach to in silico blood-brain barrier penetration modeling. ( 0,493281376934755 )
J Chem Inf Model - Applicability Domain ANalysis (ADAN): a robust method for assessing the reliability of drug property predictions. ( 0,491641687784201 )
J Chem Inf Model - Target-specific support vector machine scoring in structure-based virtual screening: computational validation, in vitro testing in kinases, and effects on lung cancer cell proliferation. ( 0,490964088979241 )
J Chem Inf Model - Construction and use of fragment-augmented molecular Hasse diagrams. ( 0,49069270307239 )
Comput. Biol. Med. - Robust prediction of protein subcellular localization combining PCA and WSVMs. ( 0,490638548828806 )
J Chem Inf Model - Comprehensive comparison of ligand-based virtual screening tools against the DUD data set reveals limitations of current 3D methods. ( 0,49055995389147 )
J Chem Inf Model - Using information from historical high-throughput screens to predict active compounds. ( 0,490234648524292 )
Med Decis Making - The Impact of Oversampling with SMOTE on the Performance of 3 Classifiers in Prediction of Type 2 Diabetes. ( 0,489366725580852 )
J Chem Inf Model - Design of multitarget activity landscapes that capture hierarchical activity cliff distributions. ( 0,488929542317223 )
J Biomed Inform - Markov blanket-based approach for learning multi-dimensional Bayesian network classifiers: an application to predict the European Quality of Life-5 Dimensions (EQ-5D) from the 39-item Parkinson's Disease Questionnaire (PDQ-39). ( 0,487826133064854 )
Med Decis Making - Linear regression metamodeling as a tool to summarize and present simulation model results. ( 0,487058641932385 )
J Chem Inf Model - Exploring polypharmacology using a ROCS-based target fishing approach. ( 0,486066010813159 )
J Chem Inf Model - Classification of cytochrome P450 inhibitors and noninhibitors using combined classifiers. ( 0,485282877952998 )
J Chem Inf Model - New fragment weighting scheme for the Bayesian inference network in ligand-based virtual screening. ( 0,485163515496702 )
IEEE Trans Image Process - View-based discriminative probabilistic modeling for 3D object retrieval and recognition. ( 0,484384140285835 )
IEEE Trans Image Process - A marked point process for modeling lidar waveforms. ( 0,482705707229543 )
J Chem Inf Model - Predicting myelosuppression of drugs from in silico models. ( 0,48250653457186 )
J Chem Inf Model - Improving the use of ranking in virtual screening against HIV-1 integrase with triangular numbers and including ligand profiling with antitargets. ( 0,482024321754259 )
Artif Intell Med - On the interplay of machine learning and background knowledge in image interpretation by Bayesian networks. ( 0,481910060820679 )
J Chem Inf Model - XenoSite: accurately predicting CYP-mediated sites of metabolism with neural networks. ( 0,481628439916092 )
J Chem Inf Model - A new protocol for predicting novel GSK-3β ATP competitive inhibitors. ( 0,481608718096005 )
IEEE Trans Image Process - Toward an impairment metric for stereoscopic video: a full-reference video quality metric to assess compressed stereoscopic video. ( 0,481073919286052 )
Lifetime Data Anal - Bayesian semiparametric modeling for stochastic precedence, with applications in epidemiology and survival analysis. ( 0,480508233623451 )
J Chem Inf Model - Comparison of confirmed inactive and randomly selected compounds as negative training examples in support vector machine-based virtual screening. ( 0,479867575979824 )
IEEE Trans Pattern Anal Mach Intell - Weakly Supervised Recognition of Daily Life Activities with Wearable Sensors. ( 0,47894970455252 )
J Chem Inf Model - Looking back to the future: predicting in vivo efficacy of small molecules versus Mycobacterium tuberculosis. ( 0,478020357354101 )
Med Decis Making - Evaluation of markers and risk prediction models: overview of relationships between NRI and decision-analytic measures. ( 0,477682366083231 )
IEEE Trans Image Process - Quantitative analysis of human-model agreement in visual saliency modeling: a comparative study. ( 0,476819728199213 )
J Chem Inf Model - A new approach to radial basis function approximation and its application to QSAR. ( 0,475897640138043 )
J Chem Inf Model - QSAR classification model for antibacterial compounds and its use in virtual screening. ( 0,475799553808232 )
J. Comput. Biol. - Maximum parsimony, substitution model, and probability phylogenetic trees. ( 0,475285651711083 )
J Chem Inf Model - Training based on ligand efficiency improves prediction of bioactivities of ligands and drug target proteins in a machine learning approach. ( 0,473997466263528 )
Artif Intell Med - Machine learning of clinical performance in a pancreatic cancer database. ( 0,473829282514973 )