J Chem Inf Model - In silico target predictions: defining a benchmarking data set and comparison of performance of the multiclass Naïve Bayes and Parzen-Rosenblatt window.

Topics

{ compound(1573) activ(1297) structur(1058) }
{ method(1219) similar(1157) match(930) }
{ learn(2355) train(1041) set(1003) }
{ method(1969) cluster(1462) data(1082) }
{ model(3404) distribut(989) bayesian(671) }
{ control(1307) perform(991) simul(935) }
{ featur(1941) imag(1645) propos(1176) }
{ case(1353) use(1143) diagnosi(1136) }
{ data(2317) use(1299) case(1017) }
{ sampl(1606) size(1419) use(1276) }
{ bind(1733) structur(1185) ligand(1036) }
{ imag(2830) propos(1344) filter(1198) }
{ imag(2675) segment(2577) method(1081) }
{ concept(1167) ontolog(924) domain(897) }
{ general(901) number(790) one(736) }
{ search(2224) databas(1162) retriev(909) }
{ model(3480) simul(1196) paramet(876) }
{ high(1669) rate(1365) level(1280) }
{ framework(1458) process(801) describ(734) }
{ clinic(1479) use(1117) guidelin(835) }
{ studi(1410) differ(1259) use(1210) }
{ health(1844) social(1437) communiti(874) }
{ use(976) code(926) identifi(902) }
{ drug(1928) target(777) effect(648) }
{ can(774) often(719) complex(702) }
{ inform(2794) health(2639) internet(1427) }
{ imag(1057) registr(996) error(939) }
{ featur(3375) classif(2383) classifi(1994) }
{ error(1145) method(1030) estim(1020) }
{ model(2220) cell(1177) simul(1124) }
{ research(1085) discuss(1038) issu(1018) }
{ system(1050) medic(1026) inform(1018) }
{ model(2656) set(1616) predict(1553) }
{ activ(1138) subject(705) human(624) }
{ analysi(2126) use(1163) compon(1037) }
{ imag(1947) propos(1133) code(1026) }
{ data(1737) use(1416) pattern(1282) }
{ system(1976) rule(880) can(841) }
{ measur(2081) correl(1212) valu(896) }
{ sequenc(1873) structur(1644) protein(1328) }
{ network(2748) neural(1063) input(814) }
{ patient(2315) diseas(1263) diabet(1191) }
{ take(945) account(800) differ(722) }
{ studi(2440) review(1878) systemat(933) }
{ motion(1329) object(1292) video(1091) }
{ assess(1506) score(1403) qualiti(1306) }
{ treatment(1704) effect(941) patient(846) }
{ surgeri(1148) surgic(1085) robot(1054) }
{ problem(2511) optim(1539) algorithm(950) }
{ chang(1828) time(1643) increas(1301) }
{ algorithm(1844) comput(1787) effici(935) }
{ extract(1171) text(1153) clinic(932) }
{ method(1557) propos(1049) approach(1037) }
{ data(1714) softwar(1251) tool(1186) }
{ design(1359) user(1324) use(1319) }
{ care(1570) inform(1187) nurs(1089) }
{ method(984) reconstruct(947) comput(926) }
{ howev(809) still(633) remain(590) }
{ data(3963) clinic(1234) research(1004) }
{ risk(3053) factor(974) diseas(938) }
{ perform(999) metric(946) measur(919) }
{ import(1318) role(1303) understand(862) }
{ model(2341) predict(2261) use(1141) }
{ visual(1396) interact(850) tool(830) }
{ perform(1367) use(1326) method(1137) }
{ studi(1119) effect(1106) posit(819) }
{ blood(1257) pressur(1144) flow(957) }
{ spatial(1525) area(1432) region(1030) }
{ record(1888) medic(1808) patient(1693) }
{ health(3367) inform(1360) care(1135) }
{ monitor(1329) mobil(1314) devic(1160) }
{ ehr(2073) health(1662) electron(1139) }
{ state(1844) use(1261) util(961) }
{ research(1218) medic(880) student(794) }
{ patient(2837) hospit(1953) medic(668) }
{ age(1611) year(1155) adult(843) }
{ medic(1828) order(1363) alert(1069) }
{ signal(2180) analysi(812) frequenc(800) }
{ cost(1906) reduc(1198) effect(832) }
{ group(2977) signific(1463) compar(1072) }
{ gene(2352) biolog(1181) express(1162) }
{ data(3008) multipl(1320) sourc(1022) }
{ first(2504) two(1366) second(1323) }
{ intervent(3218) particip(2042) group(1664) }
{ time(1939) patient(1703) rate(768) }
{ patient(1821) servic(1111) care(1106) }
{ use(2086) technolog(871) perceiv(783) }
{ can(981) present(881) function(850) }
{ structur(1116) can(940) graph(676) }
{ cancer(2502) breast(956) screen(824) }
{ use(1733) differ(960) four(931) }
{ result(1111) use(1088) new(759) }
{ implement(1333) system(1263) develop(1122) }
{ survey(1388) particip(1329) question(1065) }
{ estim(2440) model(1874) function(577) }
{ decis(3086) make(1611) patient(1517) }
{ process(1125) use(805) approach(778) }
{ activ(1452) weight(1219) physic(1104) }
{ method(2212) result(1239) propos(1039) }
{ detect(2391) sensit(1101) algorithm(908) }

Abstract

In this study, two probabilistic machine-learning algorithms were compared for in silico target prediction of bioactive molecules: the well-established Laplacian-modified Naïve Bayes classifier (NB) and the Parzen-Rosenblatt Window, which was introduced to cheminformatics more recently. Both classifiers were trained with circular fingerprints on a large data set of bioactive compounds extracted from ChEMBL, covering 894 human protein targets with more than 155,000 ligand-protein pairs. Because of its size and the number of bioactivity classes it contains, this data set is also provided as a benchmark for future target prediction methods. In addition to evaluating the methods, different performance measures were explored. Choosing a measure is less straightforward than in binary classification because of the number of classes, the possibility of multiple class memberships, and the need to translate model scores into "yes/no" predictions before performance can be assessed. Both algorithms achieved a recall of correct targets exceeding 80% in the top 1% of predictions. Performance depends significantly on the underlying diversity and size of a given class of bioactive compounds, with small classes and low structural similarity affecting both algorithms to different degrees. On an external test set extracted from WOMBAT, which covers more than 500 targets and excludes all compounds with a Tanimoto similarity above 0.8 to compounds from the ChEMBL data set, recall among the top 1% was 63.3% for Naïve Bayes and 66.6% for the Parzen-Rosenblatt Window. While those numbers seem to indicate lower performance, they are also more realistic for settings where protein targets need to be established for novel chemical substances.
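The two evaluation ideas mentioned in the abstract, the Tanimoto-based exclusion of near-duplicates and the recall of correct targets within the top 1% of ranked predictions, can be illustrated with a minimal Python sketch. This is not the authors' code: the function names, the representation of fingerprints as sets of on-bit indices, and the convention of rounding the top fraction up to a whole number of targets are all assumptions made for illustration.

```python
from math import ceil


def tanimoto(a, b):
    """Tanimoto similarity of two fingerprints given as sets of on-bit indices."""
    if not a and not b:
        return 0.0  # assumption: two empty fingerprints count as dissimilar
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)


def filter_external(external, reference, threshold=0.8):
    """Keep external fingerprints whose similarity to every reference
    fingerprint is at most `threshold` (0.8 is the cutoff in the abstract)."""
    return [
        fp for fp in external
        if all(tanimoto(fp, ref) <= threshold for ref in reference)
    ]


def recall_in_top_fraction(scores, true_targets, fraction=0.01):
    """Share of a compound's known targets that appear in the top `fraction`
    of all targets when ranked by model score (highest first).

    `scores` maps target name -> score for one compound (hypothetical layout).
    The cutoff is rounded up, which is an assumption of this sketch.
    """
    if not true_targets:
        raise ValueError("compound has no known targets")
    k = max(1, ceil(fraction * len(scores)))
    top = set(sorted(scores, key=scores.get, reverse=True)[:k])
    return len(top & set(true_targets)) / len(true_targets)
```

Under this rounding convention, the 894 targets of the ChEMBL data set would give a top-1% list of 9 targets per compound. Averaging `recall_in_top_fraction` over all test compounds is one possible way to obtain a single figure; the paper may aggregate differently.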

Cleaned Abstract

studi two probabilist machinelearn algorithm compar silico target predict bioactiv molecul name wellestablish laplacianmodifi nave bay classifi nb recent introduc cheminformat parzenrosenblatt window classifi train conjunct circular fingerprint larg data set bioactiv compound extract chembl cover human protein target ligandprotein pair data set also provid benchmark data set futur target predict method due size well number bioactiv class contain addit evalu method differ perform measur explor straightforward binari classif set due number class possibl multipl class membership need translat model score yesno predict assess model perform algorithm achiev recal correct target exceed top predict perform depend signific under divers size given class bioactiv compound small class low structur similar affect algorithm differ degre test extern test set extract wombat cover target exclud compound tanimoto similar compound chembl data set current methodolog achiev recal among top nave bay parzenrosenblatt window respect number seem indic lower perform also realist set protein target need establish novel chemic substanc

Similar Abstracts

J Chem Inf Model - Noncontiguous atom matching structural similarity function. ( 0,79939689576024 )
J Chem Inf Model - SHAFTS: a hybrid approach for 3D molecular similarity calculation. 1. Method and assessment of virtual screening. ( 0,78316558998482 )
J Chem Inf Model - Structural similarity based kriging for quantitative structure activity and property relationship modeling. ( 0,782554249684179 )
J Chem Inf Model - Prediction of activity cliffs using support vector machines. ( 0,773995794669885 )
J Chem Inf Model - Hit expansion approaches using multiple similarity methods and virtualized query structures. ( 0,754304724399461 )
J Chem Inf Model - Activity-aware clustering of high throughput screening data and elucidation of orthogonal structure-activity relationships. ( 0,73874591375247 )
J Chem Inf Model - Novel method for pharmacophore analysis by examining the joint pharmacophore space. ( 0,73363295550196 )
J Chem Inf Model - MMP-Cliffs: systematic identification of activity cliffs on the basis of matched molecular pairs. ( 0,731462322619233 )
J Chem Inf Model - SimG: an alignment based method for evaluating the similarity of small molecules and binding sites. ( 0,731340764426876 )
J Chem Inf Model - Similarity searching for potent compounds using feature selection. ( 0,725055168208951 )
J Chem Inf Model - Application of support vector machine to three-dimensional shape-based virtual screening using comprehensive three-dimensional molecular shape overlay with known inhibitors. ( 0,717490300009893 )
J Chem Inf Model - COSMOsim3D: 3D-similarity and alignment based on COSMO polarization charge densities. ( 0,715856607315392 )
J Chem Inf Model - Ligand- and structure-based virtual screening for clathrodin-derived human voltage-gated sodium channel modulators. ( 0,714804377805682 )
J Chem Inf Model - Systematic assessment of compound series with SAR transfer potential. ( 0,712151524918826 )
J Chem Inf Model - Development of Ecom50 and retention index models for nontargeted metabolomics: identification of 1,3-dicyclohexylurea in human serum by HPLC/mass spectrometry. ( 0,711373432969474 )
J Chem Inf Model - Library enhancement through the wisdom of crowds. ( 0,711284543067533 )
J Chem Inf Model - Large-scale assessment of activity landscape feature probabilities of bioactive compounds. ( 0,710132593521492 )
J Chem Inf Model - Navigating high-dimensional activity landscapes: design and application of the ligand-target differentiation map. ( 0,708113533299622 )
J Chem Inf Model - Prediction of new bioactive molecules using a Bayesian belief network. ( 0,703190947357834 )
J Chem Inf Model - G-protein coupled receptors virtual screening using genetic algorithm focused chemical space. ( 0,702511216814869 )
J Chem Inf Model - SABRE: ligand/structure-based virtual screening approach using consensus molecular-shape pattern recognition. ( 0,702336857674935 )
J Chem Inf Model - Identification of descriptors capturing compound class-specific features by mutual information analysis. ( 0,702320104513874 )
J Chem Inf Model - Mining the ChEMBL database: an efficient chemoinformatics workflow for assembling an ion channel-focused screening library. ( 0,699877015661613 )
J Chem Inf Model - Compound optimization through data set-dependent chemical transformations. ( 0,699433156827841 )
Brief. Bioinformatics - Toward more realistic drug-target interaction predictions. ( 0,698561561511031 )
J Chem Inf Model - From activity cliffs to activity ridges: informative data structures for SAR analysis. ( 0,697768189589364 )
J Chem Inf Model - Multitarget structure-activity relationships characterized by activity-difference maps and consensus similarity measure. ( 0,697092470749419 )
J Chem Inf Model - Target-independent prediction of drug synergies using only drug lipophilicity. ( 0,696926964757569 )
J Chem Inf Model - Quantitative structure-activity relationship models of chemical transformations from matched pairs analyses. ( 0,692904316520377 )
J Chem Inf Model - Target-specific support vector machine scoring in structure-based virtual screening: computational validation, in vitro testing in kinases, and effects on lung cancer cell proliferation. ( 0,691415051267571 )
J Chem Inf Model - Optimization of molecular representativeness. ( 0,690966660453811 )
J Chem Inf Model - In silico enzymatic synthesis of a 400,000 compound biochemical database for nontargeted metabolomics. ( 0,69028380684938 )
J Chem Inf Model - Discovery of novel Pim-1 kinase inhibitors by a hierarchical multistage virtual screening approach based on SVM model, pharmacophore, and molecular docking. ( 0,689737484178089 )
J Chem Inf Model - Visualization and virtual screening of the chemical universe database GDB-17. ( 0,684967331068569 )
J Chem Inf Model - Characterizing the diversity and biological relevance of the MLPCN assay manifold and screening set. ( 0,683846354249283 )
Curr Comput Aided Drug Des - Development of Chemical Compound Libraries for In Silico Drug Screening. ( 0,683015179341115 )
J Chem Inf Model - Modeling and benchmark data set for the inhibition of c-Jun N-terminal kinase-3. ( 0,682806848782511 )
J Chem Inf Model - Design of multitarget activity landscapes that capture hierarchical activity cliff distributions. ( 0,681971542038305 )
J Chem Inf Model - Structure-based virtual screening approach for discovery of covalently bound ligands. ( 0,681943296516838 )
J Chem Inf Model - Extending the activity cliff concept: structural categorization of activity cliffs and systematic identification of different types of cliffs in the ChEMBL database. ( 0,679802005911859 )
J Chem Inf Model - Improving classical substructure-based virtual screening to handle extrapolation challenges. ( 0,679494276987817 )
J Chem Inf Model - ColBioS-FlavRC: a collection of bioselective flavonoids and related compounds filtered from high-throughput screening outcomes. ( 0,678841980765076 )
J Chem Inf Model - Multiple e-pharmacophore modeling, 3D-QSAR, and high-throughput virtual screening of hepatitis C virus NS5B polymerase inhibitors. ( 0,677830739788722 )
J Chem Inf Model - Similarity boosted quantitative structure-activity relationship--a systematic study of enhancing structural descriptors by molecular similarity. ( 0,677521628065058 )
J Chem Inf Model - Searching for recursively defined generic chemical patterns in nonenumerated fragment spaces. ( 0,677364860317986 )
J Chem Inf Model - Visual characterization and diversity quantification of chemical libraries: 2. Analysis and selection of size-independent, subspace-specific diversity indices. ( 0,67726384394728 )
J Chem Inf Model - Boosting virtual screening enrichments with data fusion: coalescing hits from two-dimensional fingerprints, shape, and docking. ( 0,677007697954535 )
J Chem Inf Model - Discovery of new selective human aldose reductase inhibitors through virtual screening multiple binding pocket conformations. ( 0,676091270583794 )
J Chem Inf Model - Fighting obesity with a sugar-based library: discovery of novel MCH-1R antagonists by a new computational-VAST approach for exploration of GPCR binding sites. ( 0,6756078740247 )
J Chem Inf Model - Identifying compound-target associations by combining bioactivity profile similarity search and public databases mining. ( 0,674305561379435 )
J Chem Inf Model - An integrated virtual screening approach for VEGFR-2 inhibitors. ( 0,673693062784194 )
J Chem Inf Model - Structure-based design and screen of novel inhibitors for class II 3-hydroxy-3-methylglutaryl coenzyme A reductase from Streptococcus pneumoniae. ( 0,671720858683245 )
J Chem Inf Model - Automated recycling of chemistry for virtual screening and library design. ( 0,671634198245246 )
J Chem Inf Model - Systematic identification of scaffolds representing compounds active against individual targets and single or multiple target families. ( 0,670193270406466 )
J Chem Inf Model - Application of computer-aided drug repurposing in the search of new cruzipain inhibitors: discovery of amiodarone and bromocriptine inhibitory effects. ( 0,668318969709971 )
Sci Data - Quantum chemistry structures and properties of 134 kilo molecules. ( 0,668227682658646 )
J Chem Inf Model - Bioturbo similarity searching: combining chemical and biological similarity to discover structurally diverse bioactive molecules. ( 0,668172810379078 )
J Chem Inf Model - Identification of a novel inhibitor of dengue virus protease through use of a virtual screening drug discovery Web portal. ( 0,667425289807014 )
J Chem Inf Model - Enrichment of chemical libraries docked to protein conformational ensembles and application to aldehyde dehydrogenase 2. ( 0,667283517575523 )
J Chem Inf Model - De novo design of drug-like molecules by a fragment-based molecular evolutionary approach. ( 0,666707295383124 )
J Chem Inf Model - Identification of novel liver X receptor activators by structure-based modeling. ( 0,666610340929709 )
J Chem Inf Model - Capturing structure-activity relationships from chemogenomic spaces. ( 0,665968011637242 )
J Chem Inf Model - Locating sweet spots for screening hits and evaluating pan-assay interference filters from the performance analysis of two lead-like libraries. ( 0,665710833583604 )
J Chem Inf Model - Fragment-based lead discovery and design. ( 0,663474547637604 )
J Chem Inf Model - Evaluation and optimization of virtual screening workflows with DEKOIS 2.0--a public library of challenging docking benchmark sets. ( 0,663325658073316 )
J Chem Inf Model - Visual characterization and diversity quantification of chemical libraries: 1. creation of delimited reference chemical subspaces. ( 0,662987968454677 )
J Chem Inf Model - Searching for closely related ligands with different mechanisms of action using machine learning and mapping algorithms. ( 0,662706505043054 )
J Chem Inf Model - Conditional probabilistic analysis for prediction of the activity landscape and relative compound activities. ( 0,661569151602576 )
J Chem Inf Model - How do 2D fingerprints detect structurally diverse active compounds? Revealing compound subset-specific fingerprint features through systematic selection. ( 0,660483292321534 )
J Chem Inf Model - Increasing the coverage of medicinal chemistry-relevant space in commercial fragments screening. ( 0,660356464133432 )
J Chem Inf Model - Scanning structure-activity relationships with structure-activity similarity and related maps: from consensus activity cliffs to selectivity switches. ( 0,659803852682149 )
J Chem Inf Model - An unbiased method to build benchmarking sets for ligand-based virtual screening and its application to GPCRs. ( 0,659410190138721 )
J Chem Inf Model - Fighting high molecular weight in bioactive molecules with sub-pharmacophore-based virtual screening. ( 0,659405624894243 )
J Chem Inf Model - Neighborhood-based prediction of novel active compounds from SAR matrices. ( 0,658815024126489 )
J Chem Inf Model - TIN-a combinatorial compound collection of synthetically feasible multicomponent synthesis products. ( 0,658748242490007 )
J Chem Inf Model - Freely available conformer generation methods: how good are they? ( 0,657540131339752 )
J Chem Inf Model - Ligand and decoy sets for docking to G protein-coupled receptors. ( 0,657132766156836 )
J Chem Inf Model - Natural product-like virtual libraries: recursive atom-based enumeration. ( 0,656312159169413 )
J Chem Inf Model - Introduction of target cliffs as a concept to identify and describe complex molecular selectivity patterns. ( 0,656034204876676 )
J Chem Inf Model - Discovery of novel histamine H4 and serotonin transporter ligands using the topological feature tree descriptor. ( 0,655386807820697 )
J Chem Inf Model - Profile-QSAR: a novel meta-QSAR method that combines activities across the kinase family to accurately predict affinity, selectivity, and cellular activity. ( 0,655369970047381 )
J Chem Inf Model - Identification of multitarget activity ridges in high-dimensional bioactivity spaces. ( 0,655095728536997 )
J Chem Inf Model - Enumeration of 166 billion organic small molecules in the chemical universe database GDB-17. ( 0,654988027775204 )
J Chem Inf Model - Polypharmacology directed compound data mining: identification of promiscuous chemotypes with different activity profiles and comparison to approved drugs. ( 0,653331169071386 )
J Chem Inf Model - Discovery of α7-nicotinic receptor ligands by virtual screening of the chemical universe database GDB-13. ( 0,652987608114148 )
J Chem Inf Model - Harvesting classification trees for drug discovery. ( 0,652474325134588 )
J Chem Inf Model - Construction and use of fragment-augmented molecular Hasse diagrams. ( 0,652198359871162 )
J Chem Inf Model - How diverse are diversity assessment methods? A comparative analysis and benchmarking of molecular descriptor space. ( 0,65211739169502 )
J Chem Inf Model - Introduction of a methodology for visualization and graphical interpretation of Bayesian classification models. ( 0,652110240269861 )
J Chem Inf Model - FINDSITE(comb): a threading/structure-based, proteomic-scale virtual ligand screening approach. ( 0,65134354703224 )
J Chem Inf Model - Prediction of synthetic accessibility based on commercially available compound databases. ( 0,650856879074015 )
J Chem Inf Model - 3D molecular descriptors important for clinical success. ( 0,650771957002553 )
J Chem Inf Model - Identification of 1,2,5-oxadiazoles as a new class of SENP2 inhibitors using structure based virtual screening. ( 0,649313567488393 )
J Chem Inf Model - Combining horizontal and vertical substructure relationships in scaffold hierarchies for activity prediction. ( 0,648219788963122 )
J Chem Inf Model - Selection of in silico drug screening results for G-protein-coupled receptors by using universal active probes. ( 0,648156337416444 )
J Chem Inf Model - Molecular modeling on pyrimidine-urea inhibitors of TNF-α production: an integrated approach using a combination of molecular docking, classification techniques, and 3D-QSAR CoMSIA. ( 0,647675102798222 )
J Chem Inf Model - Application of quantitative structure-activity relationship models of 5-HT1A receptor binding to virtual screening identifies novel and potent 5-HT1A ligands. ( 0,64734582503025 )
J Chem Inf Model - Using novel descriptor accounting for ligand-receptor interactions to define and visually explore biologically relevant chemical space. ( 0,647314807410261 )
J Chem Inf Model - Exploration of 3D activity cliffs on the basis of compound binding modes and comparison of 2D and 3D cliffs. ( 0,647163050215414 )
J Chem Inf Model - Automated selection of compounds with physicochemical properties to maximize bioavailability and druglikeness. ( 0,646997888399832 )