J Chem Inf Model - BioSM: metabolomics tool for identifying endogenous mammalian biochemical structures in chemical structure space.


{ compound(1573) activ(1297) structur(1058) }
{ sampl(1606) size(1419) use(1276) }
{ structur(1116) can(940) graph(676) }
{ gene(2352) biolog(1181) express(1162) }
{ perform(1367) use(1326) method(1137) }
{ assess(1506) score(1403) qualiti(1306) }
{ howev(809) still(633) remain(590) }
{ can(774) often(719) complex(702) }
{ model(2656) set(1616) predict(1553) }
{ data(3008) multipl(1320) sourc(1022) }
{ detect(2391) sensit(1101) algorithm(908) }
{ research(1085) discuss(1038) issu(1018) }
{ cancer(2502) breast(956) screen(824) }
{ use(976) code(926) identifi(902) }
{ chang(1828) time(1643) increas(1301) }
{ search(2224) databas(1162) retriev(909) }
{ group(2977) signific(1463) compar(1072) }
{ use(2086) technolog(871) perceiv(783) }
{ process(1125) use(805) approach(778) }
{ problem(2511) optim(1539) algorithm(950) }
{ extract(1171) text(1153) clinic(932) }
{ studi(1119) effect(1106) posit(819) }
{ system(1976) rule(880) can(841) }
{ imag(1057) registr(996) error(939) }
{ sequenc(1873) structur(1644) protein(1328) }
{ method(1219) similar(1157) match(930) }
{ imag(2830) propos(1344) filter(1198) }
{ imag(2675) segment(2577) method(1081) }
{ care(1570) inform(1187) nurs(1089) }
{ featur(1941) imag(1645) propos(1176) }
{ data(3963) clinic(1234) research(1004) }
{ risk(3053) factor(974) diseas(938) }
{ age(1611) year(1155) adult(843) }
{ analysi(2126) use(1163) compon(1037) }
{ high(1669) rate(1365) level(1280) }
{ result(1111) use(1088) new(759) }
{ decis(3086) make(1611) patient(1517) }
{ activ(1452) weight(1219) physic(1104) }
{ method(2212) result(1239) propos(1039) }
{ model(3404) distribut(989) bayesian(671) }
{ imag(1947) propos(1133) code(1026) }
{ data(1737) use(1416) pattern(1282) }
{ inform(2794) health(2639) internet(1427) }
{ measur(2081) correl(1212) valu(896) }
{ bind(1733) structur(1185) ligand(1036) }
{ featur(3375) classif(2383) classifi(1994) }
{ network(2748) neural(1063) input(814) }
{ patient(2315) diseas(1263) diabet(1191) }
{ take(945) account(800) differ(722) }
{ studi(2440) review(1878) systemat(933) }
{ motion(1329) object(1292) video(1091) }
{ treatment(1704) effect(941) patient(846) }
{ surgeri(1148) surgic(1085) robot(1054) }
{ framework(1458) process(801) describ(734) }
{ error(1145) method(1030) estim(1020) }
{ learn(2355) train(1041) set(1003) }
{ concept(1167) ontolog(924) domain(897) }
{ clinic(1479) use(1117) guidelin(835) }
{ algorithm(1844) comput(1787) effici(935) }
{ method(1557) propos(1049) approach(1037) }
{ data(1714) softwar(1251) tool(1186) }
{ design(1359) user(1324) use(1319) }
{ control(1307) perform(991) simul(935) }
{ model(2220) cell(1177) simul(1124) }
{ general(901) number(790) one(736) }
{ method(984) reconstruct(947) comput(926) }
{ case(1353) use(1143) diagnosi(1136) }
{ studi(1410) differ(1259) use(1210) }
{ perform(999) metric(946) measur(919) }
{ system(1050) medic(1026) inform(1018) }
{ import(1318) role(1303) understand(862) }
{ model(2341) predict(2261) use(1141) }
{ visual(1396) interact(850) tool(830) }
{ blood(1257) pressur(1144) flow(957) }
{ spatial(1525) area(1432) region(1030) }
{ record(1888) medic(1808) patient(1693) }
{ health(3367) inform(1360) care(1135) }
{ model(3480) simul(1196) paramet(876) }
{ monitor(1329) mobil(1314) devic(1160) }
{ ehr(2073) health(1662) electron(1139) }
{ state(1844) use(1261) util(961) }
{ research(1218) medic(880) student(794) }
{ patient(2837) hospit(1953) medic(668) }
{ data(2317) use(1299) case(1017) }
{ medic(1828) order(1363) alert(1069) }
{ signal(2180) analysi(812) frequenc(800) }
{ cost(1906) reduc(1198) effect(832) }
{ first(2504) two(1366) second(1323) }
{ intervent(3218) particip(2042) group(1664) }
{ activ(1138) subject(705) human(624) }
{ time(1939) patient(1703) rate(768) }
{ patient(1821) servic(1111) care(1106) }
{ can(981) present(881) function(850) }
{ health(1844) social(1437) communiti(874) }
{ use(1733) differ(960) four(931) }
{ drug(1928) target(777) effect(648) }
{ implement(1333) system(1263) develop(1122) }
{ survey(1388) particip(1329) question(1065) }
{ estim(2440) model(1874) function(577) }
{ method(1969) cluster(1462) data(1082) }


The structural identification of unknown biochemical compounds in complex biofluids continues to be a major challenge in metabolomics research. Using LC/MS, there are currently two major options for solving this problem: searching small biochemical databases, which often do not contain the unknown of interest, or searching large chemical databases, which include large numbers of nonbiochemical compounds. Searching larger chemical databases (larger chemical space) increases the odds of identifying an unknown biochemical compound, but only if nonbiochemical structures can be eliminated from consideration. In this paper we present BioSM, a cheminformatics tool that uses known endogenous mammalian biochemical compounds (as scaffolds) and graph matching methods to identify endogenous mammalian biochemical structures in chemical structure space. The results of a comprehensive set of empirical experiments suggest that BioSM identifies endogenous mammalian biochemical structures with high accuracy. In a leave-one-out cross validation experiment, BioSM correctly predicted 95% of 1388 Kyoto Encyclopedia of Genes and Genomes (KEGG) compounds as endogenous mammalian biochemicals using 1565 scaffolds. Analysis of two additional biological data sets containing 2330 human metabolites (HMDB) and 2416 plant secondary metabolites (KEGG) resulted in biochemical annotations of 89% and 72% of the compounds, respectively. When a data set of 3895 drugs (DrugBank and USAN) was tested, 48% of these structures were predicted to be biochemical. However, when a set of synthetic chemical compounds (Chembridge and Chemsynthesis databases) was examined, only 29% of the 458,207 structures were predicted to be biochemical. Moreover, BioSM predicted that 34% of 883,199 randomly selected compounds from PubChem were biochemical. We then expanded the scaffold list to 3927 biochemical compounds and reevaluated the above data sets to determine whether scaffold number influenced model performance. Although there were significant improvements in model sensitivity and specificity using the larger scaffold list, the data set comparison results were very similar. These results suggest that additional biochemical scaffolds will not further improve our representation of biochemical structure space and that the model is reasonably robust. BioSM provides a qualitative (yes/no) and quantitative (ranking) method for endogenous mammalian biochemical annotation of chemical space and, thus, will be useful in the identification of unknown biochemical structures in metabolomics. BioSM is freely available at http://metabolomics.pharm.uconn.edu.
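
The scaffold-based annotation and the leave-one-out experiment described in the abstract can be illustrated with a minimal Python sketch. This is not BioSM's implementation: the abstract does not specify its graph matching method, so the sketch substitutes a simple Tanimoto overlap on hypothetical feature sets. The function names (`tanimoto`, `annotate`, `leave_one_out`), the 0.5 threshold, and the toy compounds are all assumptions made for illustration.

```python
from typing import Dict, FrozenSet, Tuple

# Assumption: each structure is reduced to a set of made-up feature labels.
# BioSM itself uses graph matching against scaffolds, which is not shown here.
Features = FrozenSet[str]


def tanimoto(a: Features, b: Features) -> float:
    """Shared features divided by all distinct features (0.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def annotate(query: Features, scaffolds: Dict[str, Features],
             threshold: float = 0.5) -> Tuple[bool, float]:
    """Return a yes/no call and the best scaffold similarity (usable for ranking)."""
    best = max((tanimoto(query, s) for s in scaffolds.values()), default=0.0)
    return best >= threshold, best


def leave_one_out(compounds: Dict[str, Features], threshold: float = 0.5) -> float:
    """Fraction of compounds called biochemical when each one is held out in turn."""
    hits = 0
    for name, feats in compounds.items():
        # Every other compound acts as a scaffold for the held-out one.
        others = {k: v for k, v in compounds.items() if k != name}
        is_bio, _ = annotate(feats, others, threshold)
        hits += is_bio
    return hits / len(compounds) if compounds else 0.0


# Toy, invented feature sets; these are not real KEGG or HMDB records.
toy = {
    "glucose": frozenset({"C-C", "C-O", "O-H", "ring6"}),
    "fructose": frozenset({"C-C", "C-O", "O-H", "ring5"}),
    "alanine": frozenset({"C-C", "C-N", "C=O", "O-H"}),
    "odd_one": frozenset({"C-Cl", "C-F"}),
}
print(leave_one_out(toy))
```

The boolean returned by `annotate` corresponds to the qualitative (yes/no) output and the similarity score to the quantitative (ranking) output mentioned in the abstract; a real implementation would replace `tanimoto` with a graph matching score computed on molecular structures.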

Clean Abstract

structur identif unknown biochem compound complex biofluid continu major challeng metabolom research use lcms current two major option solv problem search small biochem databas often contain unknown interest search larg chemic databas includ larg number nonbiochem compound search larger chemic databas larger chemic space increas odd identifi unknown biochem compound nonbiochem structur can elimin consider paper present biosm cheminformat tool use known endogen mammalian biochem compound scaffold graph match method identifi endogen mammalian biochem structur chemic structur space result comprehens set empir experi suggest biosm identifi endogen mammalian biochem structur high accuraci leaveoneout cross valid experi biosm correct predict kyoto encyclopedia gene genom kegg compound endogen mammalian biochem use scaffold analysi two addit biolog data set contain human metabolit hmdb plant secondari metabolit kegg result biochem annot compound respect data set drug drugbank usan test structur predict biochem howev set synthet chemic compound chembridg chemsynthesi databas examin structur predict biochem moreov biosm predict random select compound pubchem biochem expand scaffold list biochem compound reevalu data set determin whether scaffold number influenc model perform although signific improv model sensit specif use larger scaffold list data set comparison result similar result suggest addit biochem scaffold will improv represent biochem structur space model reason robust biosm provid qualit yesno quantit rank method endogen mammalian biochem annot chemic space thus will use identif unknown biochem structur metabolom biosm freeli avail httpmetabolomicspharmuconnedu

Similar Abstracts

J Chem Inf Model - Polypharmacology directed compound data mining: identification of promiscuous chemotypes with different activity profiles and comparison to approved drugs. ( 0,891664356458091 )
J Chem Inf Model - Application of computer-aided drug repurposing in the search of new cruzipain inhibitors: discovery of amiodarone and bromocriptine inhibitory effects. ( 0,888308218667382 )
J Chem Inf Model - Prediction of individual compounds forming activity cliffs using emerging chemical patterns. ( 0,883216310773656 )
J Chem Inf Model - How diverse are diversity assessment methods? A comparative analysis and benchmarking of molecular descriptor space. ( 0,881185405391022 )
J Chem Inf Model - Natural product-like virtual libraries: recursive atom-based enumeration. ( 0,877577435462098 )
J Chem Inf Model - Scaffold diversity of exemplified medicinal chemistry space. ( 0,87721837790043 )
J Chem Inf Model - Navigating high-dimensional activity landscapes: design and application of the ligand-target differentiation map. ( 0,874354958625715 )
J Chem Inf Model - Combining horizontal and vertical substructure relationships in scaffold hierarchies for activity prediction. ( 0,873337816644014 )
J Chem Inf Model - Discovery of novel histamine H4 and serotonin transporter ligands using the topological feature tree descriptor. ( 0,870036091198512 )
J Chem Inf Model - Identifying compound-target associations by combining bioactivity profile similarity search and public databases mining. ( 0,869756289036657 )
J Chem Inf Model - Harvesting classification trees for drug discovery. ( 0,868689067240526 )
J Chem Inf Model - Fighting high molecular weight in bioactive molecules with sub-pharmacophore-based virtual screening. ( 0,867544473146474 )
J Chem Inf Model - Locating sweet spots for screening hits and evaluating pan-assay interference filters from the performance analysis of two lead-like libraries. ( 0,867156151590123 )
J Chem Inf Model - Atom pair 2D-fingerprints perceive 3D-molecular shape and pharmacophores for very fast virtual screening of ZINC and GDB-17. ( 0,864708214052411 )
J Chem Inf Model - Enumeration of 166 billion organic small molecules in the chemical universe database GDB-17. ( 0,863134184173893 )
J Chem Inf Model - A multivariate chemical similarity approach to search for drugs of potential environmental concern. ( 0,860930542580244 )
J Chem Inf Model - Extending the activity cliff concept: structural categorization of activity cliffs and systematic identification of different types of cliffs in the ChEMBL database. ( 0,859159316821335 )
J Chem Inf Model - Identification of novel malarial cysteine protease inhibitors using structure-based virtual screening of a focused cysteine protease inhibitor library. ( 0,85696979234722 )
J Chem Inf Model - TIN-a combinatorial compound collection of synthetically feasible multicomponent synthesis products. ( 0,856388965859023 )
J Chem Inf Model - Characterizing the diversity and biological relevance of the MLPCN assay manifold and screening set. ( 0,856216378855957 )
J Chem Inf Model - Identification of multitarget activity ridges in high-dimensional bioactivity spaces. ( 0,855990643214175 )
J Chem Inf Model - From activity cliffs to activity ridges: informative data structures for SAR analysis. ( 0,85334122914647 )
J Chem Inf Model - Identification of novel liver X receptor activators by structure-based modeling. ( 0,849674347286747 )
J Chem Inf Model - In silico enzymatic synthesis of a 400,000 compound biochemical database for nontargeted metabolomics. ( 0,849526161769665 )
J Chem Inf Model - Automated recycling of chemistry for virtual screening and library design. ( 0,847425228876041 )
J Chem Inf Model - Increasing the coverage of medicinal chemistry-relevant space in commercial fragments screening. ( 0,84622650777178 )
J Chem Inf Model - Identification of 1,2,5-oxadiazoles as a new class of SENP2 inhibitors using structure based virtual screening. ( 0,845898898783316 )
Curr Comput Aided Drug Des - Development of Chemical Compound Libraries for In Silico Drug Screening. ( 0,845429019659044 )
J Chem Inf Model - Target-independent prediction of drug synergies using only drug lipophilicity. ( 0,843706256559067 )
J Chem Inf Model - G-protein coupled receptors virtual screening using genetic algorithm focused chemical space. ( 0,843146214818721 )
J Chem Inf Model - Conditional probabilistic analysis for prediction of the activity landscape and relative compound activities. ( 0,84306196602077 )
J Chem Inf Model - Compound optimization through data set-dependent chemical transformations. ( 0,842454830569242 )
J Chem Inf Model - Identification of a novel inhibitor of dengue virus protease through use of a virtual screening drug discovery Web portal. ( 0,842202590675766 )
J Chem Inf Model - Searching for recursively defined generic chemical patterns in nonenumerated fragment spaces. ( 0,842169772306035 )
J Chem Inf Model - Compound set enrichment: a novel approach to analysis of primary HTS data. ( 0,840715994570424 )
J Chem Inf Model - Discovery and design of tricyclic scaffolds as protein kinase CK2 (CK2) inhibitors through a combination of shape-based virtual screening and structure-based molecular modification. ( 0,840572430760086 )
J Chem Inf Model - QSAR classification model for antibacterial compounds and its use in virtual screening. ( 0,835995046706551 )
J Chem Inf Model - Ligand- and structure-based virtual screening for clathrodin-derived human voltage-gated sodium channel modulators. ( 0,83525267290579 )
J Chem Inf Model - Similarity boosted quantitative structure-activity relationship--a systematic study of enhancing structural descriptors by molecular similarity. ( 0,835085006079178 )
J Chem Inf Model - MQN-mapplet: visualization of chemical space with interactive maps of DrugBank, ChEMBL, PubChem, GDB-11, and GDB-13. ( 0,833777597592105 )
J Chem Inf Model - Design of multitarget activity landscapes that capture hierarchical activity cliff distributions. ( 0,833375081957208 )
J Chem Inf Model - Molecular topology analysis of the differences between drugs, clinical candidate compounds, and bioactive molecules. ( 0,833021243647017 )
J Chem Inf Model - Visual characterization and diversity quantification of chemical libraries: 1. creation of delimited reference chemical subspaces. ( 0,832804673568076 )
J Chem Inf Model - Knowledge-based libraries for predicting the geometric preferences of druglike molecules. ( 0,83156940506737 )
J Chem Inf Model - Bioturbo similarity searching: combining chemical and biological similarity to discover structurally diverse bioactive molecules. ( 0,831182643043402 )
J Chem Inf Model - Mining the ChEMBL database: an efficient chemoinformatics workflow for assembling an ion channel-focused screening library. ( 0,830688239938357 )
J Chem Inf Model - Scaffold-focused virtual screening: prospective application to the discovery of TTK inhibitors. ( 0,830653646849027 )
J Integr Bioinform - Database supported candidate search for metabolite identification. ( 0,828402691353085 )
J Chem Inf Model - De novo design of drug-like molecules by a fragment-based molecular evolutionary approach. ( 0,825091172435677 )
J Chem Inf Model - Design of a three-dimensional multitarget activity landscape. ( 0,823181937311485 )
J Chem Inf Model - Novel mycosin protease MycP1 inhibitors identified by virtual screening and 4D fingerprints. ( 0,822429712039495 )
J Chem Inf Model - Introduction of target cliffs as a concept to identify and describe complex molecular selectivity patterns. ( 0,822289614070943 )
J Chem Inf Model - How do 2D fingerprints detect structurally diverse active compounds? Revealing compound subset-specific fingerprint features through systematic selection. ( 0,821399156465596 )
J Chem Inf Model - Mining for bioactive scaffolds with scaffold networks: improved compound set enrichment from primary screening data. ( 0,820297324893535 )
J Chem Inf Model - Using novel descriptor accounting for ligand-receptor interactions to define and visually explore biologically relevant chemical space. ( 0,819138863775924 )
J Chem Inf Model - Rationalizing the role of SAR tolerance for ligand-based virtual screening. ( 0,815466011173382 )
J Chem Inf Model - Structural similarity based kriging for quantitative structure activity and property relationship modeling. ( 0,814706669223683 )
J Chem Inf Model - Discovery of α7-nicotinic receptor ligands by virtual screening of the chemical universe database GDB-13. ( 0,813181848079924 )
J Chem Inf Model - Selection of in silico drug screening results for G-protein-coupled receptors by using universal active probes. ( 0,812969985731773 )
J Chem Inf Model - Construction and use of fragment-augmented molecular Hasse diagrams. ( 0,812909017924814 )
J Chem Inf Model - Discovery of chemical compound groups with common structures by a network analysis approach (affinity prediction method). ( 0,812782055894648 )
J Chem Inf Model - A new protocol for predicting novel GSK-3β ATP competitive inhibitors. ( 0,812627788591874 )
J Chem Inf Model - Identification of novel serotonin transporter compounds by virtual screening. ( 0,812572082959378 )
J Chem Inf Model - Capturing structure-activity relationships from chemogenomic spaces. ( 0,811120261400818 )
J Chem Inf Model - Integrating medicinal chemistry, organic/combinatorial chemistry, and computational chemistry for the discovery of selective estrogen receptor modulators with Forecaster, a novel platform for drug discovery. ( 0,811070329026378 )
J Chem Inf Model - Identification of a new class of FtsZ inhibitors by structure-based design and in vitro screening. ( 0,808873893224351 )
J Chem Inf Model - Neighborhood-based prediction of novel active compounds from SAR matrices. ( 0,808825289002609 )
J Chem Inf Model - Hsp90 inhibitors, part 2: combining ligand-based and structure-based approaches for virtual screening application. ( 0,805797858210132 )
J Chem Inf Model - Automated design of realistic organometallic molecules from fragments. ( 0,805038394949001 )
J Chem Inf Model - Novel method for pharmacophore analysis by examining the joint pharmacophore space. ( 0,804257761237187 )
J Am Med Inform Assoc - Drug repurposing: mining protozoan proteomes for targets of known bioactive compounds. ( 0,803923834006689 )
J Chem Inf Model - ColBioS-FlavRC: a collection of bioselective flavonoids and related compounds filtered from high-throughput screening outcomes. ( 0,802813151366211 )
J Chem Inf Model - Scanning structure-activity relationships with structure-activity similarity and related maps: from consensus activity cliffs to selectivity switches. ( 0,802732420570735 )
J Chem Inf Model - Rapid scanning structure-activity relationships in combinatorial data sets: identification of activity switches. ( 0,801742893205624 )
J Chem Inf Model - SMIfp (SMILES fingerprint) chemical space for virtual screening and visualization of large databases of organic molecules. ( 0,80155878160886 )
J Chem Inf Model - Feasibility of using molecular docking-based virtual screening for searching dual target kinase inhibitors. ( 0,801048074028271 )
J Chem Inf Model - Identification of novel potential antibiotics against Staphylococcus using structure-based drug screening targeting dihydrofolate reductase. ( 0,800118531921523 )
J Chem Inf Model - Freely available conformer generation methods: how good are they? ( 0,798737254565998 )
J Chem Inf Model - Multitarget structure-activity relationships characterized by activity-difference maps and consensus similarity measure. ( 0,798691434071553 )
J Chem Inf Model - A searchable map of PubChem. ( 0,798544084782759 )
J Chem Inf Model - Structure based model for the prediction of phospholipidosis induction potential of small molecules. ( 0,798471854363192 )
J Chem Inf Model - Visualization and virtual screening of the chemical universe database GDB-17. ( 0,796307936078028 )
J Chem Inf Model - Identification of sumoylation activating enzyme 1 inhibitors by structure-based virtual screening. ( 0,795609975682617 )
J Chem Inf Model - Fragment-based lead discovery and design. ( 0,794536430656707 )
J Chem Inf Model - AlzPlatform: an Alzheimer's disease domain-specific chemogenomics knowledgebase for polypharmacology and target identification research. ( 0,794384787909802 )
J Chem Inf Model - Optimization of molecular representativeness. ( 0,793585275961375 )
J Chem Inf Model - Identification of compounds with potential antibacterial activity against Mycobacterium through structure-based drug screening. ( 0,793152815830752 )
Comput Biol Chem - The optimization of running time for a maximum common substructure-based algorithm and its application in drug design. ( 0,79178291967186 )
J Chem Inf Model - SABRE: ligand/structure-based virtual screening approach using consensus molecular-shape pattern recognition. ( 0,791476881286568 )
J Chem Inf Model - Similarity searching for potent compounds using feature selection. ( 0,789125917122891 )
J Chem Inf Model - Discovery of new selective human aldose reductase inhibitors through virtual screening multiple binding pocket conformations. ( 0,788004593667697 )
J Chem Inf Model - Mining chemical reactions using neighborhood behavior and condensed graphs of reactions approaches. ( 0,787897800393211 )
J Chem Inf Model - Enrichment of chemical libraries docked to protein conformational ensembles and application to aldehyde dehydrogenase 2. ( 0,78735349256421 )
J Chem Inf Model - Prediction of new bioactive molecules using a Bayesian belief network. ( 0,787245747910438 )
J Chem Inf Model - Plane of best fit: a novel method to characterize the three-dimensionality of molecules. ( 0,786502695260785 )
J Chem Inf Model - Identification of descriptors capturing compound class-specific features by mutual information analysis. ( 0,786093689880197 )
J Chem Inf Model - SAR monitoring of evolving compound data sets using activity landscapes. ( 0,785775461915019 )
J Chem Inf Model - Shaping a screening file for maximal lead discovery efficiency and effectiveness: elimination of molecular redundancy. ( 0,785201611598124 )
J Chem Inf Model - Discovery of inhibitors of Schistosoma mansoni HDAC8 by combining homology modeling, virtual screening, and in vitro validation. ( 0,785016115224572 )
J Chem Inf Model - Discovery of novel antimalarial compounds enabled by QSAR-based virtual screening. ( 0,783685489121843 )