J Chem Inf Model - Large-scale similarity search profiling of ChEMBL compound data sets.

Topics

{ search(2224) databas(1162) retriev(909) }
{ method(1219) similar(1157) match(930) }
{ compound(1573) activ(1297) structur(1058) }
{ can(981) present(881) function(850) }
{ learn(2355) train(1041) set(1003) }
{ control(1307) perform(991) simul(935) }
{ drug(1928) target(777) effect(648) }
{ research(1085) discuss(1038) issu(1018) }
{ use(976) code(926) identifi(902) }
{ activ(1452) weight(1219) physic(1104) }
{ studi(1410) differ(1259) use(1210) }
{ implement(1333) system(1263) develop(1122) }
{ measur(2081) correl(1212) valu(896) }
{ network(2748) neural(1063) input(814) }
{ take(945) account(800) differ(722) }
{ import(1318) role(1303) understand(862) }
{ studi(1119) effect(1106) posit(819) }
{ high(1669) rate(1365) level(1280) }
{ use(1733) differ(960) four(931) }
{ can(774) often(719) complex(702) }
{ imag(1947) propos(1133) code(1026) }
{ data(1737) use(1416) pattern(1282) }
{ problem(2511) optim(1539) algorithm(950) }
{ chang(1828) time(1643) increas(1301) }
{ design(1359) user(1324) use(1319) }
{ featur(1941) imag(1645) propos(1176) }
{ howev(809) still(633) remain(590) }
{ system(1050) medic(1026) inform(1018) }
{ monitor(1329) mobil(1314) devic(1160) }
{ state(1844) use(1261) util(961) }
{ cost(1906) reduc(1198) effect(832) }
{ group(2977) signific(1463) compar(1072) }
{ analysi(2126) use(1163) compon(1037) }
{ structur(1116) can(940) graph(676) }
{ process(1125) use(805) approach(778) }
{ model(3404) distribut(989) bayesian(671) }
{ inform(2794) health(2639) internet(1427) }
{ system(1976) rule(880) can(841) }
{ imag(1057) registr(996) error(939) }
{ bind(1733) structur(1185) ligand(1036) }
{ sequenc(1873) structur(1644) protein(1328) }
{ featur(3375) classif(2383) classifi(1994) }
{ imag(2830) propos(1344) filter(1198) }
{ imag(2675) segment(2577) method(1081) }
{ patient(2315) diseas(1263) diabet(1191) }
{ studi(2440) review(1878) systemat(933) }
{ motion(1329) object(1292) video(1091) }
{ assess(1506) score(1403) qualiti(1306) }
{ treatment(1704) effect(941) patient(846) }
{ surgeri(1148) surgic(1085) robot(1054) }
{ framework(1458) process(801) describ(734) }
{ error(1145) method(1030) estim(1020) }
{ concept(1167) ontolog(924) domain(897) }
{ clinic(1479) use(1117) guidelin(835) }
{ algorithm(1844) comput(1787) effici(935) }
{ extract(1171) text(1153) clinic(932) }
{ method(1557) propos(1049) approach(1037) }
{ data(1714) softwar(1251) tool(1186) }
{ model(2220) cell(1177) simul(1124) }
{ care(1570) inform(1187) nurs(1089) }
{ general(901) number(790) one(736) }
{ method(984) reconstruct(947) comput(926) }
{ case(1353) use(1143) diagnosi(1136) }
{ data(3963) clinic(1234) research(1004) }
{ risk(3053) factor(974) diseas(938) }
{ perform(999) metric(946) measur(919) }
{ model(2341) predict(2261) use(1141) }
{ visual(1396) interact(850) tool(830) }
{ perform(1367) use(1326) method(1137) }
{ blood(1257) pressur(1144) flow(957) }
{ spatial(1525) area(1432) region(1030) }
{ record(1888) medic(1808) patient(1693) }
{ health(3367) inform(1360) care(1135) }
{ model(3480) simul(1196) paramet(876) }
{ ehr(2073) health(1662) electron(1139) }
{ research(1218) medic(880) student(794) }
{ patient(2837) hospit(1953) medic(668) }
{ model(2656) set(1616) predict(1553) }
{ data(2317) use(1299) case(1017) }
{ age(1611) year(1155) adult(843) }
{ medic(1828) order(1363) alert(1069) }
{ signal(2180) analysi(812) frequenc(800) }
{ sampl(1606) size(1419) use(1276) }
{ gene(2352) biolog(1181) express(1162) }
{ data(3008) multipl(1320) sourc(1022) }
{ first(2504) two(1366) second(1323) }
{ intervent(3218) particip(2042) group(1664) }
{ activ(1138) subject(705) human(624) }
{ time(1939) patient(1703) rate(768) }
{ patient(1821) servic(1111) care(1106) }
{ use(2086) technolog(871) perceiv(783) }
{ health(1844) social(1437) communiti(874) }
{ cancer(2502) breast(956) screen(824) }
{ result(1111) use(1088) new(759) }
{ survey(1388) particip(1329) question(1065) }
{ estim(2440) model(1874) function(577) }
{ decis(3086) make(1611) patient(1517) }
{ method(1969) cluster(1462) data(1082) }
{ method(2212) result(1239) propos(1039) }
{ detect(2391) sensit(1101) algorithm(908) }

Abstract

A large-scale similarity search investigation has been carried out on 266 well-defined compound activity classes extracted from the ChEMBL database. The analysis was performed using two widely applied two-dimensional (2D) fingerprints that mark opposite ends of the current performance spectrum of these types of fingerprints, i.e., MACCS structural keys and the extended connectivity fingerprint with bond diameter four (ECFP4). For each fingerprint, three nearest neighbor search strategies were applied. On the basis of these search calculations, a similarity search profile of the ChEMBL database was generated. Overall, the fingerprint search campaign was surprisingly successful. In 203 of 266 test cases (~76%), a compound recovery rate of at least 50% was observed with at least the better performing fingerprint and one search strategy. The similarity search profile also revealed several general trends. For example, fingerprint searching was often characterized by an early enrichment of active compounds in database selection sets. In addition, compound activity classes have been categorized according to different similarity search performance levels, which helps to put the results of benchmark calculations into perspective. Therefore, a compendium of activity classes falling into different search performance categories is provided. On the basis of our large-scale investigation, the performance range of state-of-the-art 2D fingerprinting has been delineated for compound data sets directed against a wide spectrum of pharmaceutical targets.
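
As a rough illustration of the kind of calculation summarized above, the sketch below ranks a toy database by nearest-neighbor fingerprint similarity to a set of reference compounds and reports the recovery rate of known actives. It is not the authors' protocol: the abstract names neither the similarity coefficient nor the three search strategies, so the Tanimoto coefficient and the 1-NN / k-NN averaging used here are assumptions, and the fingerprints are hypothetical sets of "on" bit positions rather than real MACCS or ECFP4 bits.

```python
from typing import FrozenSet, Sequence

# A fingerprint is modeled as the set of bit positions that are switched on.
Fingerprint = FrozenSet[int]


def tanimoto(a: Fingerprint, b: Fingerprint) -> float:
    """Tanimoto coefficient: shared bits divided by bits set in either fingerprint."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def knn_score(candidate: Fingerprint, references: Sequence[Fingerprint], k: int) -> float:
    """Mean of the k highest similarities between a candidate and the references.

    k=1 gives a 1-NN strategy; larger k averages over more reference compounds.
    """
    sims = sorted((tanimoto(candidate, r) for r in references), reverse=True)
    top = sims[:k]
    return sum(top) / len(top)


def recovery_rate(
    references: Sequence[Fingerprint],
    database: Sequence[Fingerprint],
    active_flags: Sequence[bool],
    k: int,
    selection_size: int,
) -> float:
    """Fraction of database actives found among the top-ranked compounds."""
    scores = [knn_score(fp, references, k) for fp in database]
    ranked = sorted(range(len(database)), key=lambda i: scores[i], reverse=True)
    selected = ranked[:selection_size]
    total_actives = sum(active_flags)
    found = sum(active_flags[i] for i in selected)
    return found / total_actives if total_actives else 0.0


# Hypothetical toy data: two reference actives, four database compounds.
references = [frozenset({1, 2, 3, 4}), frozenset({2, 3, 4, 5})]
database = [
    frozenset({1, 2, 3, 5}),   # active
    frozenset({2, 3, 4, 6}),   # active
    frozenset({7, 8, 9}),      # inactive
    frozenset({1, 8, 9, 10}),  # inactive
]
active_flags = [True, True, False, False]

rate = recovery_rate(references, database, active_flags, k=1, selection_size=2)
print(f"Recovery rate in top 2: {rate:.0%}")

# The success fraction quoted in the abstract: 203 of 266 activity classes.
print(f"203/266 = {203 / 266:.1%}")
```

In this toy setting, both actives score higher than the inactives, so a selection set of two compounds recovers all of them. With real data, the fingerprints would come from a cheminformatics toolkit, and the selection set would be a small fraction of a much larger database.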

Cleaned Abstract

largescal similar search investig carri welldefin compound activ class extract chembl databas analysi perform use two wide appli twodimension d fingerprint mark opposit end current perform spectrum type fingerprint ie macc structur key extend connect fingerprint bond diamet four ecfp fingerprint three nearest neighbor search strategi appli basi search calcul similar search profil chembl databas generat overal fingerprint search campaign surpris success test case compound recoveri rate least observ least better perform fingerprint one search strategi similar search profil also reveal sever general trend exampl fingerprint search often character earli enrich activ compound databas select set addit compound activ class categor accord differ similar search perform level help put result benchmark calcul perspect therefor compendium activ class fall differ search perform categori provid basi largescal investig perform rang stateoftheart d fingerprint delin compound data set direct wide spectrum pharmaceut target

Similar Abstracts

J Chem Inf Model - Scaffold hopping by fragment replacement. ( 0,72196861663153 )
J Chem Inf Model - In silico target predictions: defining a benchmarking data set and comparison of performance of the multiclass Naïve Bayes and Parzen-Rosenblatt window. ( 0,645785064199461 )
J Chem Inf Model - Do not hesitate to use Tversky-and other hints for successful active analogue searches with feature count descriptors. ( 0,640678354924493 )
J Chem Inf Model - Speeding up chemical searches using the inverted index: the convergence of chemoinformatics and text search methods. ( 0,635204084976778 )
J Biomed Inform - A semi-supervised approach to extract pharmacogenomics-specific drug-gene pairs from biomedical literature for personalized medicine. ( 0,623539573850824 )
Methods Inf Med - Developing topic-specific search filters for PubMed with click-through data. ( 0,622206803912798 )
AMIA Annu Symp Proc - Using Co-Authoring and Cross-Referencing Information for MEDLINE Indexing. ( 0,617760313231844 )
J Chem Inf Model - Identification of descriptors capturing compound class-specific features by mutual information analysis. ( 0,61666828787015 )
J Integr Bioinform - Classification methods for finding articles describing protein-protein interactions in PubMed. ( 0,616244872079253 )
Methods Inf Med - Learning the preferences of physicians for the organization of result lists of medical evidence articles. ( 0,616130250802652 )
Health Info Libr J - Utilisation of search filters in systematic reviews of prognosis questions. ( 0,615551137243915 )
J Biomed Inform - On the query reformulation technique for effective MEDLINE document retrieval. ( 0,6145833104364 )
BMC Med Inform Decis Mak - BOSS: context-enhanced search for biomedical objects. ( 0,612884049707185 )
J Chem Inf Model - Maximum-score diversity selection for early drug discovery. ( 0,611762682110084 )
J Biomed Inform - MeSHy: Mining unanticipated PubMed information using frequencies of occurrences and concurrences of MeSH terms. ( 0,610535695038379 )
J Chem Inf Model - How do 2D fingerprints detect structurally diverse active compounds? Revealing compound subset-specific fingerprint features through systematic selection. ( 0,605817628163745 )
J Biomed Inform - Using statistical text mining to supplement the development of an ontology. ( 0,604059175854786 )
J Chem Inf Model - Hit expansion approaches using multiple similarity methods and virtualized query structures. ( 0,603097192580048 )
J Integr Bioinform - The LAILAPS search engine: a feature model for relevance ranking in life science databases. ( 0,602493369997158 )
Health Info Libr J - Developing a geographic search filter to identify randomised controlled trials in Africa: finding the optimal balance between sensitivity and precision. ( 0,601246244827393 )
J Chem Inf Model - ReverseScreen3D: a structure-based ligand matching method to identify protein targets. ( 0,600787022520392 )
J Chem Inf Model - Chemical and biological properties of frequent screening hits. ( 0,597337618897523 )
AMIA Annu Symp Proc - Search filter precision can be improved by NOTing out irrelevant content. ( 0,59698366019166 )
J Am Med Inform Assoc - A practical approach to achieve private medical record linkage in light of public resources. ( 0,596476810378321 )
Health Info Libr J - Medical literature searches: a comparison of PubMed and Google Scholar. ( 0,592819790888068 )
J Chem Inf Model - Efficient substructure searching of large chemical libraries: the ABCD chemical cartridge. ( 0,591183688094979 )
J Chem Inf Model - Noncontiguous atom matching structural similarity function. ( 0,590921630143061 )
BMC Med Inform Decis Mak - Glomerular disease search filters for Pubmed, Ovid Medline, and Embase: a development and validation study. ( 0,589780591136499 )
J. Med. Internet Res. - Cumulative query method for influenza surveillance using search engine data. ( 0,587996724683617 )
J. Med. Internet Res. - Development and validation of filters for the retrieval of studies of clinical examination from Medline. ( 0,586019667502411 )
Telemed J E Health - MEDLINE versus EMBASE and CINAHL for telemedicine searches. ( 0,584007831094269 )
Health Info Libr J - The performance of adverse effects search filters in MEDLINE and EMBASE. ( 0,583120864890362 )
J Am Med Inform Assoc - Retrieval of diagnostic and treatment studies for clinical use through PubMed and PubMed's Clinical Queries filters. ( 0,58258287279639 )
J. Med. Internet Res. - Retrieving clinical evidence: a comparison of PubMed and Google Scholar for quick clinical searches. ( 0,582067941741416 )
Health Info Libr J - Sensitivity and precision of adverse effects search filters in MEDLINE and EMBASE: a case study of fractures with thiazolidinediones. ( 0,581183702795584 )
AMIA Annu Symp Proc - Evaluation of automated term groupings for detecting anaphylactic shock signals for drugs. ( 0,579934176185644 )
J Am Med Inform Assoc - A literature search tool for intelligent extraction of disease-associated genes. ( 0,579087841568639 )
J Chem Inf Model - A system for encoding and searching Markush structures. ( 0,578510120632267 )
J Integr Bioinform - The LAILAPS search engine: relevance ranking in life science databases. ( 0,576877307742818 )
J Chem Inf Model - Comparison of confirmed inactive and randomly selected compounds as negative training examples in support vector machine-based virtual screening. ( 0,576386371004816 )
J Chem Inf Model - Development of a comprehensive, validated pharmacophore hypothesis for anthrax toxin lethal factor (LF) inhibitors using genetic algorithms, Pareto scoring, and structural biology. ( 0,575539207281068 )
J Chem Inf Model - Using novel descriptor accounting for ligand-receptor interactions to define and visually explore biologically relevant chemical space. ( 0,575175244747065 )
J Chem Inf Model - Searching for substructures in fragment spaces. ( 0,573515917847958 )
BMC Med Inform Decis Mak - Performance evaluation of Unified Medical Language System®'s synonyms expansion to query PubMed. ( 0,573381442119535 )
J Chem Inf Model - Mining chemical reactions using neighborhood behavior and condensed graphs of reactions approaches. ( 0,57216878272969 )
BMC Med Inform Decis Mak - Boolean versus ranked querying for biomedical systematic reviews. ( 0,568737981378852 )
Brief. Bioinformatics - Fast and efficient searching of biological data resources--using EB-eye. ( 0,568415595521224 )
J Am Med Inform Assoc - Search filters to identify geriatric medicine in Medline. ( 0,567917276436331 )
J. Med. Internet Res. - Sensitivity and predictive value of 15 PubMed search strategies to answer clinical questions rated against full systematic reviews. ( 0,5667624843779 )
Health Info Libr J - Searching for randomised controlled trials and clinical controlled trials in Thai online bibliographical biomedical databases. ( 0,564116878030722 )
Methods Inf Med - A survey on visual information search behavior and requirements of radiologists. ( 0,563975359478669 )
J Chem Inf Model - SimG: an alignment based method for evaluating the similarity of small molecules and binding sites. ( 0,560389119845485 )
J. Med. Internet Res. - Searching for truth: internet search patterns as a method of investigating online responses to a Russian illicit drug policy debate. ( 0,559496954704643 )
J Chem Inf Model - SHAFTS: a hybrid approach for 3D molecular similarity calculation. 1. Method and assessment of virtual screening. ( 0,55923278940108 )
Int J Med Inform - MEDRank: using graph-based concept ranking to index biomedical texts. ( 0,558609847221827 )
Int J Health Geogr - HEALTH GeoJunction: place-time-concept browsing of health publications. ( 0,55812513056991 )
Brief. Bioinformatics - Conceptual framework and pilot study to benchmark phylogenomic databases based on reference gene trees. ( 0,556905656412206 )
BMC Med Inform Decis Mak - CDAPubMed: a browser extension to retrieve EHR-based biomedical literature. ( 0,556461646231631 )
J Am Med Inform Assoc - Federated queries of clinical data repositories: the sum of the parts does not equal the whole. ( 0,555515015503761 )
J Am Med Inform Assoc - PhenDisco: phenotype discovery system for the database of genotypes and phenotypes. ( 0,55391748952616 )
Int J Med Inform - An analysis of clinical queries in an electronic health record search utility. ( 0,551651097391783 )
J Chem Inf Model - Ligand- and structure-based virtual screening for clathrodin-derived human voltage-gated sodium channel modulators. ( 0,551322788818249 )
J Chem Inf Model - Searching for recursively defined generic chemical patterns in nonenumerated fragment spaces. ( 0,549412405062348 )
AMIA Annu Symp Proc - A bottom-up approach to MEDLINE indexing recommendations. ( 0,549155697542517 )
J Chem Inf Model - Subpocket analysis method for fragment-based drug discovery. ( 0,548887705864185 )
AMIA Annu Symp Proc - Finding and accessing diagrams in biomedical publications. ( 0,548825203357757 )
BMC Med Inform Decis Mak - Publication trends of shared decision making in 15 high impact medical journals: a full-text review with bibliometric analysis. ( 0,548256215376812 )
IEEE Trans Image Process - General subspace learning with corrupted training data via graph embedding. ( 0,54803743736819 )
J Am Med Inform Assoc - Search terms and a validated brief search filter to retrieve publications on health-related values in Medline: a word frequency analysis study. ( 0,547383092042523 )
Int J Med Inform - An exploratory study of a text classification framework for Internet-based surveillance of emerging epidemics. ( 0,547006824049732 )
Health Info Libr J - Assessment of indexing trends with specific and general terms for herbal medicine. ( 0,546806556387651 )
J. Med. Internet Res. - Using Internet search engines to obtain medical information: a comparative study. ( 0,546703229253424 )
Res Synth Methods - Pinpointing needles in giant haystacks: use of text mining to reduce impractical screening workload in extremely large scoping reviews. ( 0,546275366174883 )
J Chem Inf Model - MMP-Cliffs: systematic identification of activity cliffs on the basis of matched molecular pairs. ( 0,543825418882086 )
J Am Med Inform Assoc - MEDLINE clinical queries are robust when searching in recent publishing years. ( 0,543448783743185 )
J Biomed Inform - Development and evaluation of a biomedical search engine using a predicate-based vector space model. ( 0,541903329137695 )
J Chem Inf Model - CLCA: maximum common molecular substructure queries within the MetRxn database. ( 0,541827541827542 )
Health Info Libr J - Can we prioritise which databases to search? A case study using a systematic review of frozen shoulder management. ( 0,538926559243993 )
J Chem Inf Model - Similarity searching for potent compounds using feature selection. ( 0,538788152826328 )
AMIA Annu Symp Proc - BIOSPIDA: A Relational Database Translator for NCBI. ( 0,537674132268382 )
J Chem Inf Model - COSMOsim3D: 3D-similarity and alignment based on COSMO polarization charge densities. ( 0,536392181689042 )
IEEE Trans Pattern Anal Mach Intell - On the Role of Correlation and Abstraction in Cross-Modal Multimedia Retrieval. ( 0,536308871509686 )
J Chem Inf Model - Activity-aware clustering of high throughput screening data and elucidation of orthogonal structure-activity relationships. ( 0,53505764762938 )
Health Info Libr J - Facilitating access to evidence: Primary Health Care Search Filter. ( 0,534570875467584 )
J Integr Bioinform - A query suggestion workflow for life science IR-systems. ( 0,53422425109292 )
J. Med. Internet Res. - Definition of Health 2.0 and Medicine 2.0: a systematic review. ( 0,534213247411391 )
J Chem Inf Model - Characterizing the diversity and biological relevance of the MLPCN assay manifold and screening set. ( 0,533905856520503 )
Res Synth Methods - Comprehensive computer searches and reporting in systematic reviews. ( 0,53299116011666 )
AMIA Annu Symp Proc - Query log analysis of an electronic health record search engine. ( 0,531572606325084 )
J Chem Inf Model - Bioturbo similarity searching: combining chemical and biological similarity to discover structurally diverse bioactive molecules. ( 0,530588676794195 )
J Chem Inf Model - Identifying compound-target associations by combining bioactivity profile similarity search and public databases mining. ( 0,529700927651212 )
IEEE Trans Image Process - Image search reranking with query-dependent click-based relevance feedback. ( 0,529184205067636 )
J Chem Inf Model - An integrated virtual screening approach for VEGFR-2 inhibitors. ( 0,528264290100358 )
J Biomed Inform - Reflective random indexing for semi-automatic indexing of the biomedical literature. ( 0,527796303934329 )
J Chem Inf Model - SymDex: increasing the efficiency of chemical fingerprint similarity searches for comparing large chemical libraries by using query set indexing. ( 0,527689654954046 )
J Chem Inf Model - Application of the 4D fingerprint method with a robust scoring function for scaffold-hopping and drug repurposing strategies. ( 0,526961095078683 )
J Telemed Telecare - How to improve your PubMed/MEDLINE searches: 1. background and basic searching. ( 0,526767356242777 )
J Chem Inf Model - Development of Ecom50 and retention index models for nontargeted metabolomics: identification of 1,3-dicyclohexylurea in human serum by HPLC/mass spectrometry. ( 0,526280187209666 )
AMIA Annu Symp Proc - Does query expansion limit our learning? A comparison of social-based expansion to content-based expansion for medical queries on the internet. ( 0,525547446211721 )
J Chem Inf Model - Systematic identification of scaffolds representing compounds active against individual targets and single or multiple target families. ( 0,524815796162488 )