J Chem Inf Model - Exploiting structural information in patent specifications for key compound prediction.

Topics

{ method(1219) similar(1157) match(930) }
{ compound(1573) activ(1297) structur(1058) }
{ state(1844) use(1261) util(961) }
{ analysi(2126) use(1163) compon(1037) }
{ sequenc(1873) structur(1644) protein(1328) }
{ extract(1171) text(1153) clinic(932) }
{ import(1318) role(1303) understand(862) }
{ model(2656) set(1616) predict(1553) }
{ design(1359) user(1324) use(1319) }
{ data(1737) use(1416) pattern(1282) }
{ system(1976) rule(880) can(841) }
{ framework(1458) process(801) describ(734) }
{ learn(2355) train(1041) set(1003) }
{ data(3963) clinic(1234) research(1004) }
{ model(2341) predict(2261) use(1141) }
{ visual(1396) interact(850) tool(830) }
{ process(1125) use(805) approach(778) }
{ method(1969) cluster(1462) data(1082) }
{ featur(3375) classif(2383) classifi(1994) }
{ patient(2315) diseas(1263) diabet(1191) }
{ perform(1367) use(1326) method(1137) }
{ data(3008) multipl(1320) sourc(1022) }
{ first(2504) two(1366) second(1323) }
{ patient(1821) servic(1111) care(1106) }
{ decis(3086) make(1611) patient(1517) }
{ method(2212) result(1239) propos(1039) }
{ can(774) often(719) complex(702) }
{ imag(1057) registr(996) error(939) }
{ motion(1329) object(1292) video(1091) }
{ problem(2511) optim(1539) algorithm(950) }
{ general(901) number(790) one(736) }
{ search(2224) databas(1162) retriev(909) }
{ featur(1941) imag(1645) propos(1176) }
{ studi(1119) effect(1106) posit(819) }
{ research(1218) medic(880) student(794) }
{ data(2317) use(1299) case(1017) }
{ medic(1828) order(1363) alert(1069) }
{ signal(2180) analysi(812) frequenc(800) }
{ cost(1906) reduc(1198) effect(832) }
{ result(1111) use(1088) new(759) }
{ detect(2391) sensit(1101) algorithm(908) }
{ model(3404) distribut(989) bayesian(671) }
{ imag(1947) propos(1133) code(1026) }
{ inform(2794) health(2639) internet(1427) }
{ measur(2081) correl(1212) valu(896) }
{ bind(1733) structur(1185) ligand(1036) }
{ imag(2830) propos(1344) filter(1198) }
{ network(2748) neural(1063) input(814) }
{ imag(2675) segment(2577) method(1081) }
{ take(945) account(800) differ(722) }
{ studi(2440) review(1878) systemat(933) }
{ assess(1506) score(1403) qualiti(1306) }
{ treatment(1704) effect(941) patient(846) }
{ surgeri(1148) surgic(1085) robot(1054) }
{ error(1145) method(1030) estim(1020) }
{ chang(1828) time(1643) increas(1301) }
{ concept(1167) ontolog(924) domain(897) }
{ clinic(1479) use(1117) guidelin(835) }
{ algorithm(1844) comput(1787) effici(935) }
{ method(1557) propos(1049) approach(1037) }
{ data(1714) softwar(1251) tool(1186) }
{ control(1307) perform(991) simul(935) }
{ model(2220) cell(1177) simul(1124) }
{ care(1570) inform(1187) nurs(1089) }
{ method(984) reconstruct(947) comput(926) }
{ case(1353) use(1143) diagnosi(1136) }
{ howev(809) still(633) remain(590) }
{ studi(1410) differ(1259) use(1210) }
{ risk(3053) factor(974) diseas(938) }
{ perform(999) metric(946) measur(919) }
{ research(1085) discuss(1038) issu(1018) }
{ system(1050) medic(1026) inform(1018) }
{ blood(1257) pressur(1144) flow(957) }
{ spatial(1525) area(1432) region(1030) }
{ record(1888) medic(1808) patient(1693) }
{ health(3367) inform(1360) care(1135) }
{ model(3480) simul(1196) paramet(876) }
{ monitor(1329) mobil(1314) devic(1160) }
{ ehr(2073) health(1662) electron(1139) }
{ patient(2837) hospit(1953) medic(668) }
{ age(1611) year(1155) adult(843) }
{ group(2977) signific(1463) compar(1072) }
{ sampl(1606) size(1419) use(1276) }
{ gene(2352) biolog(1181) express(1162) }
{ intervent(3218) particip(2042) group(1664) }
{ activ(1138) subject(705) human(624) }
{ time(1939) patient(1703) rate(768) }
{ use(2086) technolog(871) perceiv(783) }
{ can(981) present(881) function(850) }
{ health(1844) social(1437) communiti(874) }
{ structur(1116) can(940) graph(676) }
{ high(1669) rate(1365) level(1280) }
{ cancer(2502) breast(956) screen(824) }
{ use(976) code(926) identifi(902) }
{ use(1733) differ(960) four(931) }
{ drug(1928) target(777) effect(648) }
{ implement(1333) system(1263) develop(1122) }
{ survey(1388) particip(1329) question(1065) }
{ estim(2440) model(1874) function(577) }
{ activ(1452) weight(1219) physic(1104) }

Abstract

Patent specifications are one of many information sources needed to progress drug discovery projects. Patent analysis is an important component in several areas, including understanding compound prior art and checking novelty, validating biological assays, and identifying new starting points for chemical exploration. Cheminformatics methods can be used to facilitate the identification of so-called key compounds in patent specifications. Such methods, which rely on structural information extracted from documents by expert curation or text mining, can complement or in some cases replace the traditional manual approach of searching for clues in the text. This paper describes and compares three different methods for the automatic prediction of key compounds in patent specifications using structural information alone. For this data set, the cluster seed analysis described by Hattori et al. (Hattori, K.; Wakabayashi, H.; Tamaki, K. Predicting key example compounds in competitors' patent applications using structural information alone. J. Chem. Inf. Model. 2008, 48, 135-142) is superior in terms of prediction accuracy, with 26 out of 48 drugs (54%) correctly predicted from their corresponding patents. Nevertheless, the two new methods, based on frequency of R-groups (FOG) and maximum common substructure (MCS) similarity measures, show significant advantages due to their inherent ability to visualize relevant structural features. The results of the FOG method can be enhanced by manual selection of the scaffolds used in the analysis. Finally, a successful example of applying FOG analysis to the design of potent ATP-competitive AXL kinase inhibitors with improved properties is described.
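
The abstract does not give the details of the frequency of R-groups (FOG) scoring, so the following is only a minimal sketch of the general idea, under stated assumptions: each example compound has already been decomposed against a shared scaffold into labeled R-group positions, and a compound is scored by summing how often each of its substituents occurs at the same position across the patent. The names `fog_scores`, `rank_key_candidates`, and the toy `patent` data are hypothetical and are not taken from the paper.

```python
from collections import Counter

def fog_scores(compounds):
    """Score compounds by the frequency of their R-group substituents.

    `compounds` maps a compound name to a dict of {position: substituent}.
    Assumption: the scaffold decomposition has been done elsewhere.
    """
    # Count how often each substituent appears at each R-group position.
    counts = {}
    for rgroups in compounds.values():
        for position, substituent in rgroups.items():
            counts.setdefault(position, Counter())[substituent] += 1

    # Assumed scoring rule: sum the position-wise frequencies of the
    # substituents a compound carries.
    return {
        name: sum(counts[pos][sub] for pos, sub in rgroups.items())
        for name, rgroups in compounds.items()
    }

def rank_key_candidates(compounds):
    """Return compound names, highest score first (ties broken by name)."""
    scores = fog_scores(compounds)
    return sorted(scores, key=lambda name: (-scores[name], name))

# Hypothetical toy data: four examples sharing one scaffold.
patent = {
    "ex1": {"R1": "F", "R2": "OMe"},
    "ex2": {"R1": "F", "R2": "Me"},
    "ex3": {"R1": "Cl", "R2": "OMe"},
    "ex4": {"R1": "F", "R2": "Et"},
}

if __name__ == "__main__":
    for name in rank_key_candidates(patent):
        print(name, fog_scores(patent)[name])
```

In this toy set, `ex1` combines the most frequently explored substituent at each position and therefore ranks first. The per-position counts are also what would make such an analysis easy to visualize, in line with the advantage the abstract attributes to FOG.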

Cleaned Abstract

patent specif one mani inform sourc need progress drug discoveri project understand compound prior art novelti check valid biolog assay identif new start point chemic explor area patent analysi import compon cheminformat method can use facilit identif socal key compound patent specif method reli structur inform extract document expert curat text mine can complement case replac tradit manual approach search clue text paper describ compar three differ method automat predict key compound patent specif use structur inform alon data set cluster seed analysi describ hattori et al hattori k wakabayashi h tamaki k predict key exampl compound competitor patent applic use structur inform alon j chem inf model superior term predict accuraci drug correct predict correspond patent nevertheless two new method base frequenc rgroup fog maximum common substructur mcs similar measur show signific advantag due inher abil visual relev structur featur result fog method can enhanc manual select scaffold use analysi final success exampl appli fog analysi design potent atpcompetit axl kinas inhibitor improv properti describ

Similar Abstracts

J Chem Inf Model - COSMOsim3D: 3D-similarity and alignment based on COSMO polarization charge densities. ( 0,673785913900321 )
J Chem Inf Model - Quantitative structure-activity relationship models of chemical transformations from matched pairs analyses. ( 0,666329494942148 )
J Chem Inf Model - Noncontiguous atom matching structural similarity function. ( 0,659403696206657 )
J Chem Inf Model - MMP-Cliffs: systematic identification of activity cliffs on the basis of matched molecular pairs. ( 0,65242649815129 )
J Chem Inf Model - Structural similarity based kriging for quantitative structure activity and property relationship modeling. ( 0,626931330917559 )
J Chem Inf Model - Build-up algorithm for atomic correspondence between chemical structures. ( 0,626544476084268 )
J Biomed Inform - Mining connections between chemicals, proteins, and diseases extracted from Medline annotations. ( 0,613726329523043 )
J Chem Inf Model - In silico target predictions: defining a benchmarking data set and comparison of performance of the multiclass Naïve Bayes and Parzen-Rosenblatt window. ( 0,613348486688693 )
Comput Biol Chem - Understanding the general packing rearrangements required for successful template based modeling of protein structure from a CASP experiment. ( 0,613256743654416 )
J Chem Inf Model - Design of novel FLT-3 inhibitors based on dual-layer 3D-QSAR model and fragment-based compounds in silico. ( 0,612838243086533 )
J. Comput. Biol. - Separating significant matches from spurious matches in DNA sequences. ( 0,611300378070012 )
J Chem Inf Model - Systematic assessment of compound series with SAR transfer potential. ( 0,611233521363479 )
J Chem Inf Model - Using information from historical high-throughput screens to predict active compounds. ( 0,596055950587433 )
J Chem Inf Model - An integrated virtual screening approach for VEGFR-2 inhibitors. ( 0,595990045741133 )
J Chem Inf Model - SHAFTS: a hybrid approach for 3D molecular similarity calculation. 1. Method and assessment of virtual screening. ( 0,587563914298944 )
J Chem Inf Model - Virtual drug screen schema based on multiview similarity integration and ranking aggregation. ( 0,584681491811385 )
J Chem Inf Model - Discovery of novel Pim-1 kinase inhibitors by a hierarchical multistage virtual screening approach based on SVM model, pharmacophore, and molecular docking. ( 0,582187605525327 )
J Chem Inf Model - Development of Ecom50 and retention index models for nontargeted metabolomics: identification of 1,3-dicyclohexylurea in human serum by HPLC/mass spectrometry. ( 0,581359249817316 )
J Chem Inf Model - Consensus models of activity landscapes with multiple chemical, conformer, and property representations. ( 0,57698174633287 )
J Chem Inf Model - Computational derivation of structural alerts from large toxicology data sets. ( 0,573523865878571 )
J Chem Inf Model - Maximum-score diversity selection for early drug discovery. ( 0,572965054949745 )
J Chem Inf Model - Activity-aware clustering of high throughput screening data and elucidation of orthogonal structure-activity relationships. ( 0,572493943116675 )
J Chem Inf Model - Profile-QSAR and Surrogate AutoShim protein-family modeling of proteases. ( 0,57081122494698 )
J Chem Inf Model - Prediction of aquatic toxicity mode of action using linear discriminant and random forest models. ( 0,56899534649913 )
J Chem Inf Model - Design of multitarget activity landscapes that capture hierarchical activity cliff distributions. ( 0,568582626352566 )
J Chem Inf Model - Visualization and virtual screening of the chemical universe database GDB-17. ( 0,566776667361151 )
J Chem Inf Model - Large-scale mining for similar protein binding pockets: with RAPMAD retrieval on the fly becomes real. ( 0,566435604145035 )
J Chem Inf Model - Mapping monomeric threading to protein-protein structure prediction. ( 0,565833470685416 )
J Chem Inf Model - SimG: an alignment based method for evaluating the similarity of small molecules and binding sites. ( 0,564258575247777 )
J Chem Inf Model - Systematic identification of scaffolds representing compounds active against individual targets and single or multiple target families. ( 0,564014869814749 )
J Chem Inf Model - ColBioS-FlavRC: a collection of bioselective flavonoids and related compounds filtered from high-throughput screening outcomes. ( 0,563860774405522 )
Comput. Biol. Med. - Modeling and prediction of peptide drift times in ion mobility spectrometry using sequence-based and structure-based approaches. ( 0,562514799303097 )
J Chem Inf Model - Reading PDB: perception of molecules from 3D atomic coordinates. ( 0,560281609981465 )
J Chem Inf Model - A searchable map of PubChem. ( 0,559769272738383 )
J Chem Inf Model - Hit expansion approaches using multiple similarity methods and virtualized query structures. ( 0,559589872504862 )
J Chem Inf Model - Fragment-based lead discovery and design. ( 0,55929848162414 )
J Chem Inf Model - Information-theoretic approach for the discovery of design rules for crystal chemistry. ( 0,558999451617448 )
J Chem Inf Model - Profile-QSAR: a novel meta-QSAR method that combines activities across the kinase family to accurately predict affinity, selectivity, and cellular activity. ( 0,558669583824241 )
J Chem Inf Model - A new protocol for predicting novel GSK-3β ATP competitive inhibitors. ( 0,558423893940333 )
J Chem Inf Model - Prospects for tertiary structure prediction of RNA based on secondary structure information. ( 0,55570666430144 )
J Chem Inf Model - Combinatorial × computational × cheminformatics (C3) approach to characterization of congeneric libraries of organic pollutants. ( 0,554983134369731 )
J Chem Inf Model - Prediction of compound potency changes in matched molecular pairs using support vector regression. ( 0,55403759927031 )
J Chem Inf Model - Conditional probabilistic analysis for prediction of the activity landscape and relative compound activities. ( 0,551956364005615 )
J Chem Inf Model - Chemical structure elucidation from ¹³C NMR chemical shifts: efficient data processing using bipartite matching and maximal clique algorithms. ( 0,551365821420818 )
J Chem Inf Model - Best of both worlds: on the complementarity of ligand-based and structure-based virtual screening. ( 0,55134368649102 )
J Chem Inf Model - Predicting myelosuppression of drugs from in silico models. ( 0,548728214960756 )
J Chem Inf Model - Deep architectures and deep learning in chemoinformatics: the prediction of aqueous solubility for drug-like molecules. ( 0,548532908100616 )
J Chem Inf Model - Structural determinants of drug partitioning in n-hexadecane/water system. ( 0,54796509871355 )
J Chem Inf Model - Visual characterization and diversity quantification of chemical libraries: 2. Analysis and selection of size-independent, subspace-specific diversity indices. ( 0,547390282067822 )
J Chem Inf Model - PyDPI: freely available python package for chemoinformatics, bioinformatics, and chemogenomics studies. ( 0,544040329906156 )
J Chem Inf Model - Ligand-based target prediction with signature fingerprints. ( 0,542477092099983 )
J Chem Inf Model - Extraction of discontinuous structure-activity relationships from compound data sets through particle swarm optimization. ( 0,541952798710108 )
J Chem Inf Model - Construction and use of fragment-augmented molecular Hasse diagrams. ( 0,540845534582593 )
J Chem Inf Model - Heterogeneous classifier fusion for ligand-based virtual screening: or, how decision making by committee can be a good thing. ( 0,540659374575779 )
J Chem Inf Model - Visualization of molecular fingerprints. ( 0,539258264427727 )
J Chem Inf Model - Prediction of new bioactive molecules using a Bayesian belief network. ( 0,538747981743339 )
J Chem Inf Model - Hsp90 inhibitors, part 2: combining ligand-based and structure-based approaches for virtual screening application. ( 0,538002782660885 )
J Chem Inf Model - Automating knowledge discovery for toxicity prediction using jumping emerging pattern mining. ( 0,535672913662534 )
J Chem Inf Model - Ligand- and structure-based virtual screening for clathrodin-derived human voltage-gated sodium channel modulators. ( 0,535636843571493 )
J Chem Inf Model - Library enhancement through the wisdom of crowds. ( 0,535490141451236 )
J Chem Inf Model - SABRE: ligand/structure-based virtual screening approach using consensus molecular-shape pattern recognition. ( 0,535197532529602 )
IEEE Trans Image Process - General subspace learning with corrupted training data via graph embedding. ( 0,535078193460139 )
J Chem Inf Model - Large-scale assessment of activity landscape feature probabilities of bioactive compounds. ( 0,534932707031967 )
J Chem Inf Model - Assay Related Target Similarity (ARTS) - chemogenomics approach for quantitative comparison of biological targets. ( 0,534545879667789 )
IEEE Trans Pattern Anal Mach Intell - A Prototype Learning Framework Using EMD: Application to Complex Scenes Analysis. ( 0,533478335494644 )
J Chem Inf Model - Fighting obesity with a sugar-based library: discovery of novel MCH-1R antagonists by a new computational-VAST approach for exploration of GPCR binding sites. ( 0,533026896064539 )
J Chem Inf Model - From activity cliffs to activity ridges: informative data structures for SAR analysis. ( 0,53237948550415 )
J Chem Inf Model - Improving similarity-driven library design: customized matching and regioselective feature trees. ( 0,531602833644806 )
J Chem Inf Model - Discovery of new selective human aldose reductase inhibitors through virtual screening multiple binding pocket conformations. ( 0,531342982947539 )
Brief. Bioinformatics - Toward more realistic drug-target interaction predictions. ( 0,530983551437535 )
J Chem Inf Model - Structure based model for the prediction of phospholipidosis induction potential of small molecules. ( 0,530366128062293 )
J Chem Inf Model - Determination of toxicant mode of action by augmented top priority fragment class. ( 0,529669472666229 )
J Chem Inf Model - Identification of descriptors capturing compound class-specific features by mutual information analysis. ( 0,528955066831273 )
J Chem Inf Model - Modeling drug-induced anorexia by molecular topology. ( 0,52877759236205 )
J Chem Inf Model - Exploration of 3D activity cliffs on the basis of compound binding modes and comparison of 2D and 3D cliffs. ( 0,528535492525658 )
J Chem Inf Model - Biologically relevant chemical space navigator: from patent and structure-activity relationship analysis to library acquisition and design. ( 0,526670151827881 )
Comput Math Methods Med - Identification of antioxidants from sequence information using naïve Bayes. ( 0,526072086583413 )
J Chem Inf Model - Compound optimization through data set-dependent chemical transformations. ( 0,523777100228549 )
J Chem Inf Model - Molecular dynamics-based virtual screening: accelerating the drug discovery process by high-performance computing. ( 0,522339263016156 )
J Chem Inf Model - Assessing molecular docking tools for relative biological activity prediction: a case study of triazole HIV-1 NNRTIs. ( 0,52212158004448 )
J Chem Inf Model - Exploring uncharted territories: predicting activity cliffs in structure-activity landscapes. ( 0,521168453049528 )
J Chem Inf Model - ReverseScreen3D: a structure-based ligand matching method to identify protein targets. ( 0,520869884848855 )
J Chem Inf Model - Conserved core substructures in the overlay of protein-ligand complexes. ( 0,520331355333546 )
J Chem Inf Model - Prediction of activity cliffs using support vector machines. ( 0,519949454314245 )
J Chem Inf Model - Template CoMFA applied to 116 biological targets. ( 0,519901805598649 )
J Chem Inf Model - Neighborhood-based prediction of novel active compounds from SAR matrices. ( 0,517772572331938 )
J Chem Inf Model - admetSAR: a comprehensive source and free tool for assessment of chemical ADMET properties. ( 0,517233357965959 )
J Chem Inf Model - Structure-based design and screen of novel inhibitors for class II 3-hydroxy-3-methylglutaryl coenzyme A reductase from Streptococcus pneumoniae. ( 0,51690388956297 )
J. Comput. Biol. - Optimization of combinatorial mutagenesis. ( 0,516737646627164 )
Comput Math Methods Med - Image segmentation and identification of paired antibodies in breast tissue. ( 0,516466095398096 )
Int J Comput Assist Radiol Surg - Multi-contrast unbiased MRI atlas of a Parkinson's disease population. ( 0,515923895595469 )
J Chem Inf Model - NMR spectroscopy-based metabolic profiling of drug-induced changes in vitro can discriminate between pharmacological classes. ( 0,515833386551312 )
Comput. Biol. Med. - A bilateral analysis scheme for false positive reduction in mammogram mass detection. ( 0,515425049118988 )
J Chem Inf Model - Rapid scanning structure-activity relationships in combinatorial data sets: identification of activity switches. ( 0,515122506345682 )
J Chem Inf Model - Insights into molecular basis of cytochrome p450 inhibitory promiscuity of compounds. ( 0,514634833782817 )
J Chem Inf Model - Identification of multitarget activity ridges in high-dimensional bioactivity spaces. ( 0,513973658892414 )
J Chem Inf Model - Discovery of novel histamine H4 and serotonin transporter ligands using the topological feature tree descriptor. ( 0,513333226679382 )
J Chem Inf Model - Jointly handling potency and toxicity of antimicrobial peptidomimetics by simple rules from desirability theory and chemoinformatics. ( 0,512554633509566 )
J Chem Inf Model - Navigating high-dimensional activity landscapes: design and application of the ligand-target differentiation map. ( 0,512440577314208 )
J Chem Inf Model - Fast protein binding site comparison via an index-based screening technology. ( 0,511774104485061 )