J Chem Inf Model - PocketAlign a novel algorithm for aligning binding sites in protein structures.

Topics

{ sequenc(1873) structur(1644) protein(1328) }
{ bind(1733) structur(1185) ligand(1036) }
{ imag(1947) propos(1133) code(1026) }
{ take(945) account(800) differ(722) }
{ method(1219) similar(1157) match(930) }
{ method(984) reconstruct(947) comput(926) }
{ featur(1941) imag(1645) propos(1176) }
{ perform(1367) use(1326) method(1137) }
{ data(1714) softwar(1251) tool(1186) }
{ import(1318) role(1303) understand(862) }
{ can(981) present(881) function(850) }
{ analysi(2126) use(1163) compon(1037) }
{ drug(1928) target(777) effect(648) }
{ extract(1171) text(1153) clinic(932) }
{ data(3963) clinic(1234) research(1004) }
{ health(1844) social(1437) communiti(874) }
{ structur(1116) can(940) graph(676) }
{ data(1737) use(1416) pattern(1282) }
{ perform(999) metric(946) measur(919) }
{ compound(1573) activ(1297) structur(1058) }
{ first(2504) two(1366) second(1323) }
{ use(976) code(926) identifi(902) }
{ studi(2440) review(1878) systemat(933) }
{ control(1307) perform(991) simul(935) }
{ research(1218) medic(880) student(794) }
{ result(1111) use(1088) new(759) }
{ model(3404) distribut(989) bayesian(671) }
{ measur(2081) correl(1212) valu(896) }
{ imag(1057) registr(996) error(939) }
{ featur(3375) classif(2383) classifi(1994) }
{ imag(2830) propos(1344) filter(1198) }
{ motion(1329) object(1292) video(1091) }
{ concept(1167) ontolog(924) domain(897) }
{ research(1085) discuss(1038) issu(1018) }
{ system(1050) medic(1026) inform(1018) }
{ model(2341) predict(2261) use(1141) }
{ visual(1396) interact(850) tool(830) }
{ health(3367) inform(1360) care(1135) }
{ cost(1906) reduc(1198) effect(832) }
{ group(2977) signific(1463) compar(1072) }
{ activ(1138) subject(705) human(624) }
{ time(1939) patient(1703) rate(768) }
{ high(1669) rate(1365) level(1280) }
{ cancer(2502) breast(956) screen(824) }
{ method(1969) cluster(1462) data(1082) }
{ detect(2391) sensit(1101) algorithm(908) }
{ can(774) often(719) complex(702) }
{ inform(2794) health(2639) internet(1427) }
{ system(1976) rule(880) can(841) }
{ network(2748) neural(1063) input(814) }
{ imag(2675) segment(2577) method(1081) }
{ patient(2315) diseas(1263) diabet(1191) }
{ assess(1506) score(1403) qualiti(1306) }
{ treatment(1704) effect(941) patient(846) }
{ surgeri(1148) surgic(1085) robot(1054) }
{ framework(1458) process(801) describ(734) }
{ problem(2511) optim(1539) algorithm(950) }
{ error(1145) method(1030) estim(1020) }
{ chang(1828) time(1643) increas(1301) }
{ learn(2355) train(1041) set(1003) }
{ clinic(1479) use(1117) guidelin(835) }
{ algorithm(1844) comput(1787) effici(935) }
{ method(1557) propos(1049) approach(1037) }
{ design(1359) user(1324) use(1319) }
{ model(2220) cell(1177) simul(1124) }
{ care(1570) inform(1187) nurs(1089) }
{ general(901) number(790) one(736) }
{ search(2224) databas(1162) retriev(909) }
{ case(1353) use(1143) diagnosi(1136) }
{ howev(809) still(633) remain(590) }
{ studi(1410) differ(1259) use(1210) }
{ risk(3053) factor(974) diseas(938) }
{ studi(1119) effect(1106) posit(819) }
{ blood(1257) pressur(1144) flow(957) }
{ spatial(1525) area(1432) region(1030) }
{ record(1888) medic(1808) patient(1693) }
{ model(3480) simul(1196) paramet(876) }
{ monitor(1329) mobil(1314) devic(1160) }
{ ehr(2073) health(1662) electron(1139) }
{ state(1844) use(1261) util(961) }
{ patient(2837) hospit(1953) medic(668) }
{ model(2656) set(1616) predict(1553) }
{ data(2317) use(1299) case(1017) }
{ age(1611) year(1155) adult(843) }
{ medic(1828) order(1363) alert(1069) }
{ signal(2180) analysi(812) frequenc(800) }
{ sampl(1606) size(1419) use(1276) }
{ gene(2352) biolog(1181) express(1162) }
{ data(3008) multipl(1320) sourc(1022) }
{ intervent(3218) particip(2042) group(1664) }
{ patient(1821) servic(1111) care(1106) }
{ use(2086) technolog(871) perceiv(783) }
{ use(1733) differ(960) four(931) }
{ implement(1333) system(1263) develop(1122) }
{ survey(1388) particip(1329) question(1065) }
{ estim(2440) model(1874) function(577) }
{ decis(3086) make(1611) patient(1517) }
{ process(1125) use(805) approach(778) }
{ activ(1452) weight(1219) physic(1104) }
{ method(2212) result(1239) propos(1039) }

Abstract

A fundamental task in bioinformatics involves transferring knowledge from one protein molecule to another by recognizing similarities. Such similarities can be found at different levels: sequence, whole fold, or important substructures. Comparison of binding sites is important for understanding functional similarities among proteins and for understanding drug cross-reactivities. Current methods in the literature have their own merits and demerits, warranting exploration of newer concepts and algorithms, especially for large-scale comparisons and for obtaining accurate residue-wise mappings. Here, we report the development of a new algorithm, PocketAlign, for obtaining structural superpositions of binding sites. The software is available as a web service at http://proline.physics.iisc.ernet.in/pocketalign/. The algorithm encodes shape descriptors in the form of geometric perspectives, supplemented by chemical group classification. The shape descriptor considers several perspectives, each with one residue as the focus, and captures the relative distribution of residues around it in a given site. Residue-wise pairings are computed by comparing the set of perspectives of the first site with that of the second, followed by a greedy approach that incrementally combines residue pairings into a mapping. The mappings in different frames are then evaluated by different metrics encoding the extent of alignment of individual geometric perspectives. Different initial seed alignments are computed, each subsequently extended by detecting consequential atomic alignments in a three-dimensional grid, and the best 500 are stored in a database. Alignments are then ranked, and the top-scoring alignments are reported and streamed into PyMOL for visualization and analysis. The method is validated for accuracy and sensitivity and benchmarked against existing methods.
An advantage of PocketAlign over some of the existing binding site comparison tools in the literature is that it explores different schemes for identifying an alignment and thus has better potential to capture similarities in ligand recognition abilities. By finding a detailed alignment of a pair of sites, PocketAlign provides insight into why two sites are similar and which residues and atoms contribute to the similarity.
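
The perspective-and-greedy-pairing idea described in the abstract can be illustrated with a small sketch. This is not the PocketAlign implementation: the abstract does not give the descriptor format, the scoring metrics, or the chemical group classes, so everything below is an assumption made for illustration. Here a "perspective" is taken to be the sorted list of distances from a focus residue to every other residue in the site, each tagged with a hypothetical chemical group label; the names `perspective`, `perspective_score`, `greedy_mapping`, and the tolerance `tol` are all invented for this example.

```python
import math
from itertools import product

# A site is a list of (chemical_group, (x, y, z)) tuples, one per residue.
# The group labels ("acidic", "basic", "polar") are illustrative only.

def perspective(site, focus):
    """Sorted (distance, group) profile of all other residues seen from `focus`."""
    origin = site[focus][1]
    return sorted(
        (math.dist(origin, xyz), group)
        for i, (group, xyz) in enumerate(site)
        if i != focus
    )

def perspective_score(p, q, tol=0.5):
    """Count entries of p that find an unused entry of q with the same
    group and a distance within `tol` (assumed similarity measure)."""
    used = set()
    score = 0
    for d1, g1 in p:
        for j, (d2, g2) in enumerate(q):
            if j not in used and g1 == g2 and abs(d1 - d2) <= tol:
                used.add(j)
                score += 1
                break
    return score

def greedy_mapping(site_a, site_b, tol=0.5):
    """Score every same-group residue pair by its perspectives, then
    greedily accept the best-scoring pairs that do not reuse a residue."""
    candidates = []
    for i, j in product(range(len(site_a)), range(len(site_b))):
        if site_a[i][0] != site_b[j][0]:
            continue  # only pair residues of the same chemical group
        s = perspective_score(perspective(site_a, i), perspective(site_b, j), tol)
        candidates.append((s, i, j))
    candidates.sort(key=lambda t: (-t[0], t[1], t[2]))
    mapping, used_a, used_b = [], set(), set()
    for s, i, j in candidates:
        if i not in used_a and j not in used_b:
            mapping.append((i, j))
            used_a.add(i)
            used_b.add(j)
    return sorted(mapping)

# Toy example: site_b is site_a translated by (10, -2, 5) and reordered.
site_a = [
    ("acidic", (0.0, 0.0, 0.0)),
    ("basic", (3.0, 0.0, 0.0)),
    ("polar", (0.0, 4.0, 0.0)),
    ("acidic", (0.0, 0.0, 7.0)),
]
site_b = [
    ("acidic", (10.0, -2.0, 12.0)),
    ("polar", (10.0, 2.0, 5.0)),
    ("acidic", (10.0, -2.0, 5.0)),
    ("basic", (13.0, -2.0, 5.0)),
]
print(greedy_mapping(site_a, site_b))  # [(0, 2), (1, 3), (2, 1), (3, 0)]
```

Because the descriptor uses only inter-residue distances, it is unchanged by translation and rotation, which is why the toy mapping is recovered without any prior superposition. A real method would still need the later steps the abstract mentions (evaluating mappings in different frames, extending seed alignments on a grid, and ranking), none of which this sketch attempts.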

Cleaned Abstract

fundament task bioinformat involv transfer knowledg one protein molecul onto anoth way recogn similar similar obtain differ level sequenc whole fold import substructur comparison bind site import understand function similar among protein also understand drug crossreact current method literatur merit demerit warrant explor newer concept algorithm especi largescal comparison obtain accur residuewis map report develop new algorithm pocketalign obtain structur superposit bind site softwar avail webservic httpprolinephysicsiiscernetinpocketalign algorithm encod shape descriptor form geometr perspect supplement chemic group classif shape descriptor consid sever perspect residu focus captur relat distribut residu around given site residuewis pair comput compar set perspect first site second follow greedi approach increment combin residu pair map map differ frame evalu differ metric encod extent align individu geometr perspect differ initi seed align comput subsequ extend detect consequenti atom align threedimension grid best store databas align rank top score align report stream pymol visual analys method valid accuraci sensit benchmark exist method advantag pocketalign compar exist tool avail bind site comparison literatur explor differ scheme identifi align thus better potenti captur similar ligand recognit abil pocketalign find detail align pair site provid insight two site similar set residu atom contribut similar

Similar Abstracts

J Chem Inf Model - Molecular determinants of Bim(BH3) peptide binding to pro-survival proteins. ( 0,755646029876367 )
J. Comput. Biol. - Sequence alignment of viral channel proteins with cellular ion channels. ( 0,749977306120845 )
Comput Biol Chem - ProSTRIP: A method to find similar structural repeats in three-dimensional protein structures. ( 0,741740937498691 )
Comput Biol Chem - Multi-nucleation and vectorial folding pathways of large helix protein. ( 0,733752923699017 )
J Chem Inf Model - Protein secondary structure classification revisited: processing DSSP information with PSSC. ( 0,72947860715687 )
J Chem Inf Model - TRAPP: a tool for analysis of transient binding pockets in proteins. ( 0,721440433781027 )
J Chem Inf Model - Tertiary structure prediction of RNA-RNA complexes using a secondary structure and fragment-based method. ( 0,721301160522228 )
Comput Math Methods Med - DV-curve representation of protein sequences and its application. ( 0,718132498717532 )
J. Comput. Biol. - Combinatorics of γ-structures. ( 0,714237077054857 )
J Chem Inf Model - Comparative analysis of threshold and tessellation methods for determining protein contacts. ( 0,714108792613252 )
Comput Biol Chem - ProCoCoA: A quantitative approach for analyzing protein core composition. ( 0,708218676875624 )
J Chem Inf Model - Proteins as sponges: a statistical journey along protein structure organization principles. ( 0,705110735251466 )
J Chem Inf Model - Ligand binding site detection by local structure alignment and its performance complementarity. ( 0,701824415469318 )
Comput. Biol. Med. - Remote homology detection incorporating the context of physicochemical properties. ( 0,701590290467295 )
Comput Biol Chem - Protein fold recognition based on functional domain composition. ( 0,700902050558873 )
Med Biol Eng Comput - The influence of alignment-free sequence representations on the semi-supervised classification of class C G protein-coupled receptors: semi-supervised classification of class C GPCRs. ( 0,699695876818578 )
J Chem Inf Model - Functional prediction of binding pockets. ( 0,699433910479645 )
J Chem Inf Model - Sequence, structure, and active site analyses of p38 MAP kinase: exploiting DFG-out conformation as a strategy to design new type II leads. ( 0,697180936826534 )
Comput Biol Chem - Relationship between global structural parameters and Enzyme Commission hierarchy: implications for function prediction. ( 0,696994564717945 )
Brief. Bioinformatics - Systematic identification of Class I HDAC substrates. ( 0,694885207932482 )
Comput. Biol. Med. - Application of 2D graphic representation of protein sequence based on Huffman tree method. ( 0,691092554007375 )
J Chem Inf Model - Modules identification in protein structures: the topological and geometrical solutions. ( 0,691027365821218 )
Comput Biol Chem - Identification of putative and potential cross-reactive chickpea (Cicer arietinum) allergens through an in silico approach. ( 0,690219131113519 )
Comput. Biol. Med. - LRRsearch: An asynchronous server-based application for the prediction of leucine-rich repeat motifs and an integrative database of NOD-like receptors. ( 0,689781756920587 )
Comput Biol Chem - Computational insight into nitration of human myoglobin. ( 0,689247540344311 )
J. Comput. Biol. - ComB: SNP calling and mapping analysis for color and nucleotide space platforms. ( 0,688792314652484 )
J Chem Inf Model - Building a knowledge-based statistical potential by capturing high-order inter-residue interactions and its applications in protein secondary structure assessment. ( 0,688726397977321 )
Brief. Bioinformatics - A practical guide for the computational selection of residues to be experimentally characterized in protein families. ( 0,688137387763212 )
J Chem Inf Model - Protein secondary structure prediction with SPARROW. ( 0,687151426386961 )
Comput Biol Chem - The challenge of annotating protein sequences: The tale of eight domains of unknown function in Pfam. ( 0,686552174877931 )
Brief. Bioinformatics - Taxonomic binning of metagenome samples generated by next-generation sequencing technologies. ( 0,686434329823938 )
Brief. Bioinformatics - Identify drug repurposing candidates by mining the protein data bank. ( 0,685885484939685 )
J Integr Bioinform - Exceptional single strand DNA word symmetry: analysis of evolutionary potentialities. ( 0,68093417431298 )
J Integr Bioinform - Prediction of thioredoxin and glutaredoxin target proteins by identifying reversibly oxidized cysteinyl residues. ( 0,680678827443449 )
Comput Biol Chem - Predicting protein-RNA interaction amino acids using random forest based on submodularity subset selection. ( 0,679136464662079 )
Comput Biol Chem - Colonic amyloidosis, computational analysis of the major amyloidogenic species, Serum Amyloid A. ( 0,677860048912435 )
Curr Protoc Bioinformatics - An Overview of RNA Sequence Analyses: Structure Prediction, ncRNA Gene Identification, and RNAi Design. ( 0,676687971515875 )
J. Comput. Biol. - Optimization of profile-to-profile alignment parameters for one-dimensional threading. ( 0,676166479757197 )
Comput Methods Programs Biomed - DNA encoding for an efficient 'Omics processing. ( 0,675651562144357 )
Comput Math Methods Med - Identification of antioxidants from sequence information using naïve Bayes. ( 0,674881905998771 )
J. Comput. Biol. - Nonparametric combinatorial sequence models. ( 0,674535145053655 )
Comput Biol Chem - Characterizing regions in the human genome unmappable by next-generation-sequencing at the read length of 1000 bases. ( 0,673877428798451 )
Med Biol Eng Comput - Enhanced spatio-temporal alignment of plantar pressure image sequences using B-splines. ( 0,673406875793948 )
Comput Biol Chem - Human-chimpanzee alignment: ortholog exponentials and paralog power laws. ( 0,673321414356001 )
J Chem Inf Model - Cavities tell more than sequences: exploring functional relationships of proteases via binding pockets. ( 0,672403267820971 )
Brief. Bioinformatics - De novo assembly of short sequence reads. ( 0,672217310032102 )
Comput Biol Chem - Analysis of sequence repeats of proteins in the PDB. ( 0,671392804261584 )
Comput Biol Chem - Identification and characterization of lysine-methylated sites on histones and non-histone proteins. ( 0,671260716738037 )
J Chem Inf Model - ProBiS-database: precalculated binding site similarities and local pairwise alignments of PDB structures. ( 0,670614161474023 )
Comput Biol Chem - A local average connectivity-based method for identifying essential proteins from the network level. ( 0,670493700070723 )
Comput. Biol. Med. - Prediction of protein functions based on function-function correlation relations. ( 0,669815427857728 )
J Am Med Inform Assoc - HUGO: Hierarchical mUlti-reference Genome cOmpression for aligned reads. ( 0,669064768789364 )
Comput Biol Chem - A novel empirical mutual information approach to identify co-evolving amino acid positions of influenza A viruses. ( 0,668808465295898 )
BMC Med Inform Decis Mak - Efficient protein structure search using indexing methods. ( 0,668400951381776 )
Comput Biol Chem - Computational determination of the orientation of a heat repeat-like domain of DNA-PKcs. ( 0,668050304087356 )
Brief. Bioinformatics - New developments of alignment-free sequence comparison: measures, statistics and next-generation sequencing. ( 0,666825632248109 )
J. Comput. Biol. - Emergent protein folding modeled with evolved neural cellular automata using the 3D HP model. ( 0,666666932873275 )
J. Comput. Biol. - Tracing the most parsimonious indel history. ( 0,665398181264226 )
Comput Biol Chem - Identical sequence patterns in the ends of exons and introns of human protein-coding genes. ( 0,663378390848379 )
J. Comput. Biol. - Evaluating, comparing, and interpreting protein domain hierarchies. ( 0,662821443702046 )
Curr Protoc Bioinformatics - Using the MEROPS Database for Proteolytic Enzymes and Their Inhibitors and Substrates. ( 0,662706903888271 )
Comput Biol Chem - Protein folding simulations of 2D HP model by the genetic algorithm based on optimal secondary structures. ( 0,662213505985629 )
Comput. Biol. Med. - Improving protein complex classification accuracy using amino acid composition profile. ( 0,661594740731264 )
J Chem Inf Model - Protein structural statistics with PSS. ( 0,660600185899761 )
Comput Biol Chem - Analysis and recognition of the GAGA transcription factor binding sites in Drosophila genes. ( 0,659834249357933 )
Comput. Biol. Med. - Predicting protein-binding RNA nucleotides using the feature-based removal of data redundancy and the interaction propensity of nucleotide triplets. ( 0,659774649225025 )
Comput. Biol. Med. - miRClassify: an advanced web server for miRNA family classification and annotation. ( 0,65932797286038 )
Comput. Biol. Med. - An insight into the molecular basis for convergent evolution in fish antifreeze Proteins. ( 0,656055868373415 )
Comput Math Methods Med - Quad-PRE: a hybrid method to predict protein quaternary structure attributes. ( 0,65571517274649 )
Comput Math Methods Med - ProBLM web server: protein and membrane placement and orientation package. ( 0,655686874660443 )
Brief. Bioinformatics - Ortholog identification in the presence of domain architecture rearrangement. ( 0,655217978778198 )
Comput Biol Chem - Bacterial protein structures reveal phylum dependent divergence. ( 0,654489453460995 )
Comput. Biol. Med. - Improving protein secondary structure prediction using a multi-modal BP method. ( 0,654340074632166 )
Comput Biol Chem - Tri-peptide reference structures for the calculation of relative solvent accessible surface area in protein amino acid residues. ( 0,654259263017638 )
J. Comput. Biol. - Simultaneous alignment and folding of protein sequences. ( 0,653881081414577 )
Brief. Bioinformatics - BamView: visualizing and interpretation of next-generation sequencing read alignments. ( 0,653715543581674 )
J Integr Bioinform - Predicting protein distance maps according to physicochemical properties. ( 0,652737507636469 )
J. Comput. Biol. - Ray: simultaneous assembly of reads from a mix of high-throughput sequencing technologies. ( 0,652565455911756 )
Comput Biol Chem - Predicting protein-protein interactions using graph invariants and a neural network. ( 0,651567558919846 )
Comput Biol Chem - Statistical analysis and exposure status classification of transmembrane beta barrel residues. ( 0,648649112479531 )
Brief. Bioinformatics - Alpha shape and Delaunay triangulation in studies of protein-related interactions. ( 0,648389299908718 )
Comput Biol Chem - Protein function prediction using neighbor relativity in protein-protein interaction network. ( 0,647482034312288 )
J Chem Inf Model - Parallel and antiparallel β-strands differ in amino acid composition and availability of short constituent sequences. ( 0,646745727538861 )
IEEE Trans Image Process - Low latency secondary transforms for intra/inter prediction residual. ( 0,64478330041612 )
J Chem Inf Model - Self-contained sequence representation: bridging the gap between bioinformatics and cheminformatics. ( 0,642458828156259 )
Sci Data - Long-read, whole-genome shotgun sequence data for five model organisms. ( 0,64220492151377 )
J Integr Bioinform - Complementarity of network and sequence information in homologous proteins. ( 0,64168031837076 )
J Chem Inf Model - MetalS2: a tool for the structural alignment of minimal functional sites in metal-binding proteins and nucleic acids. ( 0,640989186580963 )
Sci Data - A repository of assays to quantify 10,000 human proteins by SWATH-MS. ( 0,640893759327403 )
Comput Math Methods Med - Analyzing effects of naturally occurring missense mutations. ( 0,640214874750383 )
Comput. Biol. Med. - ExonSuite: algorithmically optimizing alternative gene splicing for the PUF proteins. ( 0,639353808094912 )
J Chem Inf Model - Kink characterization and modeling in transmembrane protein structures. ( 0,639089544045144 )
Comput Biol Chem - Practical halving; the Nelumbo nucifera evidence on early eudicot evolution. ( 0,638798572377419 )
J. Comput. Biol. - An automaton approach for waiting times in DNA evolution. ( 0,63790020216017 )
Brief. Bioinformatics - Phylogenetic-based propagation of functional annotations within the Gene Ontology consortium. ( 0,637653300471135 )
Comput Biol Chem - The frequency of poly(G) tracts in the human genome and their use as a sensor of DNA damage. ( 0,637407370751356 )
J Chem Inf Model - Performance of protein-ligand docking with simulated chemical shift perturbations. ( 0,637140170746548 )
Comput Biol Chem - In silico characterization and evolutionary analyses of CCAAT binding proteins in the lycophyte plant Selaginella moellendorffii genome: a growing comparative genomics resource. ( 0,637049822914693 )
Brief. Bioinformatics - Calculating transcription factor binding maps for chromatin. ( 0,635721497436776 )
J Chem Inf Model - Binding region of alanopine dehydrogenase predicted by unbiased molecular dynamics simulations of ligand diffusion. ( 0,635368813126304 )