J Integr Bioinform - Complementarity of network and sequence information in homologous proteins.

Topics

{ sequenc(1873) structur(1644) protein(1328) }
{ network(2748) neural(1063) input(814) }
{ method(1219) similar(1157) match(930) }
{ search(2224) databas(1162) retriev(909) }
{ structur(1116) can(940) graph(676) }
{ chang(1828) time(1643) increas(1301) }
{ method(984) reconstruct(947) comput(926) }
{ studi(1410) differ(1259) use(1210) }
{ perform(999) metric(946) measur(919) }
{ can(981) present(881) function(850) }
{ can(774) often(719) complex(702) }
{ group(2977) signific(1463) compar(1072) }
{ system(1976) rule(880) can(841) }
{ treatment(1704) effect(941) patient(846) }
{ concept(1167) ontolog(924) domain(897) }
{ howev(809) still(633) remain(590) }
{ studi(1119) effect(1106) posit(819) }
{ state(1844) use(1261) util(961) }
{ research(1218) medic(880) student(794) }
{ signal(2180) analysi(812) frequenc(800) }
{ gene(2352) biolog(1181) express(1162) }
{ use(976) code(926) identifi(902) }
{ measur(2081) correl(1212) valu(896) }
{ error(1145) method(1030) estim(1020) }
{ control(1307) perform(991) simul(935) }
{ general(901) number(790) one(736) }
{ featur(1941) imag(1645) propos(1176) }
{ system(1050) medic(1026) inform(1018) }
{ import(1318) role(1303) understand(862) }
{ visual(1396) interact(850) tool(830) }
{ compound(1573) activ(1297) structur(1058) }
{ perform(1367) use(1326) method(1137) }
{ data(2317) use(1299) case(1017) }
{ age(1611) year(1155) adult(843) }
{ data(3008) multipl(1320) sourc(1022) }
{ first(2504) two(1366) second(1323) }
{ use(2086) technolog(871) perceiv(783) }
{ drug(1928) target(777) effect(648) }
{ survey(1388) particip(1329) question(1065) }
{ decis(3086) make(1611) patient(1517) }
{ method(1969) cluster(1462) data(1082) }
{ method(2212) result(1239) propos(1039) }
{ model(3404) distribut(989) bayesian(671) }
{ imag(1947) propos(1133) code(1026) }
{ data(1737) use(1416) pattern(1282) }
{ inform(2794) health(2639) internet(1427) }
{ imag(1057) registr(996) error(939) }
{ bind(1733) structur(1185) ligand(1036) }
{ featur(3375) classif(2383) classifi(1994) }
{ imag(2830) propos(1344) filter(1198) }
{ imag(2675) segment(2577) method(1081) }
{ patient(2315) diseas(1263) diabet(1191) }
{ take(945) account(800) differ(722) }
{ studi(2440) review(1878) systemat(933) }
{ motion(1329) object(1292) video(1091) }
{ assess(1506) score(1403) qualiti(1306) }
{ surgeri(1148) surgic(1085) robot(1054) }
{ framework(1458) process(801) describ(734) }
{ problem(2511) optim(1539) algorithm(950) }
{ learn(2355) train(1041) set(1003) }
{ clinic(1479) use(1117) guidelin(835) }
{ algorithm(1844) comput(1787) effici(935) }
{ extract(1171) text(1153) clinic(932) }
{ method(1557) propos(1049) approach(1037) }
{ data(1714) softwar(1251) tool(1186) }
{ design(1359) user(1324) use(1319) }
{ model(2220) cell(1177) simul(1124) }
{ care(1570) inform(1187) nurs(1089) }
{ case(1353) use(1143) diagnosi(1136) }
{ data(3963) clinic(1234) research(1004) }
{ risk(3053) factor(974) diseas(938) }
{ research(1085) discuss(1038) issu(1018) }
{ model(2341) predict(2261) use(1141) }
{ blood(1257) pressur(1144) flow(957) }
{ spatial(1525) area(1432) region(1030) }
{ record(1888) medic(1808) patient(1693) }
{ health(3367) inform(1360) care(1135) }
{ model(3480) simul(1196) paramet(876) }
{ monitor(1329) mobil(1314) devic(1160) }
{ ehr(2073) health(1662) electron(1139) }
{ patient(2837) hospit(1953) medic(668) }
{ model(2656) set(1616) predict(1553) }
{ medic(1828) order(1363) alert(1069) }
{ cost(1906) reduc(1198) effect(832) }
{ sampl(1606) size(1419) use(1276) }
{ intervent(3218) particip(2042) group(1664) }
{ activ(1138) subject(705) human(624) }
{ time(1939) patient(1703) rate(768) }
{ patient(1821) servic(1111) care(1106) }
{ analysi(2126) use(1163) compon(1037) }
{ health(1844) social(1437) communiti(874) }
{ high(1669) rate(1365) level(1280) }
{ cancer(2502) breast(956) screen(824) }
{ use(1733) differ(960) four(931) }
{ result(1111) use(1088) new(759) }
{ implement(1333) system(1263) develop(1122) }
{ estim(2440) model(1874) function(577) }
{ process(1125) use(805) approach(778) }
{ activ(1452) weight(1219) physic(1104) }
{ detect(2391) sensit(1101) algorithm(908) }

Abstract

Traditional approaches to homology detection rely on finding sufficient similarity between protein sequences. Studies have shown that non-sequence sources of biological information, such as secondary or tertiary molecular structure, can yield certain types of biological knowledge when sequence-based approaches fail. Motivated by these studies, we hypothesize that protein-protein interaction (PPI) network topology and protein sequence might give insights into different slices of biological information. Since proteins aggregate to perform a function rather than acting in isolation, analyzing the complex wiring around a protein in a PPI network could give deeper insight into the protein's role in the inner workings of the cell than analyzing the sequences of individual genes. Hence, we believe that one could lose much information by focusing on sequence information alone. We examine whether, and to what extent, the information about homologous proteins captured by PPI network topology differs from the information captured by their sequences. We measure how similar the topology around homologous proteins in a PPI network is and show that such proteins have statistically significantly higher network similarity than nonhomologous proteins. We compare these network similarity trends of homologous proteins with the trends in their sequence identity and find that network similarities uncover almost as much homology as sequence identities. Although neither of the two methods, network topology and sequence identity, seems to capture homology information in its entirety, we demonstrate that the two might give insights into somewhat different types of biological information, as the overlap of the homology information that they uncover is relatively low. Therefore, we conclude that similarities of proteins' topological neighborhoods in a PPI network could be used as a complementary method to sequence-based approaches for identifying homologs, as well as for analyzing evolutionary distance and functional divergence of homologous proteins.
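The comparison described in the abstract can be pictured with a small Python sketch. The abstract does not say which network similarity measure the authors used, so the local signature here (degree, triangle count, size of the two-hop neighborhood) is an assumption chosen for brevity. The protein names, the toy network, and the two hit sets are hypothetical as well.

```python
from itertools import combinations
from statistics import mean


def build_graph(edges):
    """Return an undirected adjacency map {node: set(neighbors)}."""
    adj = {}
    for u, v in edges:
        if u == v:
            continue  # ignore self-interactions
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def signature(adj, node):
    """Toy local-topology signature (an assumption, not the paper's measure):
    (degree, triangles through the node, number of two-hop neighbors)."""
    nbrs = adj[node]
    degree = len(nbrs)
    triangles = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
    two_hop = set()
    for n in nbrs:
        two_hop |= adj[n]
    two_hop -= nbrs | {node}
    return (degree, triangles, len(two_hop))


def signature_similarity(sig_a, sig_b):
    """Similarity in [0, 1]; 1.0 means the two signatures are identical."""
    dists = [abs(a - b) / max(a, b) if max(a, b) > 0 else 0.0
             for a, b in zip(sig_a, sig_b)]
    return 1.0 - mean(dists)


def jaccard(set_a, set_b):
    """Overlap between two sets of protein pairs flagged as homologous."""
    union = set_a | set_b
    return len(set_a & set_b) / len(union) if union else 0.0


# Hypothetical toy PPI network: A and E sit in identically wired neighborhoods.
edges = [("A", "B"), ("A", "C"), ("B", "C"), ("A", "D"),
         ("E", "F"), ("E", "G"), ("F", "G"), ("E", "H"),
         ("D", "I"), ("H", "I")]
adj = build_graph(edges)

sim_homologs = signature_similarity(signature(adj, "A"), signature(adj, "E"))
sim_other = signature_similarity(signature(adj, "A"), signature(adj, "I"))

# Hypothetical pairs recovered by each method; a low Jaccard value would
# correspond to the "relatively low" overlap reported in the abstract.
network_hits = {("A", "E"), ("B", "F")}
sequence_hits = {("A", "E"), ("C", "G")}
overlap = jaccard(network_hits, sequence_hits)
```

In this toy setting the pair (A, E) scores higher than (A, I) because the two neighborhoods are wired identically. A real analysis would also need a statistical test over many homologous and nonhomologous pairs, which this sketch leaves out.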

Cleaned Abstract

tradit approach homolog detect reli find suffici similar protein sequenc motiv studi demonstr nonsequ base sourc biolog inform secondari tertiari molecular structur can extract certain type biolog knowledg sequencebas approach fail hypothes proteinprotein interact ppi network topolog protein sequenc might give insight differ slice biolog inform sinc protein aggreg perform function instead act isol analyz complex wire around protein ppi network give deeper insight protein role inner work cell analyz sequenc individu gene henc believ one lose much inform focus sequenc inform alon examin whether inform homolog protein captur ppi network topolog differ extent inform captur sequenc measur similar topolog around homolog protein ppi network show protein statist signific higher network similar nonhomolog protein compar network similar trend homolog protein trend sequenc ident find network similar uncov almost much homolog sequenc ident although none two method network topolog sequenc ident seem captur homolog inform entireti demonstr two might give insight somewhat differ type biolog inform overlap homolog inform uncov relat low therefor conclud similar protein topolog neighborhood ppi network use complementari method sequencebas approach identifi homolog well analyz evolutionari distanc function diverg homolog protein

Similar Abstracts

Comput Methods Programs Biomed - Protein secondary structure prediction using modular reciprocal bidirectional recurrent neural networks. ( 0,835777855368652 )
J Biomed Inform - A similarity network approach for the analysis and comparison of protein sequence/structure sets. ( 0,80388266700398 )
J Chem Inf Model - Modules identification in protein structures: the topological and geometrical solutions. ( 0,798946001315232 )
J Chem Inf Model - Comparative analysis of threshold and tessellation methods for determining protein contacts. ( 0,789957373973801 )
J. Comput. Biol. - Evaluating, comparing, and interpreting protein domain hierarchies. ( 0,775464311777612 )
Comput Biol Chem - Identification of putative and potential cross-reactive chickpea (Cicer arietinum) allergens through an in silico approach. ( 0,770678410463112 )
Brief. Bioinformatics - New developments of alignment-free sequence comparison: measures, statistics and next-generation sequencing. ( 0,770103105472763 )
Comput Biol Chem - Analysis of sequence repeats of proteins in the PDB. ( 0,7689407107074 )
Comput Biol Chem - Semantically predicting protein functions based on protein functional connectivity. ( 0,759518950642186 )
Comput Biol Chem - A local average connectivity-based method for identifying essential proteins from the network level. ( 0,758523141467998 )
J Chem Inf Model - ProBiS-database: precalculated binding site similarities and local pairwise alignments of PDB structures. ( 0,753832422863215 )
Comput Biol Chem - ProCoCoA: A quantitative approach for analyzing protein core composition. ( 0,748341061702302 )
J. Comput. Biol. - Combinatorics of γ-structures. ( 0,748257251827764 )
Comput Math Methods Med - DV-curve representation of protein sequences and its application. ( 0,747358986395062 )
Comput Biol Chem - Bacterial protein structures reveal phylum dependent divergence. ( 0,745337469292433 )
Comput. Biol. Med. - An insight into the molecular basis for convergent evolution in fish antifreeze Proteins. ( 0,744059255497685 )
J. Comput. Biol. - Optimization of profile-to-profile alignment parameters for one-dimensional threading. ( 0,743163910294898 )
J Chem Inf Model - Tertiary structure prediction of RNA-RNA complexes using a secondary structure and fragment-based method. ( 0,743049809299325 )
J. Comput. Biol. - Nonparametric combinatorial sequence models. ( 0,740986606906186 )
Comput Biol Chem - ProSTRIP: A method to find similar structural repeats in three-dimensional protein structures. ( 0,739351324710469 )
Brief. Bioinformatics - BamView: visualizing and interpretation of next-generation sequencing read alignments. ( 0,73872995003443 )
J. Comput. Biol. - LB3D: a protein three-dimensional substructure search program based on the lower bound of a root mean square deviation value. ( 0,73799875768479 )
Comput Biol Chem - Characterizing regions in the human genome unmappable by next-generation-sequencing at the read length of 1000 bases. ( 0,735481490977681 )
J Chem Inf Model - Protein secondary structure prediction with SPARROW. ( 0,7344198183843 )
BMC Med Inform Decis Mak - Efficient protein structure search using indexing methods. ( 0,733014076059879 )
Brief. Bioinformatics - De novo assembly of short sequence reads. ( 0,731306582033171 )
Comput Biol Chem - Computational insight into nitration of human myoglobin. ( 0,730243029274641 )
Comput Biol Chem - The frequency of poly(G) tracts in the human genome and their use as a sensor of DNA damage. ( 0,728411796118898 )
Comput. Biol. Med. - A content and structural assessment of oxidative motifs across a diverse set of life forms. ( 0,726090450818149 )
Comput Biol Chem - Protein fold recognition based on functional domain composition. ( 0,725578855644096 )
J Chem Inf Model - Protein secondary structure classification revisited: processing DSSP information with PSSC. ( 0,72509406639493 )
J. Comput. Biol. - IDBA-MTP: A Hybrid Metatranscriptomic Assembler Based on Protein Information. ( 0,724620181329152 )
Comput. Biol. Med. - Application of 2D graphic representation of protein sequence based on Huffman tree method. ( 0,722474703342405 )
Brief. Bioinformatics - Systematic identification of Class I HDAC substrates. ( 0,72199834677322 )
J. Comput. Biol. - Emergent protein folding modeled with evolved neural cellular automata using the 3D HP model. ( 0,721544634905938 )
Comput Biol Chem - Computational determination of the orientation of a heat repeat-like domain of DNA-PKcs. ( 0,720988127380666 )
J. Comput. Biol. - Simultaneous alignment and folding of protein sequences. ( 0,720639077014756 )
Comput Math Methods Med - Identification of antioxidants from sequence information using naïve Bayes. ( 0,720185599421889 )
Comput Biol Chem - Statistical analysis and exposure status classification of transmembrane beta barrel residues. ( 0,717673852689277 )
Comput Biol Chem - Predicting protein-protein interactions using graph invariants and a neural network. ( 0,717560203079434 )
Comput Biol Chem - Human-chimpanzee alignment: ortholog exponentials and paralog power laws. ( 0,717507997622358 )
Brief. Bioinformatics - Taxonomic binning of metagenome samples generated by next-generation sequencing technologies. ( 0,716937415900049 )
J Chem Inf Model - Improved helix and kink characterization in membrane proteins allows evaluation of kink sequence predictors. ( 0,713298139677206 )
J Chem Inf Model - Parallel and antiparallel β-strands differ in amino acid composition and availability of short constituent sequences. ( 0,71226501808949 )
Sci Data - Comprehensive analysis of the venom gland transcriptome of the spider Dolomedes fimbriatus. ( 0,711236854407282 )
Comput. Biol. Med. - Improving protein secondary structure prediction using a multi-modal BP method. ( 0,711177322007951 )
J Integr Bioinform - Prediction of thioredoxin and glutaredoxin target proteins by identifying reversibly oxidized cysteinyl residues. ( 0,710534758137827 )
Sci Data - Genomes of diverse isolates of the marine cyanobacterium Prochlorococcus. ( 0,708893358297907 )
Curr Protoc Bioinformatics - Using the RNAstructure Software Package to Predict Conserved RNA Structures. ( 0,704403457412427 )
Comput. Biol. Med. - miRClassify: an advanced web server for miRNA family classification and annotation. ( 0,70438347029593 )
Comput Math Methods Med - Uses of phage display in agriculture: sequence analysis and comparative modeling of late embryogenesis abundant client proteins suggest protein-nucleic acid binding functionality. ( 0,702791165737609 )
Brief. Bioinformatics - Ortholog identification in the presence of domain architecture rearrangement. ( 0,698296115374781 )
J Chem Inf Model - Proteins as sponges: a statistical journey along protein structure organization principles. ( 0,69631191634948 )
J Integr Bioinform - Predicting protein distance maps according to physicochemical properties. ( 0,695554895734421 )
Comput Biol Chem - Large replication skew domains delimit GC-poor gene deserts in human. ( 0,695500827906241 )
J. Comput. Biol. - ComB: SNP calling and mapping analysis for color and nucleotide space platforms. ( 0,695370457392142 )
J. Comput. Biol. - Reconstructing the history of large-scale genomic changes: biological questions and computational challenges. ( 0,695087625270986 )
Comput. Biol. Med. - New layers in understanding and predicting α-linolenic acid content in plants using amino acid characteristics of omega-3 fatty acid desaturase. ( 0,693909580429039 )
J Chem Inf Model - Building a knowledge-based statistical potential by capturing high-order inter-residue interactions and its applications in protein secondary structure assessment. ( 0,692996421501125 )
J. Comput. Biol. - Statistical significance of threading scores. ( 0,689307096439987 )
J Chem Inf Model - MetalS2: a tool for the structural alignment of minimal functional sites in metal-binding proteins and nucleic acids. ( 0,689237688587229 )
Comput Methods Programs Biomed - Can computational biology improve the phylogenetic analysis of insulin? ( 0,688164238916747 )
Comput. Biol. Med. - A context evaluation approach for structural comparison of proteins using cross entropy over n-gram modelling. ( 0,687917167315046 )
Brief. Bioinformatics - Base-calling for next-generation sequencing platforms. ( 0,687335014697653 )
Comput Biol Chem - Understanding the general packing rearrangements required for successful template based modeling of protein structure from a CASP experiment. ( 0,687027361918041 )
Comput. Biol. Med. - A protein mapping method based on physicochemical properties and dimension reduction. ( 0,685627792672869 )
Comput Biol Chem - Multi-nucleation and vectorial folding pathways of large helix protein. ( 0,685254776125104 )
Comput Biol Chem - The challenge of annotating protein sequences: The tale of eight domains of unknown function in Pfam. ( 0,684485319774042 )
J Chem Inf Model - Kink characterization and modeling in transmembrane protein structures. ( 0,684358209363143 )
Comput Biol Chem - A novel empirical mutual information approach to identify co-evolving amino acid positions of influenza A viruses. ( 0,683297297160585 )
Med Biol Eng Comput - The influence of alignment-free sequence representations on the semi-supervised classification of class C G protein-coupled receptors: semi-supervised classification of class C GPCRs. ( 0,682846534663124 )
Comput. Biol. Med. - Structural alphabet motif discovery and a structural motif database. ( 0,682250101843306 )
Brief. Bioinformatics - A practical guide for the computational selection of residues to be experimentally characterized in protein families. ( 0,682174726315597 )
J Integr Bioinform - Exceptional single strand DNA word symmetry: analysis of evolutionary potentialities. ( 0,681629610109786 )
Comput. Biol. Med. - Prediction of protein functions based on function-function correlation relations. ( 0,681065444357448 )
Comput Biol Chem - Identification and characterization of lysine-methylated sites on histones and non-histone proteins. ( 0,679434891696572 )
Brief. Bioinformatics - Phylogenetic-based propagation of functional annotations within the Gene Ontology consortium. ( 0,679274977160437 )
J. Comput. Biol. - Mapping reads on a genomic sequence: an algorithmic overview and a practical comparative analysis. ( 0,678089023343033 )
Comput Math Methods Med - Quad-PRE: a hybrid method to predict protein quaternary structure attributes. ( 0,677436845573718 )
J Chem Inf Model - Protein structural statistics with PSS. ( 0,674553735312591 )
Brief. Bioinformatics - Identifying protein complexes and functional modules--from static PPI networks to dynamic PPI networks. ( 0,674154920299911 )
Comput. Biol. Med. - Predicting protein-binding RNA nucleotides using the feature-based removal of data redundancy and the interaction propensity of nucleotide triplets. ( 0,673002561105574 )
Comput Biol Chem - Identical sequence patterns in the ends of exons and introns of human protein-coding genes. ( 0,671839767415247 )
Comput Biol Chem - Error compensation of tRNA misacylation by codon-anticodon mismatch prevents translational amino acid misinsertion. ( 0,670713442042625 )
J Integr Bioinform - A hierarchical approach to protein fold prediction. ( 0,669362359539696 )
J. Comput. Biol. - Sequence alignment of viral channel proteins with cellular ion channels. ( 0,666861001801728 )
J. Comput. Biol. - Smoothing 3D protein structure motifs through graph mining and amino acid similarities. ( 0,665480489006342 )
J Chem Inf Model - Context-based features enhance protein secondary structure prediction accuracy. ( 0,663181418980413 )
Med Biol Eng Comput - Enhanced spatio-temporal alignment of plantar pressure image sequences using B-splines. ( 0,662780524647131 )
Sci Data - Long-read, whole-genome shotgun sequence data for five model organisms. ( 0,659785741849378 )
J. Comput. Biol. - Statistical significance of optical map alignments. ( 0,658284355664603 )
Curr Protoc Bioinformatics - Comparative Protein Structure Modeling Using MODELLER. ( 0,657783477577514 )
Comput. Biol. Med. - Intron identification approaches based on weighted features and fuzzy decision trees. ( 0,657141300571067 )
J. Comput. Biol. - Separating significant matches from spurious matches in DNA sequences. ( 0,656565517253762 )
Comput Math Methods Med - ADLD: a novel graphical representation of protein sequences and its application. ( 0,652292591929132 )
J. Comput. Biol. - AREM: aligning short reads from ChIP-sequencing by expectation maximization. ( 0,652268767805531 )
J Chem Inf Model - Searching for likeness in a database of macromolecular complexes. ( 0,652180531670007 )
J. Comput. Biol. - A novel technique for detecting putative horizontal gene transfer in the sequence space. ( 0,651572952623719 )
J. Comput. Biol. - Accurate estimations of evolutionary times in the context of strong CpG hypermutability. ( 0,64948129443884 )
Comput. Biol. Med. - Signal peptide discrimination and cleavage site identification using SVM and NN. ( 0,64822991307253 )