J Biomed Inform - A similarity network approach for the analysis and comparison of protein sequence/structure sets.

Topics

{ sequenc(1873) structur(1644) protein(1328) }
{ network(2748) neural(1063) input(814) }
{ structur(1116) can(940) graph(676) }
{ inform(2794) health(2639) internet(1427) }
{ measur(2081) correl(1212) valu(896) }
{ framework(1458) process(801) describ(734) }
{ learn(2355) train(1041) set(1003) }
{ featur(3375) classif(2383) classifi(1994) }
{ data(3963) clinic(1234) research(1004) }
{ model(2656) set(1616) predict(1553) }
{ analysi(2126) use(1163) compon(1037) }
{ method(1969) cluster(1462) data(1082) }
{ imag(1057) registr(996) error(939) }
{ studi(2440) review(1878) systemat(933) }
{ blood(1257) pressur(1144) flow(957) }
{ state(1844) use(1261) util(961) }
{ age(1611) year(1155) adult(843) }
{ data(3008) multipl(1320) sourc(1022) }
{ use(2086) technolog(871) perceiv(783) }
{ high(1669) rate(1365) level(1280) }
{ use(1733) differ(960) four(931) }
{ system(1976) rule(880) can(841) }
{ concept(1167) ontolog(924) domain(897) }
{ clinic(1479) use(1117) guidelin(835) }
{ algorithm(1844) comput(1787) effici(935) }
{ extract(1171) text(1153) clinic(932) }
{ data(1714) softwar(1251) tool(1186) }
{ featur(1941) imag(1645) propos(1176) }
{ howev(809) still(633) remain(590) }
{ system(1050) medic(1026) inform(1018) }
{ model(2341) predict(2261) use(1141) }
{ sampl(1606) size(1419) use(1276) }
{ can(981) present(881) function(850) }
{ model(3404) distribut(989) bayesian(671) }
{ can(774) often(719) complex(702) }
{ imag(1947) propos(1133) code(1026) }
{ data(1737) use(1416) pattern(1282) }
{ bind(1733) structur(1185) ligand(1036) }
{ method(1219) similar(1157) match(930) }
{ imag(2830) propos(1344) filter(1198) }
{ imag(2675) segment(2577) method(1081) }
{ patient(2315) diseas(1263) diabet(1191) }
{ take(945) account(800) differ(722) }
{ motion(1329) object(1292) video(1091) }
{ assess(1506) score(1403) qualiti(1306) }
{ treatment(1704) effect(941) patient(846) }
{ surgeri(1148) surgic(1085) robot(1054) }
{ problem(2511) optim(1539) algorithm(950) }
{ error(1145) method(1030) estim(1020) }
{ chang(1828) time(1643) increas(1301) }
{ method(1557) propos(1049) approach(1037) }
{ design(1359) user(1324) use(1319) }
{ control(1307) perform(991) simul(935) }
{ model(2220) cell(1177) simul(1124) }
{ care(1570) inform(1187) nurs(1089) }
{ general(901) number(790) one(736) }
{ method(984) reconstruct(947) comput(926) }
{ search(2224) databas(1162) retriev(909) }
{ case(1353) use(1143) diagnosi(1136) }
{ studi(1410) differ(1259) use(1210) }
{ risk(3053) factor(974) diseas(938) }
{ perform(999) metric(946) measur(919) }
{ research(1085) discuss(1038) issu(1018) }
{ import(1318) role(1303) understand(862) }
{ visual(1396) interact(850) tool(830) }
{ compound(1573) activ(1297) structur(1058) }
{ perform(1367) use(1326) method(1137) }
{ studi(1119) effect(1106) posit(819) }
{ spatial(1525) area(1432) region(1030) }
{ record(1888) medic(1808) patient(1693) }
{ health(3367) inform(1360) care(1135) }
{ model(3480) simul(1196) paramet(876) }
{ monitor(1329) mobil(1314) devic(1160) }
{ ehr(2073) health(1662) electron(1139) }
{ research(1218) medic(880) student(794) }
{ patient(2837) hospit(1953) medic(668) }
{ data(2317) use(1299) case(1017) }
{ medic(1828) order(1363) alert(1069) }
{ signal(2180) analysi(812) frequenc(800) }
{ cost(1906) reduc(1198) effect(832) }
{ group(2977) signific(1463) compar(1072) }
{ gene(2352) biolog(1181) express(1162) }
{ first(2504) two(1366) second(1323) }
{ intervent(3218) particip(2042) group(1664) }
{ activ(1138) subject(705) human(624) }
{ time(1939) patient(1703) rate(768) }
{ patient(1821) servic(1111) care(1106) }
{ health(1844) social(1437) communiti(874) }
{ cancer(2502) breast(956) screen(824) }
{ use(976) code(926) identifi(902) }
{ drug(1928) target(777) effect(648) }
{ result(1111) use(1088) new(759) }
{ implement(1333) system(1263) develop(1122) }
{ survey(1388) particip(1329) question(1065) }
{ estim(2440) model(1874) function(577) }
{ decis(3086) make(1611) patient(1517) }
{ process(1125) use(805) approach(778) }
{ activ(1452) weight(1219) physic(1104) }
{ method(2212) result(1239) propos(1039) }
{ detect(2391) sensit(1101) algorithm(908) }

Abstract

A set of proteins is a complex system whose elements are interrelated through sequence- and structure-based similarity. Here, we applied a similarity network-based methodology to represent and analyze sets of protein sequences and structures, using a non-redundant set of 311 proteins and three different information criteria based on sequence-derived features, local sequence alignment, and structural alignment. A wide set of measurements, such as network degree, clustering coefficient, characteristic path length, and vertex centrality, was used to characterize the networks' topology. The protein similarity networks were found to be moderately or highly interconnected, and the coexistence of clusters and random edges classified their fully connected versions as Small World Networks (SWNs). The SWN architecture was able to host the continuous similarity transition among proteins and to model the protein information flow during evolution. Recently reported ancestral elements, such as the alpha/beta class and certain folds, were notably found to act as hubs in the networks. Additionally, the moderate information value of sequence-derived features when used for fold and class assignment was shown on a network basis. The methodology described here can be applied to the analysis of other complex systems that consist of interrelated elements and a certain information flow.
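
The topological measurements named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the 4x4 similarity matrix, the threshold of 0.5, and the function names `build_network`, `clustering_coefficient`, and `characteristic_path_length` are all assumptions made for illustration. The abstract does not state how edges were derived from the three similarity criteria.

```python
from collections import deque

def build_network(sim, threshold):
    """Link proteins i and j when their similarity reaches the threshold (assumed rule)."""
    n = len(sim)
    adj = {i: set() for i in range(n)}
    for i in range(n):
        for j in range(i + 1, n):
            if sim[i][j] >= threshold:
                adj[i].add(j)
                adj[j].add(i)
    return adj

def clustering_coefficient(adj, v):
    """Fraction of pairs of v's neighbours that are themselves linked."""
    nbrs = list(adj[v])
    k = len(nbrs)
    if k < 2:
        return 0.0
    links = sum(
        1
        for a in range(k)
        for b in range(a + 1, k)
        if nbrs[b] in adj[nbrs[a]]
    )
    return 2.0 * links / (k * (k - 1))

def characteristic_path_length(adj):
    """Mean shortest-path length over all reachable ordered vertex pairs (BFS)."""
    total, pairs = 0, 0
    for src in adj:
        dist = {src: 0}
        queue = deque([src])
        while queue:
            u = queue.popleft()
            for w in adj[u]:
                if w not in dist:
                    dist[w] = dist[u] + 1
                    queue.append(w)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs if pairs else float("inf")

# Hypothetical pairwise similarities for four proteins (invented values).
sim = [
    [1.0, 0.9, 0.8, 0.1],
    [0.9, 1.0, 0.7, 0.2],
    [0.8, 0.7, 1.0, 0.6],
    [0.1, 0.2, 0.6, 1.0],
]
adj = build_network(sim, threshold=0.5)
degrees = {v: len(nbrs) for v, nbrs in adj.items()}
mean_clustering = sum(clustering_coefficient(adj, v) for v in adj) / len(adj)
path_length = characteristic_path_length(adj)
print(degrees, mean_clustering, path_length)
```

A network is usually described as small-world when its mean clustering coefficient is much higher than that of a random graph with the same number of vertices and edges, while its characteristic path length stays comparably short. High-degree vertices such as vertex 2 in this toy network play the role the abstract calls hubs.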

Cleaned Abstract

set protein complex system whose element interrel concept sequenc structurebas similar appli similar networkbas methodolog represent analysi protein sequenc structur set use nonredund set protein three differ inform criteria base sequencederiv featur sequenc local align structur align wide set measur like network degre cluster coeffici characterist path length vertex central util character network topolog protein similar network found medium high interconnect exist cluster random edg classifi fulli connect version small world network swns swn architectur abl host continu similar transit among protein model protein inform flow evolut recent report ancestr element like alphabeta class certain fold remark found act hub network addit moder inform valu sequencederiv featur use fold class assign shown network basi methodolog describ can appli analysi complex system consist interrel element certain inform flow

Similar Abstracts

Comput Methods Programs Biomed - Protein secondary structure prediction using modular reciprocal bidirectional recurrent neural networks. ( 0,893825434438664 )
J Chem Inf Model - Protein secondary structure classification revisited: processing DSSP information with PSSC. ( 0,810258216215213 )
J Integr Bioinform - Complementarity of network and sequence information in homologous proteins. ( 0,80388266700398 )
J. Comput. Biol. - Evaluating, comparing, and interpreting protein domain hierarchies. ( 0,798673307524169 )
J. Comput. Biol. - Combinatorics of γ-structures. ( 0,785909256844763 )
J Chem Inf Model - Modules identification in protein structures: the topological and geometrical solutions. ( 0,780422093190293 )
Comput Biol Chem - A novel feature representation method based on Chou's pseudo amino acid composition for protein structural class prediction. ( 0,769644744431457 )
Comput. Biol. Med. - A protein mapping method based on physicochemical properties and dimension reduction. ( 0,769022650623611 )
J. Comput. Biol. - Detection of structural variants involving repetitive regions in the reference genome. ( 0,762049348892784 )
Comput Biol Chem - ProCoCoA: A quantitative approach for analyzing protein core composition. ( 0,761382912416476 )
Comput Biol Chem - Identification of putative and potential cross-reactive chickpea (Cicer arietinum) allergens through an in silico approach. ( 0,760039736587612 )
J Chem Inf Model - Protein secondary structure prediction with SPARROW. ( 0,758559554417538 )
Comput Biol Chem - A local average connectivity-based method for identifying essential proteins from the network level. ( 0,756866541049123 )
J. Comput. Biol. - Statistical significance of optical map alignments. ( 0,754366983561438 )
Comput Math Methods Med - DV-curve representation of protein sequences and its application. ( 0,753912489173513 )
Brief. Bioinformatics - De novo assembly of short sequence reads. ( 0,753881948960917 )
Comput Biol Chem - ProSTRIP: A method to find similar structural repeats in three-dimensional protein structures. ( 0,749676135822782 )
J Chem Inf Model - Comparative analysis of threshold and tessellation methods for determining protein contacts. ( 0,749273694422693 )
J. Comput. Biol. - Emergent protein folding modeled with evolved neural cellular automata using the 3D HP model. ( 0,748828339696438 )
Comput. Biol. Med. - New layers in understanding and predicting α-linolenic acid content in plants using amino acid characteristics of omega-3 fatty acid desaturase. ( 0,748590179826198 )
Comput Biol Chem - Statistical analysis and exposure status classification of transmembrane beta barrel residues. ( 0,746751582147815 )
Comput Biol Chem - Characterizing regions in the human genome unmappable by next-generation-sequencing at the read length of 1000 bases. ( 0,746525352657441 )
BMC Med Inform Decis Mak - Efficient protein structure search using indexing methods. ( 0,74558492234987 )
Comput Biol Chem - Analysis of sequence repeats of proteins in the PDB. ( 0,744377382692919 )
Comput. Biol. Med. - Improving protein secondary structure prediction using a multi-modal BP method. ( 0,74419812266032 )
J. Comput. Biol. - Parallel continuous flow: a parallel suffix tree construction tool for whole genomes. ( 0,742490021783558 )
Comput Biol Chem - The frequency of poly(G) tracts in the human genome and their use as a sensor of DNA damage. ( 0,741825506660728 )
J. Comput. Biol. - ComB: SNP calling and mapping analysis for color and nucleotide space platforms. ( 0,741547858135007 )
Comput. Biol. Med. - An insight into the molecular basis for convergent evolution in fish antifreeze Proteins. ( 0,738582445005555 )
IEEE Trans Vis Comput Graph - Dynamic Network Visualization with Extended Massive Sequence Views. ( 0,737661470697704 )
Comput Biol Chem - Bacterial protein structures reveal phylum dependent divergence. ( 0,737233167740456 )
J. Comput. Biol. - IDBA-MTP: A Hybrid Metatranscriptomic Assembler Based on Protein Information. ( 0,734728617555128 )
Brief. Bioinformatics - Phylogenetic-based propagation of functional annotations within the Gene Ontology consortium. ( 0,730525563893891 )
Brief. Bioinformatics - Taxonomic binning of metagenome samples generated by next-generation sequencing technologies. ( 0,72894460985275 )
Comput Biol Chem - Protein fold recognition based on functional domain composition. ( 0,728664497367326 )
Brief. Bioinformatics - Ultrafast clustering algorithms for metagenomic sequence analysis. ( 0,728402512835442 )
Brief. Bioinformatics - New developments of alignment-free sequence comparison: measures, statistics and next-generation sequencing. ( 0,726229474784703 )
J Chem Inf Model - Parallel and antiparallel β-strands differ in amino acid composition and availability of short constituent sequences. ( 0,725000006325812 )
J Chem Inf Model - Context-based features enhance protein secondary structure prediction accuracy. ( 0,722460805930896 )
Comput Biol Chem - The challenge of annotating protein sequences: The tale of eight domains of unknown function in Pfam. ( 0,721387637245438 )
J. Comput. Biol. - Simultaneous alignment and folding of protein sequences. ( 0,72131227643937 )
Curr Protoc Bioinformatics - Using the RNAstructure Software Package to Predict Conserved RNA Structures. ( 0,720158092870578 )
Comput Math Methods Med - Quad-PRE: a hybrid method to predict protein quaternary structure attributes. ( 0,718091333099378 )
Comput. Biol. Med. - A content and structural assessment of oxidative motifs across a diverse set of life forms. ( 0,717540548410922 )
Brief. Bioinformatics - Identifying protein complexes and functional modules--from static PPI networks to dynamic PPI networks. ( 0,715851757590055 )
J. Comput. Biol. - Nonparametric combinatorial sequence models. ( 0,715677544297623 )
J Chem Inf Model - ProBiS-database: precalculated binding site similarities and local pairwise alignments of PDB structures. ( 0,715142551969763 )
Comput Biol Chem - Human-chimpanzee alignment: ortholog exponentials and paralog power laws. ( 0,714221367006092 )
Comput. Biol. Med. - miRClassify: an advanced web server for miRNA family classification and annotation. ( 0,713510080279942 )
Comput Biol Chem - Tracing the evolution of the mitochondrial protein import machinery. ( 0,71341354210604 )
Med Biol Eng Comput - The influence of alignment-free sequence representations on the semi-supervised classification of class C G protein-coupled receptors: semi-supervised classification of class C GPCRs. ( 0,712542854091756 )
Comput Biol Chem - Subgrouping Automata: automatic sequence subgrouping using phylogenetic tree-based optimum subgrouping algorithm. ( 0,711380675751423 )
J Chem Inf Model - Tertiary structure prediction of RNA-RNA complexes using a secondary structure and fragment-based method. ( 0,709536533144172 )
J. Comput. Biol. - A theoretical model for whole genome alignment. ( 0,707849938185857 )
Comput. Biol. Med. - Application of 2D graphic representation of protein sequence based on Huffman tree method. ( 0,707518920162606 )
Comput Biol Chem - Identification and characterization of lysine-methylated sites on histones and non-histone proteins. ( 0,705918576348677 )
Comput Biol Chem - Support vector machine with a Pearson VII function kernel for discriminating halophilic and non-halophilic proteins. ( 0,704918815611203 )
J Chem Inf Model - Improved helix and kink characterization in membrane proteins allows evaluation of kink sequence predictors. ( 0,702861846035044 )
Comput Biol Chem - Semantically predicting protein functions based on protein functional connectivity. ( 0,702474305048755 )
Sci Data - Genomes of diverse isolates of the marine cyanobacterium Prochlorococcus. ( 0,699319858982761 )
Comput Biol Chem - Computational determination of the orientation of a heat repeat-like domain of DNA-PKcs. ( 0,698883408189179 )
J. Comput. Biol. - Sequence alignment of viral channel proteins with cellular ion channels. ( 0,697583042272339 )
Comput Methods Programs Biomed - Pinda: a web service for detection and analysis of intraspecies gene duplication events. ( 0,697015866663521 )
J Chem Inf Model - Building a knowledge-based statistical potential by capturing high-order inter-residue interactions and its applications in protein secondary structure assessment. ( 0,696932990903782 )
J Integr Bioinform - A hierarchical approach to protein fold prediction. ( 0,696853095623544 )
Brief. Bioinformatics - BamView: visualizing and interpretation of next-generation sequencing read alignments. ( 0,696643682915532 )
Comput Biol Chem - Computational insight into nitration of human myoglobin. ( 0,695804279499186 )
Brief. Bioinformatics - A practical guide for the computational selection of residues to be experimentally characterized in protein families. ( 0,695330188026791 )
Brief. Bioinformatics - Systematic identification of Class I HDAC substrates. ( 0,694807931789395 )
Comput. Biol. Med. - Structural alphabet motif discovery and a structural motif database. ( 0,692635644809415 )
J. Comput. Biol. - Statistical significance of threading scores. ( 0,690663785792387 )
Comput Biol Chem - Large replication skew domains delimit GC-poor gene deserts in human. ( 0,687602930929048 )
Comput. Biol. Med. - Intron identification approaches based on weighted features and fuzzy decision trees. ( 0,684786217164338 )
J Chem Inf Model - Protein structural statistics with PSS. ( 0,683516977476225 )
J Chem Inf Model - Dihedral-based segment identification and classification of biopolymers II: polynucleotides. ( 0,682284639131115 )
Comput. Biol. Med. - Signal peptide discrimination and cleavage site identification using SVM and NN. ( 0,681834095329978 )
Comput Math Methods Med - Uses of phage display in agriculture: sequence analysis and comparative modeling of late embryogenesis abundant client proteins suggest protein-nucleic acid binding functionality. ( 0,680328598321178 )
Comput Methods Programs Biomed - Discriminating protein structure classes by incorporating Pseudo Average Chemical Shift to Chou's general PseAAC and Support Vector Machine. ( 0,677763901069829 )
J Integr Bioinform - Exceptional single strand DNA word symmetry: analysis of evolutionary potentialities. ( 0,6765233851636 )
Comput. Biol. Med. - Improving protein complex classification accuracy using amino acid composition profile. ( 0,675998828646501 )
Brief. Bioinformatics - Base-calling for next-generation sequencing platforms. ( 0,674477140717102 )
J. Comput. Biol. - Efficient traversal of beta-sheet protein folding pathways using ensemble models. ( 0,674188456623999 )
Comput Methods Programs Biomed - Sequence-based prediction of protein-binding sites in DNA: comparative study of two SVM models. ( 0,673487472955266 )
J Integr Bioinform - Prediction of thioredoxin and glutaredoxin target proteins by identifying reversibly oxidized cysteinyl residues. ( 0,672247390173928 )
Comput Math Methods Med - ADLD: a novel graphical representation of protein sequences and its application. ( 0,669162512972204 )
Sci Data - Comprehensive analysis of the venom gland transcriptome of the spider Dolomedes fimbriatus. ( 0,668633927795707 )
Sci Data - A repository of assays to quantify 10,000 human proteins by SWATH-MS. ( 0,666425640509396 )
J Integr Bioinform - Predicting protein distance maps according to physicochemical properties. ( 0,661717665401929 )
J. Comput. Biol. - A novel technique for detecting putative horizontal gene transfer in the sequence space. ( 0,661684331500543 )
Curr Protoc Bioinformatics - Comparative Protein Structure Modeling Using MODELLER. ( 0,661016587431988 )
Comput Biol Chem - Multi-nucleation and vectorial folding pathways of large helix protein. ( 0,660488875544008 )
Comput. Biol. Med. - Predicting protein-binding RNA nucleotides using the feature-based removal of data redundancy and the interaction propensity of nucleotide triplets. ( 0,658257221634207 )
J. Comput. Biol. - Ray: simultaneous assembly of reads from a mix of high-throughput sequencing technologies. ( 0,658137360758383 )
Wiley Interdiscip Rev Syst Biol Med - Functional genomics of the brain: uncovering networks in the CNS using a systems approach. ( 0,655709262820326 )
J. Comput. Biol. - Computational techniques for human genome resequencing using mated gapped reads. ( 0,655694148713753 )
Comput Biol Chem - Identical sequence patterns in the ends of exons and introns of human protein-coding genes. ( 0,654497049885618 )
J. Comput. Biol. - Optimization of profile-to-profile alignment parameters for one-dimensional threading. ( 0,65232314737623 )
Comput. Biol. Med. - Haemophilus influenzae Genome Database (HIGDB): a single point web resource for Haemophilus influenzae. ( 0,648029226585619 )
J. Comput. Biol. - Enhancing Gibbs sampling method for motif finding in DNA with initial graph representation of sequences. ( 0,648007469436655 )
Brief. Bioinformatics - Ortholog identification in the presence of domain architecture rearrangement. ( 0,647907775569588 )