Brief. Bioinformatics - Detecting coevolving positions in a molecule: why and how to account for phylogeny.

Topics

{ sequenc(1873) structur(1644) protein(1328) }
{ model(2656) set(1616) predict(1553) }
{ research(1085) discuss(1038) issu(1018) }
{ import(1318) role(1303) understand(862) }
{ take(945) account(800) differ(722) }
{ featur(1941) imag(1645) propos(1176) }
{ measur(2081) correl(1212) valu(896) }
{ howev(809) still(633) remain(590) }
{ can(774) often(719) complex(702) }
{ featur(3375) classif(2383) classifi(1994) }
{ perform(999) metric(946) measur(919) }
{ use(2086) technolog(871) perceiv(783) }
{ health(1844) social(1437) communiti(874) }
{ compound(1573) activ(1297) structur(1058) }
{ survey(1388) particip(1329) question(1065) }
{ detect(2391) sensit(1101) algorithm(908) }
{ method(1219) similar(1157) match(930) }
{ network(2748) neural(1063) input(814) }
{ imag(2675) segment(2577) method(1081) }
{ motion(1329) object(1292) video(1091) }
{ concept(1167) ontolog(924) domain(897) }
{ data(3008) multipl(1320) sourc(1022) }
{ analysi(2126) use(1163) compon(1037) }
{ data(1737) use(1416) pattern(1282) }
{ imag(1057) registr(996) error(939) }
{ bind(1733) structur(1185) ligand(1036) }
{ imag(2830) propos(1344) filter(1198) }
{ design(1359) user(1324) use(1319) }
{ visual(1396) interact(850) tool(830) }
{ signal(2180) analysi(812) frequenc(800) }
{ cost(1906) reduc(1198) effect(832) }
{ high(1669) rate(1365) level(1280) }
{ decis(3086) make(1611) patient(1517) }
{ process(1125) use(805) approach(778) }
{ method(1969) cluster(1462) data(1082) }
{ model(3404) distribut(989) bayesian(671) }
{ imag(1947) propos(1133) code(1026) }
{ inform(2794) health(2639) internet(1427) }
{ system(1976) rule(880) can(841) }
{ patient(2315) diseas(1263) diabet(1191) }
{ studi(2440) review(1878) systemat(933) }
{ assess(1506) score(1403) qualiti(1306) }
{ treatment(1704) effect(941) patient(846) }
{ surgeri(1148) surgic(1085) robot(1054) }
{ framework(1458) process(801) describ(734) }
{ problem(2511) optim(1539) algorithm(950) }
{ error(1145) method(1030) estim(1020) }
{ chang(1828) time(1643) increas(1301) }
{ learn(2355) train(1041) set(1003) }
{ clinic(1479) use(1117) guidelin(835) }
{ algorithm(1844) comput(1787) effici(935) }
{ extract(1171) text(1153) clinic(932) }
{ method(1557) propos(1049) approach(1037) }
{ data(1714) softwar(1251) tool(1186) }
{ control(1307) perform(991) simul(935) }
{ model(2220) cell(1177) simul(1124) }
{ care(1570) inform(1187) nurs(1089) }
{ general(901) number(790) one(736) }
{ method(984) reconstruct(947) comput(926) }
{ search(2224) databas(1162) retriev(909) }
{ case(1353) use(1143) diagnosi(1136) }
{ data(3963) clinic(1234) research(1004) }
{ studi(1410) differ(1259) use(1210) }
{ risk(3053) factor(974) diseas(938) }
{ system(1050) medic(1026) inform(1018) }
{ model(2341) predict(2261) use(1141) }
{ perform(1367) use(1326) method(1137) }
{ studi(1119) effect(1106) posit(819) }
{ blood(1257) pressur(1144) flow(957) }
{ spatial(1525) area(1432) region(1030) }
{ record(1888) medic(1808) patient(1693) }
{ health(3367) inform(1360) care(1135) }
{ model(3480) simul(1196) paramet(876) }
{ monitor(1329) mobil(1314) devic(1160) }
{ ehr(2073) health(1662) electron(1139) }
{ state(1844) use(1261) util(961) }
{ research(1218) medic(880) student(794) }
{ patient(2837) hospit(1953) medic(668) }
{ data(2317) use(1299) case(1017) }
{ age(1611) year(1155) adult(843) }
{ medic(1828) order(1363) alert(1069) }
{ group(2977) signific(1463) compar(1072) }
{ sampl(1606) size(1419) use(1276) }
{ gene(2352) biolog(1181) express(1162) }
{ first(2504) two(1366) second(1323) }
{ intervent(3218) particip(2042) group(1664) }
{ activ(1138) subject(705) human(624) }
{ time(1939) patient(1703) rate(768) }
{ patient(1821) servic(1111) care(1106) }
{ can(981) present(881) function(850) }
{ structur(1116) can(940) graph(676) }
{ cancer(2502) breast(956) screen(824) }
{ use(976) code(926) identifi(902) }
{ use(1733) differ(960) four(931) }
{ drug(1928) target(777) effect(648) }
{ result(1111) use(1088) new(759) }
{ implement(1333) system(1263) develop(1122) }
{ estim(2440) model(1874) function(577) }
{ activ(1452) weight(1219) physic(1104) }
{ method(2212) result(1239) propos(1039) }

Abstract

Positions in a molecule that share a common constraint do not evolve independently and therefore leave a signature in the patterns of homologous sequences. Detecting such coevolving positions from a sequence alignment has great potential for predicting functional and structural properties of molecules through comparative analysis. The task is complicated by additional sources of correlation, which lead to false predictions. The nature of the data is itself a major source of spurious correlation: sequences are sampled from individuals with different degrees of relatedness and are therefore intrinsically correlated. This has led to several methodological developments in different fields, which can be confusing for non-expert users interested in these methods. It also explains why coevolution detection methods remain largely underused despite the importance of the biological questions they address. In this article, I focus on the role of shared ancestry in understanding molecular coevolution patterns. I review and classify existing coevolution detection methods according to their ability to handle shared ancestry. Using a ribosomal RNA benchmark data set, for which detailed knowledge of the structure and coevolution patterns is available, I demonstrate and explain why taking the underlying evolutionary history of the sequences into account is the only way to extract the full coevolution signal from the data. I also evaluate, using rigorous statistical procedures, the best approaches for doing so, and discuss several important biological aspects to consider when performing coevolution analyses.
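
The point about shared ancestry can be illustrated with a small sketch. It assumes mutual information between two alignment columns as the correlation measure, which is one common tree-unaware statistic and not necessarily one of the methods evaluated in the article. The function name `column_mutual_information` and the toy columns are hypothetical and chosen only for illustration.

```python
from collections import Counter
from math import log2


def column_mutual_information(col_i, col_j):
    """Mutual information (in bits) between two alignment columns.

    Each column is a string with one character per sequence.
    The measure ignores how the sequences are related to each other.
    """
    if len(col_i) != len(col_j):
        raise ValueError("columns must have one character per sequence")
    n = len(col_i)
    freq_i = Counter(col_i)
    freq_j = Counter(col_j)
    freq_ij = Counter(zip(col_i, col_j))
    mi = 0.0
    for (a, b), count in freq_ij.items():
        p_ab = count / n
        # p_ab / (p_a * p_b), written with raw counts
        mi += p_ab * log2(p_ab * n * n / (freq_i[a] * freq_j[b]))
    return mi


# Hypothetical case 1: sequences 1-4 form one clade and 5-8 another.
# A single substitution at each position on the branch separating the
# clades is enough to produce these columns.
clade_split = column_mutual_information("AAAAGGGG", "CCCCUUUU")

# Hypothetical case 2: same sequence order, but the paired states
# alternate within each clade, so several simultaneous changes are needed.
repeated_changes = column_mutual_information("AGAGAGAG", "UCUCUCUC")

# Hypothetical case 3: the two columns vary independently.
independent = column_mutual_information("AAGGAAGG", "CUCUCUCU")

print(clade_split, repeated_changes, independent)
```

In this toy setting the first two pairs receive the same score, although the first can be explained by one change per position inherited by all descendants, while the second requires repeated joint changes. A measure that ignores the tree cannot tell these situations apart, which is the kind of spurious correlation the abstract attributes to shared ancestry.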

Cleaned Abstract

posit molecul share common constraint evolv independ therefor leav signatur pattern homolog sequenc exhibit posit coevolut pattern sequenc align great potenti predict function structur properti molecul compar analysi task complic exist addit correl sourc lead fals predict natur data major sourc nois correl sequenc taken individu differ degre related therefor intrins correl led sever method develop differ field potenti confus nonexpert user interest methodolog also explain coevolut detect method larg unemploy despit import biolog question address articl focus role share ancestri understand molecular coevolut pattern review classifi exist coevolut detect method accord abil handl share ancestri use ribosom rna benchmark data set detail knowledg structur coevolut pattern avail demonstr explain take under evolutionari histori sequenc account way extract full coevolut signal data also evalu use rigor statist procedur best approach discuss sever import biolog aspect consid perform coevolut analys

Similar Abstracts

Comput Methods Programs Biomed - Sequence-based prediction of protein-binding sites in DNA: comparative study of two SVM models. ( 0,678734575579502 )
Comput Math Methods Med - Analyzing effects of naturally occurring missense mutations. ( 0,667837695805118 )
J. Comput. Biol. - Reconstructing the history of large-scale genomic changes: biological questions and computational challenges. ( 0,649562349298221 )
Comput Biol Chem - ProCoCoA: A quantitative approach for analyzing protein core composition. ( 0,621062807199251 )
Comput. Biol. Med. - Remote homology detection incorporating the context of physicochemical properties. ( 0,611574441852593 )
J. Comput. Biol. - Accurate mass spectrometry based protein quantification via shared peptides. ( 0,608375710431572 )
Brief. Bioinformatics - Bioinformatics tools for the structural elucidation of multi-subunit protein complexes by mass spectrometric analysis of protein-protein cross-links. ( 0,603900276929638 )
J. Comput. Biol. - Combinatorics of γ-structures. ( 0,603584408890191 )
J Chem Inf Model - Critical residue that promotes protein dimerization: a story of partially exposed Phe25 in 14-3-3σ. ( 0,602417042049012 )
Comput Biol Chem - Identification and characterization of lysine-methylated sites on histones and non-histone proteins. ( 0,602097894633366 )
Comput Biol Chem - Protein fold recognition based on functional domain composition. ( 0,591040112126604 )
Comput Methods Programs Biomed - Protein secondary structure prediction using modular reciprocal bidirectional recurrent neural networks. ( 0,588863549232262 )
J Integr Bioinform - Predicting protein distance maps according to physicochemical properties. ( 0,587772770719892 )
Comput. Biol. Med. - Improving protein complex classification accuracy using amino acid composition profile. ( 0,586427834528557 )
Comput Math Methods Med - DV-curve representation of protein sequences and its application. ( 0,585817934046282 )
Brief. Bioinformatics - Base-calling for next-generation sequencing platforms. ( 0,583600419721154 )
Brief. Bioinformatics - Systematic identification of Class I HDAC substrates. ( 0,582692622356682 )
Sci Data - A draft genome for the African crocodilian trypanosome Trypanosoma grayi. ( 0,580368281220557 )
Comput Biol Chem - Statistical analysis and exposure status classification of transmembrane beta barrel residues. ( 0,580355028131094 )
Comput Biol Chem - A novel empirical mutual information approach to identify co-evolving amino acid positions of influenza A viruses. ( 0,578160936484123 )
Comput. Biol. Med. - Improving protein secondary structure prediction using a multi-modal BP method. ( 0,576184463419775 )
J Chem Inf Model - Proteins as sponges: a statistical journey along protein structure organization principles. ( 0,574780871174521 )
J Chem Inf Model - PocketAlign a novel algorithm for aligning binding sites in protein structures. ( 0,574156669024277 )
Brief. Bioinformatics - Applications of alignment-free methods in epigenomics. ( 0,573238600988197 )
Curr Protoc Bioinformatics - Using the RNAstructure Software Package to Predict Conserved RNA Structures. ( 0,57234086794679 )
Curr Protoc Bioinformatics - An Overview of RNA Sequence Analyses: Structure Prediction, ncRNA Gene Identification, and RNAi Design. ( 0,570927529399511 )
J Integr Bioinform - Exceptional single strand DNA word symmetry: analysis of evolutionary potentialities. ( 0,569241978539178 )
Comput Biol Chem - A protein fold classifier formed by fusing different modes of pseudo amino acid composition via PSSM. ( 0,568065010993017 )
Brief. Bioinformatics - A practical guide for the computational selection of residues to be experimentally characterized in protein families. ( 0,567689159847132 )
Comput. Biol. Med. - LRRsearch: An asynchronous server-based application for the prediction of leucine-rich repeat motifs and an integrative database of NOD-like receptors. ( 0,564464414147751 )
J Integr Bioinform - A hierarchical approach to protein fold prediction. ( 0,564432639254528 )
Comput. Biol. Med. - Prediction of methylation CpGs and their methylation degrees in human DNA sequences. ( 0,563496293039462 )
Comput Biol Chem - Practical halving; the Nelumbo nucifera evidence on early eudicot evolution. ( 0,561754414704854 )
J. Comput. Biol. - Simultaneous alignment and folding of protein sequences. ( 0,560398755940508 )
Comput Biol Chem - Bacterial protein structures reveal phylum dependent divergence. ( 0,557312196447477 )
Brief. Bioinformatics - Alpha shape and Delaunay triangulation in studies of protein-related interactions. ( 0,556638319529841 )
Comput Biol Chem - Complexity measures for the evolutionary categorization of organisms. ( 0,555490721816915 )
Brief. Bioinformatics - De novo assembly of short sequence reads. ( 0,554997088487998 )
J Chem Inf Model - GRID-based three-dimensional pharmacophores II: PharmBench, a benchmark data set for evaluating pharmacophore elucidation methods. ( 0,553776394464363 )
J. Comput. Biol. - Statistical significance of normalized global alignment. ( 0,553431861591059 )
J Chem Inf Model - Structural role of uracil DNA glycosylase for the recognition of uracil in DNA duplexes. Clues from atomistic simulations. ( 0,551310866769371 )
Comput Biol Chem - Identical sequence patterns in the ends of exons and introns of human protein-coding genes. ( 0,550989534659412 )
J. Comput. Biol. - In silico prediction of Escherichia coli proteins targeting the host cell nucleus, with special reference to their role in colon cancer etiology. ( 0,546591880531519 )
Comput Biol Chem - A local average connectivity-based method for identifying essential proteins from the network level. ( 0,546478914011133 )
J Chem Inf Model - Dihedral-based segment identification and classification of biopolymers II: polynucleotides. ( 0,545905837446531 )
Comput Math Methods Med - Uses of phage display in agriculture: sequence analysis and comparative modeling of late embryogenesis abundant client proteins suggest protein-nucleic acid binding functionality. ( 0,545621885303022 )
Wiley Interdiscip Rev Syst Biol Med - From base pair to bedside: molecular simulation and the translation of genomics to personalized medicine. ( 0,545117569687635 )
Brief. Bioinformatics - Functional assignment of metagenomic data: challenges and applications. ( 0,545109839454339 )
BMC Med Inform Decis Mak - Predicting the start week of respiratory syncytial virus outbreaks using real time weather variables. ( 0,544878718879597 )
J Chem Inf Model - Dihedral-based segment identification and classification of biopolymers I: proteins. ( 0,544874388063464 )
J. Comput. Biol. - Evaluating, comparing, and interpreting protein domain hierarchies. ( 0,544743205236479 )
J Chem Inf Model - Protein secondary structure classification revisited: processing DSSP information with PSSC. ( 0,544587415700343 )
Brief. Bioinformatics - Network inference from AP-MS data: computational challenges and solutions. ( 0,54360187975439 )
Comput. Biol. Med. - An insight into the molecular basis for convergent evolution in fish antifreeze proteins. ( 0,543142861995962 )
J Chem Inf Model - Tertiary structure prediction of RNA-RNA complexes using a secondary structure and fragment-based method. ( 0,541179630770813 )
Brief. Bioinformatics - Positional orthology: putting genomic evolutionary relationships into context. ( 0,540040825254259 )
Med Biol Eng Comput - Quantitative calculation of human melatonin suppression induced by inappropriate light at night. ( 0,539908409688554 )
J. Comput. Biol. - Enhancing Gibbs sampling method for motif finding in DNA with initial graph representation of sequences. ( 0,539301496040664 )
Sci Data - A repository of assays to quantify 10,000 human proteins by SWATH-MS. ( 0,538848442290377 )
J. Comput. Biol. - A novel technique for detecting putative horizontal gene transfer in the sequence space. ( 0,538768748665407 )
Comput. Biol. Med. - ProClusEnsem: predicting membrane protein types by fusing different modes of pseudo amino acid composition. ( 0,537885468663468 )
J. Comput. Biol. - ComB: SNP calling and mapping analysis for color and nucleotide space platforms. ( 0,537405813010239 )
Med Biol Eng Comput - Characterization and prediction of mRNA polyadenylation sites in human genes. ( 0,537201956352948 )
Comput Biol Chem - Analysis of sequence repeats of proteins in the PDB. ( 0,53694859057487 )
Brief. Bioinformatics - A survey of sequence alignment algorithms for next-generation sequencing. ( 0,535221054130121 )
Comput Biol Chem - Comparison of linear gap penalties and profile-based variable gap penalties in profile-profile alignments. ( 0,534462423140635 )
Comput. Biol. Med. - MitProt-Pred: Predicting mitochondrial proteins of Plasmodium falciparum parasite using diverse physiochemical properties and ensemble classification. ( 0,53372836376427 )
Brief. Bioinformatics - DRISEE overestimates errors in metagenomic sequencing data. ( 0,53319126384827 )
Comput Biol Chem - Analysis of compensatory substitution and gene evolution on the MAGEA/CSAG-palindrome of the primate X chromosomes. ( 0,532992097320102 )
Comput Biol Chem - Inferring biological basis about psychrophilicity by interpreting the rules generated from the correctly classified input instances by a classifier. ( 0,531959577900858 )
Brief. Bioinformatics - The what, where, how and why of gene ontology--a primer for bioinformaticians. ( 0,531375442274481 )
J. Comput. Biol. - An automaton approach for waiting times in DNA evolution. ( 0,531171141886912 )
J. Comput. Biol. - Mapping reads on a genomic sequence: an algorithmic overview and a practical comparative analysis. ( 0,530316207661022 )
Brief. Bioinformatics - Protein inference: a review. ( 0,530293140846647 )
Comput Biol Chem - Tracing the evolution of the mitochondrial protein import machinery. ( 0,5300216185695 )
Brief. Bioinformatics - Computational challenges of sequence classification in microbiomic data. ( 0,529784745008334 )
Comput. Biol. Med. - Structural alphabet motif discovery and a structural motif database. ( 0,529392540210887 )
Comput Biol Chem - Characterizing regions in the human genome unmappable by next-generation-sequencing at the read length of 1000 bases. ( 0,528015471463114 )
IEEE Trans Image Process - Pattern masking estimation in image with structural uncertainty. ( 0,527431510866348 )
BMC Med Inform Decis Mak - Efficient protein structure search using indexing methods. ( 0,527340819928759 )
J Chem Inf Model - Modules identification in protein structures: the topological and geometrical solutions. ( 0,527023184178241 )
Comput Biol Chem - Systematic analysis of an amidase domain CHAP in 12 Staphylococcus aureus genomes and 44 staphylococcal phage genomes. ( 0,525053052589051 )
Sci Data - Genomes of diverse isolates of the marine cyanobacterium Prochlorococcus. ( 0,524842322414792 )
J Chem Inf Model - Context-based features enhance protein secondary structure prediction accuracy. ( 0,522524695562583 )
J Chem Inf Model - Improved helix and kink characterization in membrane proteins allows evaluation of kink sequence predictors. ( 0,521812321026492 )
J. Comput. Biol. - Ray: simultaneous assembly of reads from a mix of high-throughput sequencing technologies. ( 0,521555873010125 )
Sci Data - Long-read, whole-genome shotgun sequence data for five model organisms. ( 0,52151480846187 )
Comput Biol Chem - Computational insight into nitration of human myoglobin. ( 0,520471611170123 )
Comput. Biol. Med. - Quantification of contributions of molecular fragments for eye irritation of organic chemicals using QSAR study. ( 0,520042155591261 )
Comput Biol Chem - PPM-Dom: a novel method for domain position prediction. ( 0,519934118122563 )
J Chem Inf Model - ProBiS-database: precalculated binding site similarities and local pairwise alignments of PDB structures. ( 0,519870655910679 )
Comput Biol Chem - A novel feature representation method based on Chou's pseudo amino acid composition for protein structural class prediction. ( 0,51913843657789 )
Comput Biol Chem - Direct correlation analysis improves fold recognition. ( 0,51913843657789 )
J. Comput. Biol. - The complexity of the Dirichlet model for multiple alignment data. ( 0,517888192691858 )
J Biomed Inform - MysiRNA: improving siRNA efficacy prediction using a machine-learning model combining multi-tools and whole stacking energy (ΔG). ( 0,517736666040158 )
J Integr Bioinform - Probabilistic latent semantic analysis applied to whole bacterial genomes identifies common genomic features. ( 0,517138382957913 )
Brief. Bioinformatics - The challenges of delivering bioinformatics training in the analysis of high-throughput data. ( 0,516542877493752 )
Comput Math Methods Med - Quad-PRE: a hybrid method to predict protein quaternary structure attributes. ( 0,516107393881452 )
Brief. Bioinformatics - New developments of alignment-free sequence comparison: measures, statistics and next-generation sequencing. ( 0,51598183276594 )
Comput Math Methods Med - Identification of DNA-binding proteins using support vector machine with sequence information. ( 0,515915066048788 )