Brief. Bioinformatics - Classification of metagenomic sequences: methods and challenges.

Topics

{ sequenc(1873) structur(1644) protein(1328) }
{ algorithm(1844) comput(1787) effici(935) }
{ research(1085) discuss(1038) issu(1018) }
{ process(1125) use(805) approach(778) }
{ control(1307) perform(991) simul(935) }
{ howev(809) still(633) remain(590) }
{ method(984) reconstruct(947) comput(926) }
{ search(2224) databas(1162) retriev(909) }
{ result(1111) use(1088) new(759) }
{ data(1737) use(1416) pattern(1282) }
{ imag(2675) segment(2577) method(1081) }
{ assess(1506) score(1403) qualiti(1306) }
{ framework(1458) process(801) describ(734) }
{ concept(1167) ontolog(924) domain(897) }
{ extract(1171) text(1153) clinic(932) }
{ data(1714) softwar(1251) tool(1186) }
{ health(3367) inform(1360) care(1135) }
{ data(2317) use(1299) case(1017) }
{ data(3008) multipl(1320) sourc(1022) }
{ use(1733) differ(960) four(931) }
{ implement(1333) system(1263) develop(1122) }
{ can(774) often(719) complex(702) }
{ system(1976) rule(880) can(841) }
{ measur(2081) correl(1212) valu(896) }
{ imag(1057) registr(996) error(939) }
{ learn(2355) train(1041) set(1003) }
{ care(1570) inform(1187) nurs(1089) }
{ data(3963) clinic(1234) research(1004) }
{ blood(1257) pressur(1144) flow(957) }
{ spatial(1525) area(1432) region(1030) }
{ state(1844) use(1261) util(961) }
{ patient(1821) servic(1111) care(1106) }
{ use(2086) technolog(871) perceiv(783) }
{ model(3404) distribut(989) bayesian(671) }
{ imag(1947) propos(1133) code(1026) }
{ inform(2794) health(2639) internet(1427) }
{ bind(1733) structur(1185) ligand(1036) }
{ method(1219) similar(1157) match(930) }
{ featur(3375) classif(2383) classifi(1994) }
{ imag(2830) propos(1344) filter(1198) }
{ network(2748) neural(1063) input(814) }
{ patient(2315) diseas(1263) diabet(1191) }
{ take(945) account(800) differ(722) }
{ studi(2440) review(1878) systemat(933) }
{ motion(1329) object(1292) video(1091) }
{ treatment(1704) effect(941) patient(846) }
{ surgeri(1148) surgic(1085) robot(1054) }
{ problem(2511) optim(1539) algorithm(950) }
{ error(1145) method(1030) estim(1020) }
{ chang(1828) time(1643) increas(1301) }
{ clinic(1479) use(1117) guidelin(835) }
{ method(1557) propos(1049) approach(1037) }
{ design(1359) user(1324) use(1319) }
{ model(2220) cell(1177) simul(1124) }
{ general(901) number(790) one(736) }
{ featur(1941) imag(1645) propos(1176) }
{ case(1353) use(1143) diagnosi(1136) }
{ studi(1410) differ(1259) use(1210) }
{ risk(3053) factor(974) diseas(938) }
{ perform(999) metric(946) measur(919) }
{ system(1050) medic(1026) inform(1018) }
{ import(1318) role(1303) understand(862) }
{ model(2341) predict(2261) use(1141) }
{ visual(1396) interact(850) tool(830) }
{ compound(1573) activ(1297) structur(1058) }
{ perform(1367) use(1326) method(1137) }
{ studi(1119) effect(1106) posit(819) }
{ record(1888) medic(1808) patient(1693) }
{ model(3480) simul(1196) paramet(876) }
{ monitor(1329) mobil(1314) devic(1160) }
{ ehr(2073) health(1662) electron(1139) }
{ research(1218) medic(880) student(794) }
{ patient(2837) hospit(1953) medic(668) }
{ model(2656) set(1616) predict(1553) }
{ age(1611) year(1155) adult(843) }
{ medic(1828) order(1363) alert(1069) }
{ signal(2180) analysi(812) frequenc(800) }
{ cost(1906) reduc(1198) effect(832) }
{ group(2977) signific(1463) compar(1072) }
{ sampl(1606) size(1419) use(1276) }
{ gene(2352) biolog(1181) express(1162) }
{ first(2504) two(1366) second(1323) }
{ intervent(3218) particip(2042) group(1664) }
{ activ(1138) subject(705) human(624) }
{ time(1939) patient(1703) rate(768) }
{ can(981) present(881) function(850) }
{ analysi(2126) use(1163) compon(1037) }
{ health(1844) social(1437) communiti(874) }
{ structur(1116) can(940) graph(676) }
{ high(1669) rate(1365) level(1280) }
{ cancer(2502) breast(956) screen(824) }
{ use(976) code(926) identifi(902) }
{ drug(1928) target(777) effect(648) }
{ survey(1388) particip(1329) question(1065) }
{ estim(2440) model(1874) function(577) }
{ decis(3086) make(1611) patient(1517) }
{ activ(1452) weight(1219) physic(1104) }
{ method(1969) cluster(1462) data(1082) }
{ method(2212) result(1239) propos(1039) }
{ detect(2391) sensit(1101) algorithm(908) }

Abstract

Characterizing the taxonomic diversity of microbial communities is one of the primary objectives of metagenomic studies. Taxonomic analysis of microbial communities, a process referred to as binning, is challenging for several reasons. First, query sequences originating from the genomes of most microbes in an environmental sample lack taxonomically related sequences in existing reference databases, and this absence of taxonomic context makes binning difficult. Second, limitations of current sequencing platforms, with respect to short read lengths and sequencing errors/artifacts, are key factors that determine the overall binning efficiency. Third, the sheer volume of metagenomic datasets demands highly efficient algorithms that can operate within reasonable requirements of compute power. This review discusses the premise, methodologies, advantages, limitations and challenges of the various methods available for binning metagenomic datasets obtained using the shotgun sequencing approach. It then reviews the parameters and strategies used for evaluating binning efficiency.
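
As a rough illustration of what evaluating binning efficiency can involve, the sketch below scores a set of read assignments against known source taxa, as one could do with a simulated dataset. The function name `binning_summary`, the toy read labels, and the choice of sensitivity, precision and unassigned fraction as the reported measures are assumptions made for this example; the abstract does not say which parameters the review covers.

```python
from typing import Dict, Optional


def binning_summary(truth: Dict[str, str],
                    assigned: Dict[str, Optional[str]]) -> Dict[str, float]:
    """Summarize read-level binning results against known source taxa.

    truth    maps read id -> true taxon (known for simulated data).
    assigned maps read id -> predicted taxon, or None if left unassigned.
    """
    total = len(truth)
    correct = wrong = unassigned = 0
    for read_id, true_taxon in truth.items():
        predicted = assigned.get(read_id)
        if predicted is None:
            unassigned += 1
        elif predicted == true_taxon:
            correct += 1
        else:
            wrong += 1
    classified = correct + wrong
    return {
        # Fraction of all reads placed in the correct taxon.
        "sensitivity": correct / total if total else 0.0,
        # Fraction of classified reads that were placed correctly.
        "precision": correct / classified if classified else 0.0,
        # Fraction of reads the method declined to classify.
        "unassigned_fraction": unassigned / total if total else 0.0,
    }


# Hypothetical toy data: four simulated reads with known origins.
truth = {"r1": "Escherichia", "r2": "Escherichia",
         "r3": "Bacillus", "r4": "Bacillus"}
assigned = {"r1": "Escherichia", "r2": "Bacillus",
            "r3": "Bacillus", "r4": None}
summary = binning_summary(truth, assigned)
print(summary)
```

Reporting the unassigned fraction separately keeps a cautious method, which leaves reads unclassified when no related reference sequence exists, distinguishable from one that assigns them wrongly.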

Cleaned Abstract

character taxonom divers microbi communiti one primari object metagenom studi taxonom analysi microbi communiti process refer bin challeng follow reason primarili queri sequenc origin genom microb environment sampl lack taxonom relat sequenc exist refer databas absenc taxonom context make bin challeng task limit current sequenc platform respect short read length sequenc errorsartifact also key factor determin overal bin effici furthermor sheer volum metagenom dataset also demand high effici algorithm can oper within reason requir comput power review discuss premis methodolog advantag limit challeng various method avail bin metagenom dataset obtain use shotgun sequenc approach various paramet well strategi use evalu bin effici review
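
The cleaned text above looks like the abstract after lowercasing, punctuation removal, stopword filtering and stemming. The sketch below reproduces only the first three steps. The function name `clean_text` and the short stopword list are assumptions for illustration, since the page does not state which list or stemmer was used. Stemming is left out to keep the example self-contained, so words keep their full form instead of being reduced to stems such as "sequenc".

```python
import re

# Small illustrative stopword list (an assumption; the real list is unknown).
STOPWORDS = {"the", "of", "is", "a", "to", "as", "for", "in", "this",
             "are", "that", "and", "with", "most", "from"}


def clean_text(text: str) -> str:
    """Lowercase, drop punctuation, and remove stopwords (no stemming)."""
    lowered = text.lower()
    # Deleting punctuation outright (rather than replacing it with a space)
    # fuses tokens such as "errors/artifacts" into "errorsartifacts".
    letters_only = re.sub(r"[^a-z\s]", "", lowered)
    tokens = [tok for tok in letters_only.split() if tok not in STOPWORDS]
    return " ".join(tokens)


sample = "with respect to short read lengths and sequencing errors/artifacts"
cleaned = clean_text(sample)
print(cleaned)  # respect short read lengths sequencing errorsartifacts
```

Replacing punctuation with a space instead of deleting it would keep "errors" and "artifacts" as separate tokens, which is usually preferable for downstream similarity calculations.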

Similar Abstracts

J. Comput. Biol. - Simultaneous alignment and folding of protein sequences. ( 0,645315969310948 )
J. Comput. Biol. - Using structural and evolutionary information to detect and correct pyrosequencing errors in noncoding RNAs. ( 0,626020787817653 )
Brief. Bioinformatics - Alpha shape and Delaunay triangulation in studies of protein-related interactions. ( 0,622053646535605 )
Brief. Bioinformatics - Methodological aspects of whole-genome bisulfite sequencing analysis. ( 0,619878338494203 )
Brief. Bioinformatics - Survey of MapReduce frame operation in bioinformatics. ( 0,619375983354817 )
Sci Data - A draft genome for the African crocodilian trypanosome Trypanosoma grayi. ( 0,615943150714642 )
J. Comput. Biol. - LB3D: a protein three-dimensional substructure search program based on the lower bound of a root mean square deviation value. ( 0,614985253989495 )
Brief. Bioinformatics - Pattern recognition and probabilistic measures in alignment-free sequence analysis. ( 0,610442734285402 )
J Chem Inf Model - Parallel and antiparallel β-strands differ in amino acid composition and availability of short constituent sequences. ( 0,605195824136266 )
Comput. Biol. Med. - GPU-based acceleration of an RNA tertiary structure prediction algorithm. ( 0,601702189215411 )
Comput. Biol. Med. - A data parallel strategy for aligning multiple biological sequences on multi-core computers. ( 0,600966318344708 )
J. Comput. Biol. - Computing the probability of RNA hairpin and multiloop formation. ( 0,599467610940124 )
Comput Biol Chem - ProCoCoA: A quantitative approach for analyzing protein core composition. ( 0,597543731054038 )
Comput Biol Chem - Parallel molecular computation of modular-multiplication with two same inputs over finite field GF(2^n) using self-assembly of DNA tiles. ( 0,59182982560032 )
J Integr Bioinform - High performance pattern matching on heterogeneous platform. ( 0,587589616730336 )
Brief. Bioinformatics - De novo assembly of short sequence reads. ( 0,587456049782659 )
Brief. Bioinformatics - A practical guide for the computational selection of residues to be experimentally characterized in protein families. ( 0,58296089431701 )
Comput Math Methods Med - Uses of phage display in agriculture: sequence analysis and comparative modeling of late embryogenesis abundant client proteins suggest protein-nucleic acid binding functionality. ( 0,581775625731009 )
Brief. Bioinformatics - Base-calling for next-generation sequencing platforms. ( 0,577923580363256 )
Brief. Bioinformatics - A survey on prediction of specificity-determining sites in proteins. ( 0,573022455791339 )
Comput Math Methods Med - Bioinformatics resources and tools for conformational B-cell epitope prediction. ( 0,572956300329648 )
Brief. Bioinformatics - Challenges of sequencing human genomes. ( 0,572393427956187 )
Brief. Bioinformatics - DRISEE overestimates errors in metagenomic sequencing data. ( 0,571566623170988 )
J Chem Inf Model - Proteins as sponges: a statistical journey along protein structure organization principles. ( 0,571226279277847 )
Brief. Bioinformatics - Ortholog identification in the presence of domain architecture rearrangement. ( 0,570220123180479 )
Med Biol Eng Comput - Enhanced spatio-temporal alignment of plantar pressure image sequences using B-splines. ( 0,568431674313022 )
Sci Data - Genomes of diverse isolates of the marine cyanobacterium Prochlorococcus. ( 0,565947236103773 )
Comput Biol Chem - The challenge of annotating protein sequences: The tale of eight domains of unknown function in Pfam. ( 0,565594643392381 )
J Integr Bioinform - Rapid development of Proteomic applications with the AIBench framework. ( 0,56558540715407 )
J. Comput. Biol. - A geometric arrangement algorithm for structure determination of symmetric protein homo-oligomers from NOEs and RDCs. ( 0,564571440053427 )
J. Comput. Biol. - Ray: simultaneous assembly of reads from a mix of high-throughput sequencing technologies. ( 0,564206229834341 )
Brief. Bioinformatics - Automated glycopeptide analysis--review of current state and future directions. ( 0,564086089034944 )
Comput Biol Chem - Protein folding simulations of 2D HP model by the genetic algorithm based on optimal secondary structures. ( 0,563408568664375 )
Comput. Biol. Med. - Accelerating in silico research with workflows: a lesson in Simplicity. ( 0,562667978879664 )
BMC Med Inform Decis Mak - Efficient protein structure search using indexing methods. ( 0,561127686008834 )
J. Comput. Biol. - Efficient traversal of beta-sheet protein folding pathways using ensemble models. ( 0,560587973390019 )
Comput Biol Chem - Automated prediction of three-way junction topological families in RNA secondary structures. ( 0,560018632309822 )
J Integr Bioinform - Exceptional single strand DNA word symmetry: analysis of evolutionary potentialities. ( 0,558737422786201 )
Brief. Bioinformatics - LC-MS alignment in theory and practice: a comprehensive algorithmic review. ( 0,557285276700195 )
J Biomed Inform - Inferring characteristic phenotypes via class association rule mining in the bone dysplasia domain. ( 0,556420437565735 )
J Chem Inf Model - Protein structural statistics with PSS. ( 0,556141376452211 )
J. Comput. Biol. - Multiplex de novo sequencing of peptide antibiotics. ( 0,555488386010693 )
Brief. Bioinformatics - A survey of sequence alignment algorithms for next-generation sequencing. ( 0,554523756491945 )
Brief. Bioinformatics - Phylogenetic-based propagation of functional annotations within the Gene Ontology consortium. ( 0,55340947081633 )
J Chem Inf Model - Self-contained sequence representation: bridging the gap between bioinformatics and cheminformatics. ( 0,553323777472998 )
J Chem Inf Model - ProBiS-database: precalculated binding site similarities and local pairwise alignments of PDB structures. ( 0,552966782383591 )
Comput Biol Chem - Comparison of linear gap penalties and profile-based variable gap penalties in profile-profile alignments. ( 0,551444423890095 )
J Chem Inf Model - Improved helix and kink characterization in membrane proteins allows evaluation of kink sequence predictors. ( 0,548510831849326 )
J. Comput. Biol. - Optimization of profile-to-profile alignment parameters for one-dimensional threading. ( 0,546767683796984 )
IEEE Trans Image Process - Masked object registration in the Fourier domain. ( 0,545781618615556 )
Curr Protoc Bioinformatics - Clustal omega. ( 0,540162142412991 )
Brief. Bioinformatics - Congruency in the prediction of pathogenic missense mutations: state-of-the-art web-based tools. ( 0,539125017428692 )
Brief. Bioinformatics - Taxonomic binning of metagenome samples generated by next-generation sequencing technologies. ( 0,538241355922632 )
Comput. Biol. Med. - ExonSuite: algorithmically optimizing alternative gene splicing for the PUF proteins. ( 0,538070608651361 )
J. Comput. Biol. - Evaluating, comparing, and interpreting protein domain hierarchies. ( 0,535967388939315 )
Comput Biol Chem - Characterizing regions in the human genome unmappable by next-generation-sequencing at the read length of 1000 bases. ( 0,535861600199893 )
J. Comput. Biol. - Combinatorics of γ-structures. ( 0,535489919392563 )
Brief. Bioinformatics - A review of statistical methods for prediction of proteolytic cleavage. ( 0,535363316693093 )
J. Comput. Biol. - Statistical significance of threading scores. ( 0,535013464666219 )
J. Comput. Biol. - ComB: SNP calling and mapping analysis for color and nucleotide space platforms. ( 0,534982850963322 )
J. Comput. Biol. - Reconstructing the history of large-scale genomic changes: biological questions and computational challenges. ( 0,534543214681331 )
J Chem Inf Model - Development of an informatics platform for therapeutic protein and peptide analytics. ( 0,533607170583976 )
Comput Biol Chem - Heuristic-based tabu search algorithm for folding two-dimensional AB off-lattice model proteins. ( 0,53323903565173 )
Curr Comput Aided Drug Des - Structure-Guided Design of Antibodies. ( 0,53323903565173 )
J. Comput. Biol. - A probabilistic model for sequence alignment with context-sensitive indels. ( 0,532379060539169 )
Brief. Bioinformatics - Benchmarking of viral haplotype reconstruction programmes: an overview of the capacities and limitations of currently available programmes. ( 0,530869259903388 )
Brief. Bioinformatics - Design and validation issues in RNA-seq experiments. ( 0,530575533847287 )
Curr Protoc Bioinformatics - Using the RNAstructure Software Package to Predict Conserved RNA Structures. ( 0,529849435398386 )
Brief. Bioinformatics - Sequence analysis by iterated maps, a review. ( 0,529547259264231 )
J Chem Inf Model - Protein secondary structure prediction with SPARROW. ( 0,528314373368894 )
Brief. Bioinformatics - Compressive biological sequence analysis and archival in the era of high-throughput sequencing technologies. ( 0,527082651681518 )
Comput Biol Chem - Statistical analysis and exposure status classification of transmembrane beta barrel residues. ( 0,52694715302133 )
J Chem Inf Model - Structural effects of pH and deacylation on surfactant protein C in an organic solvent mixture: a constant-pH MD study. ( 0,526354283272064 )
Comput Biol Chem - Bacterial protein structures reveal phylum dependent divergence. ( 0,525288454695433 )
J Chem Inf Model - Discovery of novel promising targets for anti-AIDS drug developments by computer modeling: application to the HIV-1 gp120 V3 loop. ( 0,524994462978222 )
IEEE Trans Image Process - A 124 Mpixels/s VLSI design for histogram-based joint bilateral filtering. ( 0,523078880436957 )
Comput Biol Chem - Computational insight into nitration of human myoglobin. ( 0,520162883772863 )
IEEE Trans Vis Comput Graph - Energy Conservation for the Simulation of Deformable Bodies. ( 0,519276007680298 )
Comput. Biol. Med. - An insight into the molecular basis for convergent evolution in fish antifreeze Proteins. ( 0,518661947861697 )
Comput Biol Chem - Identification and characterization of lysine-methylated sites on histones and non-histone proteins. ( 0,518591118307435 )
Comput Biol Chem - An efficient similarity search based on indexing in large DNA databases. ( 0,517944783864987 )
J. Comput. Biol. - Parallel continuous flow: a parallel suffix tree construction tool for whole genomes. ( 0,517837372036904 )
Brief. Bioinformatics - Sequencing technologies and tools for short tandem repeat variation detection. ( 0,517692664217872 )
J. Comput. Biol. - SPAdes: a new genome assembly algorithm and its applications to single-cell sequencing. ( 0,51759616207126 )
J Chem Inf Model - LocaPep: localization of epitopes on protein surfaces using peptides from phage display libraries. ( 0,516184524156096 )
Comput Math Methods Med - ADLD: a novel graphical representation of protein sequences and its application. ( 0,515767755534723 )
Brief. Bioinformatics - The challenges of delivering bioinformatics training in the analysis of high-throughput data. ( 0,515664737279895 )
IEEE Trans Pattern Anal Mach Intell - Spatiotemporal Alignment of Visual Signals on a Special Manifold. ( 0,515224363515537 )
Comput Biol Chem - The frequency of poly(G) tracts in the human genome and their use as a sensor of DNA damage. ( 0,515097131913766 )
Comput Methods Programs Biomed - Quantitative thermodynamic predication of interactions between nucleic acid and non-nucleic acid species using Microsoft excel. ( 0,51422927271958 )
Med Biol Eng Comput - The influence of alignment-free sequence representations on the semi-supervised classification of class C G protein-coupled receptors: semi-supervised classification of class C GPCRs. ( 0,51414828535429 )
Brief. Bioinformatics - The automatic annotation of bacterial genomes. ( 0,513657908279657 )
Brief. Bioinformatics - Systematic identification of Class I HDAC substrates. ( 0,513330002817789 )
Sci Data - Long-read, whole-genome shotgun sequence data for five model organisms. ( 0,513202371653872 )
Brief. Bioinformatics - Reference databases for taxonomic assignment in metagenomics. ( 0,512737195817288 )
Brief. Bioinformatics - Making the difference: integrating structural variation detection tools. ( 0,512515175143848 )
Comput Biol Chem - Investigating long range correlation in DNA sequences using significance tests of conditional mutual information. ( 0,512366599116541 )
Brief. Bioinformatics - Identify drug repurposing candidates by mining the protein data bank. ( 0,511297262179934 )
Artif Intell Med - Memetic algorithms for de novo motif-finding in biomedical sequences. ( 0,50995106297801 )
Comput Biol Chem - Genome-wide analysis and evolutionary study of sucrose non-fermenting 1-related protein kinase 2 (SnRK2) gene family members in Arabidopsis and Oryza. ( 0,509326341169053 )
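
For readers who want to work with the list above programmatically, the following is a minimal parsing sketch. It assumes each entry has the layout "journal - title ( score )" and that the number in parentheses is a similarity score written with a decimal comma; the page itself does not define the score. The names `SimilarEntry`, `ENTRY_RE` and `parse_entry` are hypothetical.

```python
import re
from typing import NamedTuple, Optional


class SimilarEntry(NamedTuple):
    journal: str
    title: str
    score: float


# Layout assumed from the entries above: "<journal> - <title> ( <score> )".
# The journal is matched lazily so it ends at the first " - " separator.
ENTRY_RE = re.compile(
    r"^(?P<journal>.+?) - (?P<title>.+) \( (?P<score>\d+[.,]\d+) \)\s*$"
)


def parse_entry(line: str) -> Optional[SimilarEntry]:
    """Parse one list entry; return None if the line does not fit the layout."""
    match = ENTRY_RE.match(line.strip())
    if match is None:
        return None
    # Accept either a decimal comma or a decimal point in the score.
    score = float(match.group("score").replace(",", "."))
    return SimilarEntry(match.group("journal"), match.group("title"), score)


entry = parse_entry(
    "Curr Protoc Bioinformatics - Clustal omega. ( 0,540162142412991 )"
)
print(entry)
```

Returning `None` for lines that do not fit the assumed layout lets a caller skip headings and blank lines without raising an error.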