J. Comput. Biol. - Paired de Bruijn graphs: a novel approach for incorporating mate pair information into genome assemblers.

Topics

{ structur(1116) can(940) graph(676) }
{ sequenc(1873) structur(1644) protein(1328) }
{ measur(2081) correl(1212) valu(896) }
{ method(1219) similar(1157) match(930) }
{ first(2504) two(1366) second(1323) }
{ framework(1458) process(801) describ(734) }
{ search(2224) databas(1162) retriev(909) }
{ research(1085) discuss(1038) issu(1018) }
{ can(981) present(881) function(850) }
{ take(945) account(800) differ(722) }
{ clinic(1479) use(1117) guidelin(835) }
{ drug(1928) target(777) effect(648) }
{ data(1714) softwar(1251) tool(1186) }
{ patient(2837) hospit(1953) medic(668) }
{ detect(2391) sensit(1101) algorithm(908) }
{ can(774) often(719) complex(702) }
{ bind(1733) structur(1185) ligand(1036) }
{ imag(2830) propos(1344) filter(1198) }
{ error(1145) method(1030) estim(1020) }
{ extract(1171) text(1153) clinic(932) }
{ care(1570) inform(1187) nurs(1089) }
{ method(984) reconstruct(947) comput(926) }
{ howev(809) still(633) remain(590) }
{ spatial(1525) area(1432) region(1030) }
{ state(1844) use(1261) util(961) }
{ age(1611) year(1155) adult(843) }
{ medic(1828) order(1363) alert(1069) }
{ sampl(1606) size(1419) use(1276) }
{ time(1939) patient(1703) rate(768) }
{ use(2086) technolog(871) perceiv(783) }
{ analysi(2126) use(1163) compon(1037) }
{ result(1111) use(1088) new(759) }
{ implement(1333) system(1263) develop(1122) }
{ survey(1388) particip(1329) question(1065) }
{ estim(2440) model(1874) function(577) }
{ activ(1452) weight(1219) physic(1104) }
{ method(2212) result(1239) propos(1039) }
{ model(3404) distribut(989) bayesian(671) }
{ imag(1947) propos(1133) code(1026) }
{ data(1737) use(1416) pattern(1282) }
{ inform(2794) health(2639) internet(1427) }
{ system(1976) rule(880) can(841) }
{ imag(1057) registr(996) error(939) }
{ featur(3375) classif(2383) classifi(1994) }
{ network(2748) neural(1063) input(814) }
{ imag(2675) segment(2577) method(1081) }
{ patient(2315) diseas(1263) diabet(1191) }
{ studi(2440) review(1878) systemat(933) }
{ motion(1329) object(1292) video(1091) }
{ assess(1506) score(1403) qualiti(1306) }
{ treatment(1704) effect(941) patient(846) }
{ surgeri(1148) surgic(1085) robot(1054) }
{ problem(2511) optim(1539) algorithm(950) }
{ chang(1828) time(1643) increas(1301) }
{ learn(2355) train(1041) set(1003) }
{ concept(1167) ontolog(924) domain(897) }
{ algorithm(1844) comput(1787) effici(935) }
{ method(1557) propos(1049) approach(1037) }
{ design(1359) user(1324) use(1319) }
{ control(1307) perform(991) simul(935) }
{ model(2220) cell(1177) simul(1124) }
{ general(901) number(790) one(736) }
{ featur(1941) imag(1645) propos(1176) }
{ case(1353) use(1143) diagnosi(1136) }
{ data(3963) clinic(1234) research(1004) }
{ studi(1410) differ(1259) use(1210) }
{ risk(3053) factor(974) diseas(938) }
{ perform(999) metric(946) measur(919) }
{ system(1050) medic(1026) inform(1018) }
{ import(1318) role(1303) understand(862) }
{ model(2341) predict(2261) use(1141) }
{ visual(1396) interact(850) tool(830) }
{ compound(1573) activ(1297) structur(1058) }
{ perform(1367) use(1326) method(1137) }
{ studi(1119) effect(1106) posit(819) }
{ blood(1257) pressur(1144) flow(957) }
{ record(1888) medic(1808) patient(1693) }
{ health(3367) inform(1360) care(1135) }
{ model(3480) simul(1196) paramet(876) }
{ monitor(1329) mobil(1314) devic(1160) }
{ ehr(2073) health(1662) electron(1139) }
{ research(1218) medic(880) student(794) }
{ model(2656) set(1616) predict(1553) }
{ data(2317) use(1299) case(1017) }
{ signal(2180) analysi(812) frequenc(800) }
{ cost(1906) reduc(1198) effect(832) }
{ group(2977) signific(1463) compar(1072) }
{ gene(2352) biolog(1181) express(1162) }
{ data(3008) multipl(1320) sourc(1022) }
{ intervent(3218) particip(2042) group(1664) }
{ activ(1138) subject(705) human(624) }
{ patient(1821) servic(1111) care(1106) }
{ health(1844) social(1437) communiti(874) }
{ high(1669) rate(1365) level(1280) }
{ cancer(2502) breast(956) screen(824) }
{ use(976) code(926) identifi(902) }
{ use(1733) differ(960) four(931) }
{ decis(3086) make(1611) patient(1517) }
{ process(1125) use(805) approach(778) }
{ method(1969) cluster(1462) data(1082) }

Abstract

The recent proliferation of next-generation sequencing with short reads has enabled many new experimental opportunities but, at the same time, has raised formidable computational challenges in genome assembly. One of the key advances that has led to an improvement in contig lengths has been mate pairs, which facilitate the assembly of repeating regions. Mate pairs have been algorithmically incorporated into most next-generation assemblers as various heuristic post-processing steps to correct the assembly graph or to link contigs into scaffolds. Such methods have allowed the identification of longer contigs than would be possible with single reads; however, they can still fail to resolve complex repeats. Thus, improved methods for incorporating mate pairs will have a strong effect on contig length in the future. Here, we introduce the paired de Bruijn graph, a generalization of the de Bruijn graph that incorporates mate pair information into the graph structure itself instead of analyzing mate pairs in a post-processing step. This graph has the potential to be used in place of the de Bruijn graph in any de Bruijn graph-based assembler, maintaining all other assembly steps such as error correction and repeat resolution. Through assembly results on simulated perfect data, we argue that this can effectively improve the contig sizes in assembly.
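
The idea of moving mate pair information into the graph structure can be illustrated with a toy sketch. The Python below is not the authors' implementation. It assumes error-free data, a single linear genome string, and an exactly known offset `d` between the start positions of the two k-mers in a pair. The function names, the example genome, and the parameter values are all hypothetical and chosen only for illustration. In the ordinary de Bruijn graph a repeated (k-1)-mer becomes a single vertex that can branch. In the paired variant each vertex is a pair of (k-1)-mers, so two copies of a repeat stay separate whenever their mates differ.

```python
from collections import defaultdict


def de_bruijn_edges(genome, k):
    """One edge per k-mer occurrence: (k-1)-prefix -> (k-1)-suffix."""
    return [
        (genome[i:i + k - 1], genome[i + 1:i + k])
        for i in range(len(genome) - k + 1)
    ]


def paired_de_bruijn_edges(genome, k, d):
    """One edge per pair of k-mers whose start positions are exactly d apart.

    Assumption for this sketch: d is known exactly. Each vertex is a pair of
    (k-1)-mers, and an edge joins the prefix pair to the suffix pair.
    """
    edges = []
    for i in range(len(genome) - k - d + 1):
        a = genome[i:i + k]          # left k-mer of the pair
        b = genome[i + d:i + d + k]  # right k-mer, d positions downstream
        edges.append(((a[:-1], b[:-1]), (a[1:], b[1:])))
    return edges


def branching_vertices(edges):
    """Vertices with more than one distinct successor."""
    successors = defaultdict(set)
    for u, v in edges:
        successors[u].add(v)
    return {u for u, vs in successors.items() if len(vs) > 1}


# Hypothetical toy genome in which "ACG" occurs twice with different continuations.
genome = "CACGTTACGGATCT"
single = branching_vertices(de_bruijn_edges(genome, k=3))
paired = branching_vertices(paired_de_bruijn_edges(genome, k=3, d=4))
print("branching vertices, de Bruijn graph:", single)
print("branching vertices, paired graph:   ", paired)
```

On this toy input the repeated 2-mer `CG` is the only branching vertex of the ordinary graph, while the paired graph has none, because the two copies of `CG` are paired with different downstream 2-mers. Real mate pair data has variable insert sizes and sequencing errors, which this sketch deliberately ignores.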

Cleaned Abstract

recent prolifer next generat sequenc short read enabl mani new experiment opportun time rais formid comput challeng genom assembl one key advanc led improv contig length mate pair facilit assembl repeat region mate pair algorithm incorpor next generat assembl various heurist postprocess step correct assembl graph link contig scaffold method allow identif longer contig possibl singl read howev can still fail resolv complex repeat thus improv method incorpor mate pair will strong effect contig length futur introduc pair de bruijn graph general de bruijn graph incorpor mate pair inform graph structur instead analyz mate pair postprocess step graph potenti use place de bruijn graph de bruijn graph base assembl maintain assembl step errorcorrect repeat resolut assembl result simul perfect data argu can effect improv contig size assembl

Similar Abstracts

J. Comput. Biol. - Pathset graphs: a novel approach for comprehensive utilization of paired reads in genome assembly. ( 0,744214313927746 )
IEEE Trans Vis Comput Graph - Output-Sensitive Construction of Reeb Graphs. ( 0,715386388978111 )
Comput Biol Chem - On topological indices for small RNA graphs. ( 0,708562297471901 )
IEEE Trans Pattern Anal Mach Intell - The Sum-over-Forests Density Index: Identifying Dense Regions in a Graph. ( 0,676218381980587 )
J. Comput. Biol. - Random matrix approach to the distribution of genomic distance. ( 0,67068086638736 )
Comput. Biol. Med. - A protein mapping method based on physicochemical properties and dimension reduction. ( 0,669146452979281 )
Brief. Bioinformatics - Computational methods for Gene Orthology inference. ( 0,661932117991996 )
Comput Biol Chem - Exploring the limits of fold discrimination by structural alignment: a large scale benchmark using decoys of known fold. ( 0,659932772772986 )
IEEE Trans Image Process - Complex object correspondence construction in two-dimensional animation. ( 0,659918337200177 )
J. Comput. Biol. - Parallel continuous flow: a parallel suffix tree construction tool for whole genomes. ( 0,652268767805531 )
J. Comput. Biol. - Simultaneous folding of alternative RNA structures with mutual constraints: an application to next-generation sequencing-based RNA structure probing. ( 0,64950707119207 )
J. Comput. Biol. - Detection of structural variants involving repetitive regions in the reference genome. ( 0,643143271289328 )
AMIA Annu Symp Proc - Synergism between the mapping projects from SNOMED CT to ICD-10 and ICD-10-CM. ( 0,634544378346853 )
J. Comput. Biol. - Smoothing 3D protein structure motifs through graph mining and amino acid similarities. ( 0,634507580148997 )
J. Comput. Biol. - A theoretical model for whole genome alignment. ( 0,633040502405956 )
J Chem Inf Model - Beyond terrestrial biology: charting the chemical universe of α-amino acid structures. ( 0,632468694890625 )
J. Comput. Biol. - Counting RNA pseudoknotted structures. ( 0,626798284480592 )
J Chem Inf Model - Protein secondary structure classification revisited: processing DSSP information with PSSC. ( 0,624570730314238 )
Comput Biol Chem - Subgrouping Automata: automatic sequence subgrouping using phylogenetic tree-based optimum subgrouping algorithm. ( 0,623606384633299 )
J. Comput. Biol. - The approximability of shortest path-based graph orientations of protein-protein interaction networks. ( 0,623087003156543 )
J. Comput. Biol. - Shapes of RNA pseudoknot structures. ( 0,617370476031788 )
IEEE Trans Vis Comput Graph - Dynamic Network Visualization with Extended Massive Sequence Views. ( 0,613816947607915 )
IEEE Trans Image Process - Constrained and dimensionality-independent path openings. ( 0,6137120126644 )
Neural Comput - Intrinsic graph structure estimation using graph Laplacian. ( 0,600713409792721 )
J Biomed Inform - Tree kernel-based protein-protein interaction extraction from biomedical literature. ( 0,596464910910613 )
Comput Biol Chem - A novel feature representation method based on Chou's pseudo amino acid composition for protein structural class prediction. ( 0,594161059639615 )
IEEE Trans Pattern Anal Mach Intell - A Robust O(n) Solution to the Perspective-n-Point Problem. ( 0,592704375981354 )
IEEE Trans Image Process - Connected filtering based on multivalued component-trees. ( 0,589681590882458 )
Comput Biol Chem - A degree-distribution based hierarchical agglomerative clustering algorithm for protein complexes identification. ( 0,589672857077936 )
J. Comput. Biol. - Gene prediction based on DNA spectral analysis: a literature review. ( 0,586702884011929 )
IEEE Trans Vis Comput Graph - Graph Drawing Aesthetics — Created by Users not Algorithms. ( 0,582745087267747 )
IEEE Trans Image Process - A co-saliency model of image pairs. ( 0,581430940062479 )
IEEE Trans Image Process - Hyperspectral image representation and processing with binary partition trees. ( 0,581119493969437 )
J Chem Inf Model - Comparative analysis of threshold and tessellation methods for determining protein contacts. ( 0,58101479707807 )
J Biomed Inform - A similarity network approach for the analysis and comparison of protein sequence/structure sets. ( 0,578466581031138 )
J Chem Inf Model - Searching for likeness in a database of macromolecular complexes. ( 0,577830269067131 )
IEEE Trans Image Process - 3-D curvilinear structure detection filter via structure-ball analysis. ( 0,577534893274457 )
J. Comput. Biol. - A polynomial-time algorithm computing lower and upper bounds of the rooted subtree prune and regraft distance. ( 0,577113553202148 )
J Chem Inf Model - Stereochemically consistent reaction mapping and identification of multiple reaction mechanisms through integer linear optimization. ( 0,575982138666744 )
J. Comput. Biol. - An unbiased adaptive sampling algorithm for the exploration of RNA mutational landscapes under evolutionary pressure. ( 0,575265851732563 )
Comput Math Methods Med - DV-curve representation of protein sequences and its application. ( 0,572104418585457 )
Comput Biol Chem - Identification of putative and potential cross-reactive chickpea (Cicer arietinum) allergens through an in silico approach. ( 0,569560084831551 )
Med Biol Eng Comput - Gaitography applied to prosthetic walking. ( 0,567907502895812 )
IEEE Trans Vis Comput Graph - The Design Space of Implicit Hierarchy Visualization: A Survey. ( 0,566931824920722 )
J Chem Inf Model - Characterization of heterocyclic rings through quantum chemical topology. ( 0,565141427899914 )
Comput Biol Chem - Tracing the evolution of the mitochondrial protein import machinery. ( 0,557799104115097 )
J. Comput. Biol. - A Bayesian sampler for optimization of protein domain hierarchies. ( 0,557402154567664 )
Brief. Bioinformatics - Ultrafast clustering algorithms for metagenomic sequence analysis. ( 0,553898758315461 )
J. Comput. Biol. - The generating function approach for Peptide identification in spectral networks. ( 0,551753658135628 )
IEEE Trans Vis Comput Graph - Link Conditions for Simplifying Meshes with Embedded Structures. ( 0,551293113433174 )
J Chem Inf Model - Time-averaged distributions of solute and solvent motions: exploring proton wires of GFP and PfM2DH. ( 0,549689457406082 )
J Chem Inf Model - LocaPep: localization of epitopes on protein surfaces using peptides from phage display libraries. ( 0,549021403430232 )
IEEE Trans Image Process - Histogram contextualization. ( 0,548730694187119 )
J. Comput. Biol. - Modeling alternative splicing variants from RNA-Seq data with isoform graphs. ( 0,548366717833784 )
Int J Comput Assist Radiol Surg - Evolutionistic or revolutionary paths? A PACS maturity model for strategic situational planning. ( 0,545756231376862 )
J. Comput. Biol. - Combinatorics of γ-structures. ( 0,544791647240669 )
Comput Methods Programs Biomed - TreeVis: a MATLAB-based tool for tree visualization. ( 0,543582832218141 )
J Chem Inf Model - Template CoMFA: the 3D-QSAR Grail? ( 0,54277407286841 )
Comput. Biol. Med. - Automating fault tolerance in high-performance computational biological jobs using multi-agent approaches. ( 0,54211872468097 )
IEEE Trans Image Process - Stereo matching and view interpolation based on image domain triangulation. ( 0,541928769578026 )
IEEE Trans Vis Comput Graph - Grouper: A Compact, Streamable Triangle Mesh Data Structure. ( 0,541273436422787 )
J Chem Inf Model - Modules identification in protein structures: the topological and geometrical solutions. ( 0,535436192177885 )
J. Comput. Biol. - Statistical significance of optical map alignments. ( 0,535015911400541 )
Brief. Bioinformatics - Structural mapping: how to study the genetic architecture of a phenotypic trait through its formation mechanism. ( 0,533095433656406 )
IEEE Trans Image Process - W-tree indexing for fast visual word generation. ( 0,531707869008026 )
J Chem Inf Model - PocketAlign a novel algorithm for aligning binding sites in protein structures. ( 0,530758992461488 )
J Integr Bioinform - Complementarity of network and sequence information in homologous proteins. ( 0,529398056098399 )
Comput. Biol. Med. - Hyperbolic Dirac Nets for medical decision support. Theory, methods, and comparison with Bayes Nets. ( 0,527321172798587 )
IEEE Trans Pattern Anal Mach Intell - Building Development Monitoring in Multitemporal Remotely Sensed Image Pairs with Stochastic Birth-Death Dynamics. ( 0,524207424242191 )
IEEE Trans Vis Comput Graph - Visual Analysis of Large Graphs Using (X,Y)-clustering and Hybrid Visualizations. ( 0,522656518591916 )
IEEE Trans Image Process - Topology preserving warping of 3-D binary images according to continuous one-to-one mappings. ( 0,522569687013091 )
J. Comput. Biol. - Sequence alignment of viral channel proteins with cellular ion channels. ( 0,521452149497489 )
J Chem Inf Model - MetalS2: a tool for the structural alignment of minimal functional sites in metal-binding proteins and nucleic acids. ( 0,520966165494192 )
IEEE Trans Vis Comput Graph - Flow Visualization with Quantified Spatial and Temporal Errors Using Edge Maps. ( 0,519019950900789 )
IEEE Trans Vis Comput Graph - Image-Based Modeling of Unwrappable Façades. ( 0,516802406019298 )
IEEE Trans Pattern Anal Mach Intell - Free Energy Score Spaces: Using Generative Information in Discriminative Classifiers. ( 0,516764548010098 )
J Chem Inf Model - Mapping monomeric threading to protein-protein structure prediction. ( 0,513557086376745 )
J Med Syst - Manual refinement system for graph-based segmentation results in the medical domain. ( 0,510480103803674 )
Comput Biol Chem - Understanding the general packing rearrangements required for successful template based modeling of protein structure from a CASP experiment. ( 0,509754177015444 )
BMC Med Inform Decis Mak - Efficient protein structure search using indexing methods. ( 0,508411779185669 )
Comput. Biol. Med. - Modeling and prediction of peptide drift times in ion mobility spectrometry using sequence-based and structure-based approaches. ( 0,507110763116786 )
Comput Biol Chem - Large replication skew domains delimit GC-poor gene deserts in human. ( 0,506844890712722 )
IEEE Trans Pattern Anal Mach Intell - Trinary-Projection Trees for Approximate Nearest Neighbor Search. ( 0,505745691088482 )
J Chem Inf Model - sc-PDB-Frag: a database of protein-ligand interaction patterns for Bioisosteric replacements. ( 0,50549324816903 )
Brief. Bioinformatics - BamView: visualizing and interpretation of next-generation sequencing read alignments. ( 0,505357403646176 )
J Chem Inf Model - Addressing challenges of identifying geometrically diverse sets of crystalline porous materials. ( 0,504168540842553 )
Neural Comput - Temporal order detection and coding in nervous systems. ( 0,50352711383627 )
Comput Biol Chem - Predicting protein-protein interactions using graph invariants and a neural network. ( 0,502882518850277 )
J Chem Inf Model - Atom environment kernels on molecules. ( 0,502004701198021 )
IEEE Trans Image Process - One-dimensional mapping for estimating projective transformations. ( 0,500472158373193 )
Comput Math Methods Med - Coarse-grained simulation of myosin-V movement. ( 0,499230176602706 )
IEEE Trans Image Process - The Roadmaker's algorithm for the discrete pulse transform. ( 0,499205280619144 )
Comput Math Methods Med - How the statistical validation of functional connectivity patterns can prevent erroneous definition of small-world properties of a brain connectivity network. ( 0,498798736378135 )
Comput Biol Chem - Heuristic energy landscape paving for protein folding problem in the three-dimensional HP lattice model. ( 0,498346548678365 )
Neural Comput - Parametric inference in the large data limit using maximally informative models. ( 0,497864644222468 )
J. Comput. Biol. - Optimization of profile-to-profile alignment parameters for one-dimensional threading. ( 0,496874579490586 )
IEEE Trans Image Process - Toward a unified color space for perception-based image processing. ( 0,496847638538834 )
J Biomed Inform - Decision support from local data: creating adaptive order menus from past clinician behavior. ( 0,496514766071778 )
J Chem Inf Model - Graph mining for SAR transfer series. ( 0,496346283230253 )
Comput. Biol. Med. - Application of 2D graphic representation of protein sequence based on Huffman tree method. ( 0,496326960783745 )