Brief. Bioinformatics - High-throughput DNA sequence data compression.

Topics

{ data(3008) multipl(1320) sourc(1022) }
{ research(1085) discuss(1038) issu(1018) }
{ gene(2352) biolog(1181) express(1162) }
{ imag(1947) propos(1133) code(1026) }
{ use(976) code(926) identifi(902) }
{ sequenc(1873) structur(1644) protein(1328) }
{ algorithm(1844) comput(1787) effici(935) }
{ extract(1171) text(1153) clinic(932) }
{ howev(809) still(633) remain(590) }
{ can(981) present(881) function(850) }
{ method(2212) result(1239) propos(1039) }
{ data(1737) use(1416) pattern(1282) }
{ featur(1941) imag(1645) propos(1176) }
{ sampl(1606) size(1419) use(1276) }
{ result(1111) use(1088) new(759) }
{ inform(2794) health(2639) internet(1427) }
{ bind(1733) structur(1185) ligand(1036) }
{ featur(3375) classif(2383) classifi(1994) }
{ studi(2440) review(1878) systemat(933) }
{ motion(1329) object(1292) video(1091) }
{ problem(2511) optim(1539) algorithm(950) }
{ error(1145) method(1030) estim(1020) }
{ clinic(1479) use(1117) guidelin(835) }
{ model(2220) cell(1177) simul(1124) }
{ care(1570) inform(1187) nurs(1089) }
{ data(3963) clinic(1234) research(1004) }
{ state(1844) use(1261) util(961) }
{ cost(1906) reduc(1198) effect(832) }
{ process(1125) use(805) approach(778) }
{ model(3404) distribut(989) bayesian(671) }
{ can(774) often(719) complex(702) }
{ system(1976) rule(880) can(841) }
{ measur(2081) correl(1212) valu(896) }
{ imag(1057) registr(996) error(939) }
{ method(1219) similar(1157) match(930) }
{ imag(2830) propos(1344) filter(1198) }
{ network(2748) neural(1063) input(814) }
{ imag(2675) segment(2577) method(1081) }
{ patient(2315) diseas(1263) diabet(1191) }
{ take(945) account(800) differ(722) }
{ assess(1506) score(1403) qualiti(1306) }
{ treatment(1704) effect(941) patient(846) }
{ surgeri(1148) surgic(1085) robot(1054) }
{ framework(1458) process(801) describ(734) }
{ chang(1828) time(1643) increas(1301) }
{ learn(2355) train(1041) set(1003) }
{ concept(1167) ontolog(924) domain(897) }
{ method(1557) propos(1049) approach(1037) }
{ data(1714) softwar(1251) tool(1186) }
{ design(1359) user(1324) use(1319) }
{ control(1307) perform(991) simul(935) }
{ general(901) number(790) one(736) }
{ method(984) reconstruct(947) comput(926) }
{ search(2224) databas(1162) retriev(909) }
{ case(1353) use(1143) diagnosi(1136) }
{ studi(1410) differ(1259) use(1210) }
{ risk(3053) factor(974) diseas(938) }
{ perform(999) metric(946) measur(919) }
{ system(1050) medic(1026) inform(1018) }
{ import(1318) role(1303) understand(862) }
{ model(2341) predict(2261) use(1141) }
{ visual(1396) interact(850) tool(830) }
{ compound(1573) activ(1297) structur(1058) }
{ perform(1367) use(1326) method(1137) }
{ studi(1119) effect(1106) posit(819) }
{ blood(1257) pressur(1144) flow(957) }
{ spatial(1525) area(1432) region(1030) }
{ record(1888) medic(1808) patient(1693) }
{ health(3367) inform(1360) care(1135) }
{ model(3480) simul(1196) paramet(876) }
{ monitor(1329) mobil(1314) devic(1160) }
{ ehr(2073) health(1662) electron(1139) }
{ research(1218) medic(880) student(794) }
{ patient(2837) hospit(1953) medic(668) }
{ model(2656) set(1616) predict(1553) }
{ data(2317) use(1299) case(1017) }
{ age(1611) year(1155) adult(843) }
{ medic(1828) order(1363) alert(1069) }
{ signal(2180) analysi(812) frequenc(800) }
{ group(2977) signific(1463) compar(1072) }
{ first(2504) two(1366) second(1323) }
{ intervent(3218) particip(2042) group(1664) }
{ activ(1138) subject(705) human(624) }
{ time(1939) patient(1703) rate(768) }
{ patient(1821) servic(1111) care(1106) }
{ use(2086) technolog(871) perceiv(783) }
{ analysi(2126) use(1163) compon(1037) }
{ health(1844) social(1437) communiti(874) }
{ structur(1116) can(940) graph(676) }
{ high(1669) rate(1365) level(1280) }
{ cancer(2502) breast(956) screen(824) }
{ use(1733) differ(960) four(931) }
{ drug(1928) target(777) effect(648) }
{ implement(1333) system(1263) develop(1122) }
{ survey(1388) particip(1329) question(1065) }
{ estim(2440) model(1874) function(577) }
{ decis(3086) make(1611) patient(1517) }
{ activ(1452) weight(1219) physic(1104) }
{ method(1969) cluster(1462) data(1082) }
{ detect(2391) sensit(1101) algorithm(908) }

Abstract

The exponential growth of high-throughput DNA sequence data poses great challenges for genomic data storage, retrieval and transmission. Compression is a critical tool for addressing these challenges, and many methods have been developed to reduce the storage size of genomes and sequencing data (reads, quality scores and metadata). However, genomic data are being generated faster than they can be meaningfully analyzed, leaving considerable scope for novel compression algorithms that directly facilitate data analysis, beyond data transfer and storage. In this article, we categorize and comprehensively review the existing compression methods specialized for genomic data, and present experimental results on compression ratio, memory usage, and compression and decompression time. We also present the remaining challenges and potential directions for future research.
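
As a rough illustration of what "compression ratio" means for sequence data, the sketch below packs a nucleotide string into two bits per base and compares sizes. It is not one of the methods surveyed in the article; it is only a naive baseline under the assumption that the input contains nothing but A, C, G and T (real reads also contain symbols such as N, plus quality scores and metadata). The function names `pack_2bit`, `unpack_2bit` and `compression_ratio` are hypothetical and chosen for this example.

```python
def pack_2bit(seq: str) -> bytes:
    """Pack an A/C/G/T string into bytes, four bases per byte."""
    codes = {"A": 0, "C": 1, "G": 2, "T": 3}  # assumed alphabet: no N or other symbols
    out = bytearray()
    for i in range(0, len(seq), 4):
        chunk = seq[i:i + 4]
        byte = 0
        for base in chunk:
            byte = (byte << 2) | codes[base]
        # Left-align a short final chunk so decoding stays positional.
        byte <<= 2 * (4 - len(chunk))
        out.append(byte)
    return bytes(out)


def unpack_2bit(data: bytes, length: int) -> str:
    """Invert pack_2bit; `length` is the original number of bases."""
    bases = "ACGT"
    out = []
    for byte in data:
        for shift in (6, 4, 2, 0):
            out.append(bases[(byte >> shift) & 3])
    # Drop the padding bases introduced by the final partial byte.
    return "".join(out[:length])


def compression_ratio(original_size: int, compressed_size: int) -> float:
    """Ratio of original size to compressed size (higher is better)."""
    return original_size / compressed_size


if __name__ == "__main__":
    read = "ACGTACGTAC"
    packed = pack_2bit(read)
    assert unpack_2bit(packed, len(read)) == read
    print(compression_ratio(len(read), len(packed)))
```

With one byte per base as the starting point, this fixed-width packing cannot exceed a ratio of 4, which is one way to see why specialized methods that exploit redundancy within and across genomes are of interest.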

Cleaned Abstract

exponenti growth highthroughput dna sequenc data pose great challeng genom data storag retriev transmiss compress critic tool address challeng mani method develop reduc storag size genom sequenc data read qualiti score metadata howev genom data generat faster meaning analyz leav larg scope develop novel compress algorithm direct facilit data analysi beyond data transfer storag articl categor provid comprehens review exist compress method special genom data present experiment result compress ratio memori usag time compress decompress present remain challeng potenti direct futur research

Similar Abstracts

Brief. Bioinformatics - Introduction into the analysis of high-throughput-sequencing based epigenome data. ( 0,724230830836068 )
Brief. Bioinformatics - A review of statistical methods for prediction of proteolytic cleavage. ( 0,681221329671936 )
Brief. Bioinformatics - Batch effect removal methods for microarray gene expression data integration: a survey. ( 0,674199862463242 )
J Chem Inf Model - Computational prediction of metabolism: sites, products, SAR, P450 enzyme dynamics, and mechanisms. ( 0,66902423723805 )
J Integr Bioinform - Noise tolerance of multiple classifier systems in data integration-based gene function prediction. ( 0,668171112628412 )
Wiley Interdiscip Rev Syst Biol Med - Subdiffractive microscopy: techniques, applications, and challenges. ( 0,665118369270512 )
Brief. Bioinformatics - Semantic similarity analysis of protein data: assessment with biological features and issues. ( 0,661856902597742 )
J Integr Bioinform - BioNetLink - an architecture for working with network data. ( 0,650863500759755 )
Brief. Bioinformatics - Learning transcriptional regulation on a genome scale: a theoretical analysis based on gene expression data. ( 0,617654216627675 )
Wiley Interdiscip Rev Syst Biol Med - Systems biology approaches to epidemiological studies of complex diseases. ( 0,613619349881168 )
Brief. Bioinformatics - Reconciliation of metabolites and biochemical reactions for metabolic networks. ( 0,61333151605746 )
J Integr Bioinform - Integrating phenotypic data for depression. ( 0,610306535122875 )
IEEE Trans Image Process - A novel video dataset for change detection benchmarking. ( 0,606597957318081 )
Brief. Bioinformatics - Rich annotation of DNA sequencing variants by leveraging the Ensembl Variant Effect Predictor with plugins. ( 0,602604607708341 )
Brief. Bioinformatics - Bioinformatics for personal genome interpretation. ( 0,593833537971786 )
Brief. Bioinformatics - The what, where, how and why of gene ontology--a primer for bioinformaticians. ( 0,593216788668166 )
Brief. Bioinformatics - Probabilistic graphical models for genetic association studies. ( 0,592580150879103 )
Brief. Bioinformatics - Advantages of mixing bioinformatics and visualization approaches for analyzing sRNA-mediated regulatory bacterial networks. ( 0,589216187702285 )
Brief. Bioinformatics - Survey of MapReduce frame operation in bioinformatics. ( 0,589204743731038 )
Wiley Interdiscip Rev Syst Biol Med - Genome network medicine: innovation to overcome huge challenges in cancer therapy. ( 0,587564798686131 )
Wiley Interdiscip Rev Syst Biol Med - Recent approaches to the prioritization of candidate disease genes. ( 0,583085475521705 )
Brief. Bioinformatics - From miRNA regulation to miRNA-TF co-regulation: computational approaches and challenges. ( 0,579429911966823 )
Brief. Bioinformatics - Environmental bio-monitoring with high-throughput sequencing. ( 0,57211366841457 )
Brief. Bioinformatics - Deciphering oncogenic drivers: from single genes to integrated pathways. ( 0,569636831622276 )
Brief. Bioinformatics - Bio/chemoinformatics in India: an outlook. ( 0,56774726759564 )
Wiley Interdiscip Rev Syst Biol Med - Quantitative analysis of phosphorylation-based protein signaling networks in the immune system by mass spectrometry. ( 0,567485177807867 )
Brief. Bioinformatics - The impact of HGT on phylogenomic reconstruction methods. ( 0,567452612490519 )
J Integr Bioinform - Quality controls in integrative approaches to detect errors and inconsistencies in biological databases. ( 0,567231129464659 )
Wiley Interdiscip Rev Syst Biol Med - Mechanisms controlling hematopoietic stem cell functions during normal hematopoiesis and hematological malignancies. ( 0,565213655929355 )
AMIA Annu Symp Proc - Similarity-based disease risk assessment for personal genomes: proof of concept. ( 0,563167469639598 )
Sci Data - Life history profiles for 27 strepsirrhine primate taxa generated using captive data from the Duke Lemur Center. ( 0,56214478337006 )
Brief. Bioinformatics - Visualizing time-related data in biology, a review. ( 0,554949164009859 )
Comput Biol Chem - New insights on gene regulation in archaea. ( 0,553797300295087 )
J. Comput. Biol. - Biological network querying techniques: analysis and comparison. ( 0,553051605875927 )
J Biomed Inform - Complementary ensemble clustering of biomedical data. ( 0,552783559009819 )
Comput Math Methods Med - First comprehensive in silico analysis of the functional and structural consequences of SNPs in human GalNAc-T1 gene. ( 0,552282925097073 )
Sci Data - Genomes and phenomes of a population of outbred rats and its progenitors. ( 0,551528208682229 )
IEEE J Biomed Health Inform - Integrative clustering by nonnegative matrix factorization can reveal coherent functional groups from gene profile data. ( 0,55069580921831 )
Wiley Interdiscip Rev Syst Biol Med - Genome-wide approaches in the study of microRNA biology. ( 0,548411232341605 )
J. Comput. Biol. - Bayesian blind source separation for data with network structure. ( 0,54512849106457 )
Brief. Bioinformatics - Advances in network-based metabolic pathway analysis and gene expression data integration. ( 0,543297131820843 )
Comput Biol Chem - Computational intelligence techniques in bioinformatics. ( 0,540285997734028 )
Brief. Bioinformatics - Machine learning approaches for the discovery of gene-gene interactions in disease data. ( 0,539419975738058 )
J Biomed Inform - DSGeo: software tools for cross-platform analysis of gene expression data in GEO. ( 0,538467050938834 )
IEEE Trans Image Process - Adaptive distributed source coding. ( 0,537685794312142 )
IEEE Trans Pattern Anal Mach Intell - Learning to Relate Images. ( 0,537420238566164 )
J Clin Monit Comput - Translational applications of evaluating physiologic variability in human endotoxemia. ( 0,537144850262883 )
Wiley Interdiscip Rev Syst Biol Med - Systems vaccinology: learning to compute the behavior of vaccine induced immunity. ( 0,534348368563171 )
Wiley Interdiscip Rev Syst Biol Med - Noncoding RNAs in gene regulation. ( 0,532677627224435 )
Neural Comput - Stochastic Hodgkin-Huxley equations with colored noise terms in the conductances. ( 0,532670042087382 )
Brief. Bioinformatics - Methodological aspects of whole-genome bisulfite sequencing analysis. ( 0,531894828550015 )
J Biomed Inform - Independent component analysis: mining microarray data for fundamental human gene expression modules. ( 0,531786092615258 )
J. Comput. Biol. - VERSE: a varying effect regression for splicing elements discovery. ( 0,531051861082264 )
Brief. Bioinformatics - Transcription factor and microRNA co-regulatory loops: important regulatory motifs in biological processes and diseases. ( 0,528206790486158 )
Brief. Bioinformatics - Current opportunities and challenges in microbial metagenome analysis--a bioinformatic perspective. ( 0,528134092664648 )
J Am Med Inform Assoc - Pharmacogenomics in the pocket of every patient? A prototype based on quick response codes. ( 0,526687431936495 )
Comput Biol Chem - Structural characteristics of genomic islands associated with GMP synthases as integration hotspot among sequenced microbial genomes. ( 0,525749727924586 )
Brief. Bioinformatics - Design and validation issues in RNA-seq experiments. ( 0,525176207131779 )
IEEE Trans Image Process - A weighted optimization approach to time-of-flight sensor fusion. ( 0,524501771242854 )
Wiley Interdiscip Rev Syst Biol Med - Large-scale mouse knockouts and phenotypes. ( 0,52265581894883 )
Brief. Bioinformatics - Hawkeye and AMOS: visualizing and assessing the quality of genome assemblies. ( 0,522588341507541 )
Wiley Interdiscip Rev Syst Biol Med - Using systems approaches to address challenges for clinical implementation of pharmacogenomics. ( 0,520453824438041 )
Wiley Interdiscip Rev Syst Biol Med - Whole transcriptome analysis: what are we still missing? ( 0,520367318134167 )
Brief. Bioinformatics - Functional assignment of metagenomic data: challenges and applications. ( 0,519899555754115 )
Sci Data - DNA methylation temporal profiling following peripheral versus central nervous system axotomy. ( 0,518599512673753 )
IEEE Trans Neural Netw Learn Syst - A Distributed Approach Toward Discriminative Distance Metric Learning. ( 0,517701797867413 )
Artif Intell Med - An extensive analysis of disease-gene associations using network integration and fast kernel-based gene prioritization methods. ( 0,516650111676248 )
IEEE J Biomed Health Inform - Using evolutional properties of gene networks in understanding survival prognosis of glioblastoma. ( 0,516244163632812 )
Brief. Bioinformatics - Information theory applications for biological sequence analysis. ( 0,516139774092509 )
BMC Med Inform Decis Mak - A multiscale and multiparametric approach for modeling the progression of oral cancer. ( 0,51574576418809 )
Brief. Bioinformatics - Modern bioinformatics meets traditional Chinese medicine. ( 0,515527689947041 )
Brief. Bioinformatics - Identification of aberrant pathways and network activities from high-throughput data. ( 0,515407258423737 )
Comput Biol Chem - Exploring the complexity of pathway-drug relationships using latent Dirichlet allocation. ( 0,511043728552305 )
Int J Comput Assist Radiol Surg - Enhanced visualisation for minimally invasive surgery. ( 0,510555402049137 )
J Integr Bioinform - Using surveys of Affymetrix GeneChips to study antisense expression. ( 0,510266276101379 )
IEEE Trans Image Process - Blind separation of image sources via adaptive dictionary learning. ( 0,509403788258607 )
Wiley Interdiscip Rev Syst Biol Med - The zebrafish: scalable in vivo modeling for systems biology. ( 0,50862852038246 )
Wiley Interdiscip Rev Syst Biol Med - The molecular circuitry underlying pluripotency in embryonic stem cells. ( 0,508513380930983 )
Wiley Interdiscip Rev Syst Biol Med - Bioimage informatics for understanding spatiotemporal dynamics of cellular processes. ( 0,50805024239047 )
J. Comput. Biol. - Stochastic simulation of notch signaling reveals novel factors that mediate the differentiation of neural stem cells. ( 0,507861502003371 )
J. Comput. Biol. - Computational disease gene prioritization: an appraisal. ( 0,507700456638099 )
IEEE Trans Vis Comput Graph - Bristle Maps: A Multivariate Abstraction Technique for Geovisualization. ( 0,507343871096067 )
Comput Biol Chem - Meta-analysis of microarray data: The case of imatinib resistance in chronic myelogenous leukemia. ( 0,506767144830222 )
J Biomed Inform - Harmonization and semantic annotation of data dictionaries from the Pharmacogenomics Research Network: a case study. ( 0,505530872033555 )
Comput. Biol. Med. - Accelerating in silico research with workflows: a lesson in Simplicity. ( 0,504535479436873 )
Brief. Bioinformatics - Toward microRNA-mediated gene regulatory networks in plants. ( 0,504429464330168 )
Brief. Bioinformatics - Experimental evidence validating the computational inference of functional associations from gene fusion events: a critical survey. ( 0,504328287752729 )
Methods Inf Med - Health data cooperatives - citizen empowerment. ( 0,503613745555486 )
J Am Med Inform Assoc - Utility of gene-specific algorithms for predicting pathogenicity of uncertain gene variants. ( 0,50260023358881 )
AMIA Annu Symp Proc - Graphical methods for reducing, visualizing and analyzing large data sets using hierarchical terminologies. ( 0,501888214474026 )
Wiley Interdiscip Rev Syst Biol Med - Using variability in gene expression as a tool for studying gene regulation. ( 0,500938513132494 )
Brief. Bioinformatics - A practical guide for the functional annotation of genetic variations using SNPnexus. ( 0,50088117739343 )
Brief. Bioinformatics - Knowledge representation in metabolic pathway databases. ( 0,500639213955254 )
Med Biol Eng Comput - Micro/nano-fabrication technologies for cell biology. ( 0,500406338953695 )
IEEE Trans Image Process - Orientation modulation for data hiding in clustered-dot halftone prints. ( 0,499208506670272 )
J Biomed Inform - Where we stand, where we are moving: Surveying computational techniques for identifying miRNA genes and uncovering their regulatory role. ( 0,499055135135336 )
IEEE Trans Image Process - Binned progressive quantization for compressive sensing. ( 0,4986307934182 )
Wiley Interdiscip Rev Syst Biol Med - Cardiac function and disease: emerging role of small ubiquitin-related modifier. ( 0,497814069526814 )
J Integr Bioinform - Construction of coffee transcriptome networks based on gene annotation semantics. ( 0,497436190301977 )
Artif Intell Med - Detecting disease genes based on semi-supervised learning and protein-protein interaction networks. ( 0,496641291309903 )