J. Comput. Biol. - eALPS: estimating abundance levels in pooled sequencing using available genotyping data.

Topics

{ learn(2355) train(1041) set(1003) }
{ gene(2352) biolog(1181) express(1162) }
{ sequenc(1873) structur(1644) protein(1328) }
{ data(2317) use(1299) case(1017) }
{ estim(2440) model(1874) function(577) }
{ model(3404) distribut(989) bayesian(671) }
{ howev(809) still(633) remain(590) }
{ take(945) account(800) differ(722) }
{ chang(1828) time(1643) increas(1301) }
{ method(1557) propos(1049) approach(1037) }
{ data(1714) softwar(1251) tool(1186) }
{ import(1318) role(1303) understand(862) }
{ sampl(1606) size(1419) use(1276) }
{ implement(1333) system(1263) develop(1122) }
{ patient(2315) diseas(1263) diabet(1191) }
{ extract(1171) text(1153) clinic(932) }
{ general(901) number(790) one(736) }
{ risk(3053) factor(974) diseas(938) }
{ health(3367) inform(1360) care(1135) }
{ analysi(2126) use(1163) compon(1037) }
{ result(1111) use(1088) new(759) }
{ imag(1947) propos(1133) code(1026) }
{ data(1737) use(1416) pattern(1282) }
{ system(1976) rule(880) can(841) }
{ imag(1057) registr(996) error(939) }
{ imag(2830) propos(1344) filter(1198) }
{ algorithm(1844) comput(1787) effici(935) }
{ control(1307) perform(991) simul(935) }
{ research(1085) discuss(1038) issu(1018) }
{ visual(1396) interact(850) tool(830) }
{ spatial(1525) area(1432) region(1030) }
{ cost(1906) reduc(1198) effect(832) }
{ first(2504) two(1366) second(1323) }
{ use(1733) differ(960) four(931) }
{ drug(1928) target(777) effect(648) }
{ detect(2391) sensit(1101) algorithm(908) }
{ can(774) often(719) complex(702) }
{ inform(2794) health(2639) internet(1427) }
{ measur(2081) correl(1212) valu(896) }
{ bind(1733) structur(1185) ligand(1036) }
{ method(1219) similar(1157) match(930) }
{ featur(3375) classif(2383) classifi(1994) }
{ network(2748) neural(1063) input(814) }
{ imag(2675) segment(2577) method(1081) }
{ studi(2440) review(1878) systemat(933) }
{ motion(1329) object(1292) video(1091) }
{ assess(1506) score(1403) qualiti(1306) }
{ treatment(1704) effect(941) patient(846) }
{ surgeri(1148) surgic(1085) robot(1054) }
{ framework(1458) process(801) describ(734) }
{ problem(2511) optim(1539) algorithm(950) }
{ error(1145) method(1030) estim(1020) }
{ concept(1167) ontolog(924) domain(897) }
{ clinic(1479) use(1117) guidelin(835) }
{ design(1359) user(1324) use(1319) }
{ model(2220) cell(1177) simul(1124) }
{ care(1570) inform(1187) nurs(1089) }
{ method(984) reconstruct(947) comput(926) }
{ search(2224) databas(1162) retriev(909) }
{ featur(1941) imag(1645) propos(1176) }
{ case(1353) use(1143) diagnosi(1136) }
{ data(3963) clinic(1234) research(1004) }
{ studi(1410) differ(1259) use(1210) }
{ perform(999) metric(946) measur(919) }
{ system(1050) medic(1026) inform(1018) }
{ model(2341) predict(2261) use(1141) }
{ compound(1573) activ(1297) structur(1058) }
{ perform(1367) use(1326) method(1137) }
{ studi(1119) effect(1106) posit(819) }
{ blood(1257) pressur(1144) flow(957) }
{ record(1888) medic(1808) patient(1693) }
{ model(3480) simul(1196) paramet(876) }
{ monitor(1329) mobil(1314) devic(1160) }
{ ehr(2073) health(1662) electron(1139) }
{ state(1844) use(1261) util(961) }
{ research(1218) medic(880) student(794) }
{ patient(2837) hospit(1953) medic(668) }
{ model(2656) set(1616) predict(1553) }
{ age(1611) year(1155) adult(843) }
{ medic(1828) order(1363) alert(1069) }
{ signal(2180) analysi(812) frequenc(800) }
{ group(2977) signific(1463) compar(1072) }
{ data(3008) multipl(1320) sourc(1022) }
{ intervent(3218) particip(2042) group(1664) }
{ activ(1138) subject(705) human(624) }
{ time(1939) patient(1703) rate(768) }
{ patient(1821) servic(1111) care(1106) }
{ use(2086) technolog(871) perceiv(783) }
{ can(981) present(881) function(850) }
{ health(1844) social(1437) communiti(874) }
{ structur(1116) can(940) graph(676) }
{ high(1669) rate(1365) level(1280) }
{ cancer(2502) breast(956) screen(824) }
{ use(976) code(926) identifi(902) }
{ survey(1388) particip(1329) question(1065) }
{ decis(3086) make(1611) patient(1517) }
{ process(1125) use(805) approach(778) }
{ activ(1452) weight(1219) physic(1104) }
{ method(1969) cluster(1462) data(1082) }
{ method(2212) result(1239) propos(1039) }

Abstract

Recent advances in high-throughput sequencing technologies offer the potential for a better characterization of genetic variation in humans and other organisms. On many occasions, either by design or by necessity, sequencing is performed on a pool of DNA samples with different abundances, where the abundance of each sample is unknown. Such a scenario arises naturally in metagenomic analysis, where a pool of bacteria is sequenced, and in population studies that involve DNA pools by design. In particular, various pooling designs have recently been suggested that can identify carriers of rare alleles in large cohorts, dramatically reducing the cost of such large-scale sequencing projects. A fundamental problem with these approaches in population studies is that uncertainty about the DNA proportions contributed by different individuals in the pools might lead to spurious associations. Fortunately, genotype data are often available for at least some of the individuals in the pool. Here, we propose a method (eALPS) that uses the genotype data in conjunction with the pooled sequence data to accurately estimate the proportions of the samples in the pool, even when not all individuals in the pool were genotyped (eALPS-LD). Using real data from a pooled sequencing study of non-Hodgkin's lymphoma, we demonstrate that estimating the proportions is crucial, since otherwise there is a risk of false discoveries. Additionally, we demonstrate that our approach is also applicable to the quantification of species in metagenomic samples (eALPS-BCR) and is particularly suitable for metagenomic quantification of closely related species.
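
The core idea in the abstract, combining known genotypes with pooled allele frequencies to recover each sample's share of the pool, can be illustrated with a small sketch. This is not the eALPS algorithm itself, whose details are not given here; it is a generic constrained least-squares formulation, assuming every individual in the pool is genotyped and that the pooled alternate-allele frequency at each SNP is the proportion-weighted average of the individuals' allele dosages divided by two. The function names (`project_to_simplex`, `estimate_proportions`, `simulate_pool`) and the noise-free simulation are hypothetical and serve only as illustration.

```python
import random


def project_to_simplex(v):
    """Euclidean projection of v onto {p : p_i >= 0, sum(p) = 1}."""
    u = sorted(v, reverse=True)
    running_sum, theta = 0.0, 0.0
    for j, u_j in enumerate(u, start=1):
        running_sum += u_j
        t = (running_sum - 1.0) / j
        if u_j - t > 0:
            theta = t  # keep the threshold of the last index that qualifies
    return [max(x - theta, 0.0) for x in v]


def estimate_proportions(dosages, pooled_freq, steps=1000):
    """Projected gradient descent on sum_s (sum_i p_i * g_si / 2 - f_s)^2.

    dosages[s][i] is the alternate-allele count (0, 1 or 2) of individual i
    at SNP s; pooled_freq[s] is the alternate-allele frequency observed in
    the pool. Assumption: all pooled individuals are genotyped.
    """
    n_ind = len(dosages[0])
    a = [[g / 2.0 for g in row] for row in dosages]
    frob_sq = sum(x * x for row in a for x in row)
    p = [1.0 / n_ind] * n_ind
    if frob_sq == 0.0:
        return p  # no informative SNPs; fall back to equal proportions
    # The squared Frobenius norm bounds the largest eigenvalue of A^T A,
    # so this step size never overshoots.
    step = 1.0 / (2.0 * frob_sq)
    for _ in range(steps):
        grad = [0.0] * n_ind
        for row, f in zip(a, pooled_freq):
            resid = sum(x * p_i for x, p_i in zip(row, p)) - f
            for i, x in enumerate(row):
                grad[i] += 2.0 * resid * x
        p = project_to_simplex([p_i - step * g_i for p_i, g_i in zip(p, grad)])
    return p


def simulate_pool(true_p, n_snps, seed=0):
    """Hypothetical noise-free pool: random genotypes, exact pooled frequencies."""
    rng = random.Random(seed)
    dosages = [[rng.choice((0, 1, 1, 2)) for _ in true_p] for _ in range(n_snps)]
    pooled = [sum(p * g / 2.0 for p, g in zip(true_p, row)) for row in dosages]
    return dosages, pooled


if __name__ == "__main__":
    dosages, pooled = simulate_pool([0.5, 0.3, 0.15, 0.05], n_snps=200, seed=1)
    print(estimate_proportions(dosages, pooled))
```

Real pooled data would add sampling noise from finite read depth and sequencing error, and the ungenotyped-individual case (eALPS-LD in the abstract) needs additional modeling that this sketch does not attempt.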

Cleaned Abstract

recent advanc highthroughput sequenc technolog bring potenti better character genet variat human organ mani occas either design necess sequenc procedur perform pool dna sampl differ abund abund sampl unknown scenario natur occur case metagenom analysi pool bacteria sequenc case popul studi involv dna pool design particular various pool design recent suggest can identifi carrier rare allel larg cohort dramat reduc cost largescal sequenc project fundament problem approach popul studi uncertainti dna proport differ individu pool might lead spurious associ fortun often case genotyp data least individu pool known propos method ealp use genotyp data conjunct pool sequenc data order accur estim proport sampl pool even case individu pool genotyp ealpsld use real data sequenc pool studi nonhodgkin lymphoma demonstr estim proport crucial sinc otherwis risk fals discoveri addit demonstr approach also applic problem quantif speci metagenom sampl ealpsbcr particular suitabl metagenom quantif close relat speci

Similar Abstracts

J Biomed Inform - Bayesian evolutionary hypergraph learning for predicting cancer clinical outcomes. ( 0,776138640817142 )
Curr Protoc Bioinformatics - BEDTools: The Swiss-Army Tool for Genome Feature Analysis. ( 0,684275788466236 )
Brief. Bioinformatics - Rapid innovation in ChIP-seq peak-calling algorithms is outdistancing benchmarking efforts. ( 0,678637497073406 )
Brief. Bioinformatics - Kernel methods for large-scale genomic data analysis. ( 0,663177513115811 )
Brief. Bioinformatics - Experimental evidence validating the computational inference of functional associations from gene fusion events: a critical survey. ( 0,660409146372771 )
Comput Biol Chem - In silico identification of conserved microRNAs and their target transcripts from expressed sequence tags of three earthworm species. ( 0,658044081311785 )
Comput Math Methods Med - Multiple suboptimal solutions for prediction rules in gene expression data. ( 0,657235868099358 )
Artif Intell Med - Detecting disease genes based on semi-supervised learning and protein-protein interaction networks. ( 0,644465420849452 )
J. Comput. Biol. - A novel genome-information content-based statistic for genome-wide association analysis designed for next-generation sequencing data. ( 0,644142729586207 )
J Integr Bioinform - BacillusRegNet: a transcriptional regulation database and analysis platform for Bacillus species. ( 0,640400533637742 )
J Integr Bioinform - Identification of common carp innate immune genes with whole-genome sequencing and RNA-Seq data. ( 0,635699519656248 )
Med Biol Eng Comput - A method for detecting significant genomic regions associated with oral squamous cell carcinoma using aCGH. ( 0,629098084292773 )
BMC Med Inform Decis Mak - Improved method for protein complex detection using bottleneck proteins. ( 0,629067799120224 )
J. Comput. Biol. - Learning cellular sorting pathways using protein interactions and sequence motifs. ( 0,626652908809123 )
Comput. Biol. Med. - Gene expression data classification using locally linear discriminant embedding. ( 0,62549851862825 )
J. Comput. Biol. - Exploiting genome structure in association analysis. ( 0,625044752741433 )
Brief. Bioinformatics - Pattern recognition and probabilistic measures in alignment-free sequence analysis. ( 0,622597551577171 )
Brief. Bioinformatics - The genomic and functional characteristics of disease genes. ( 0,62072475467835 )
Wiley Interdiscip Rev Syst Biol Med - Postgenomic technologies targeting the Wnt signaling network. ( 0,619386463347275 )
Brief. Bioinformatics - Rich annotation of DNA sequencing variants by leveraging the Ensembl Variant Effect Predictor with plugins. ( 0,619287263723621 )
Comput Biol Chem - Predicting protein-RNA interaction amino acids using random forest based on submodularity subset selection. ( 0,618671356912982 )
Comput Biol Chem - Gene expression regulation of the PF00480 or PF14340 domain proteins suggests their involvement in sulfur metabolism. ( 0,618088734360719 )
Wiley Interdiscip Rev Syst Biol Med - Mass spectrometry-based proteomics: qualitative identification to activity-based protein profiling. ( 0,616827854915621 )
Brief. Bioinformatics - Benchmarking of viral haplotype reconstruction programmes: an overview of the capacities and limitations of currently available programmes. ( 0,610618178523853 )
Comput Biol Chem - Global expression analysis of miRNA gene cluster and family based on isomiRs from deep sequencing data. ( 0,605229272825969 )
Comput Biol Chem - Menzerath-Altmann law in mammalian exons reflects the dynamics of gene structure evolution. ( 0,601880944017427 )
Comput. Biol. Med. - A review on the computational approaches for gene regulatory network construction. ( 0,601670376007729 )
J. Comput. Biol. - NP-MuScL: unsupervised global prediction of interaction networks from multiple data sources. ( 0,60138255187744 )
J. Comput. Biol. - The irredundant class method for remote homology detection of protein sequences. ( 0,598969514820866 )
IEEE Trans Image Process - Multiview Hessian regularization for image annotation. ( 0,598650945711637 )
J Biomed Inform - A machine-learned knowledge discovery method for associating complex phenotypes with complex genotypes. Application to pain. ( 0,598279037635431 )
J. Comput. Biol. - Describing the complexity of systems: multivariable set complexity and the information basis of systems biology. ( 0,59217971547077 )
J. Comput. Biol. - Calculating sample size estimates for RNA sequencing data. ( 0,591516143229507 )
Brief. Bioinformatics - An open-pollinated design for mapping imprinting genes in natural populations. ( 0,591512281289422 )
Wiley Interdiscip Rev Syst Biol Med - Layers of epistasis: genome-wide regulatory networks and network approaches to genome-wide association studies. ( 0,591452433326501 )
J. Comput. Biol. - Determining a singleton attractor of a boolean network with nested canalyzing functions. ( 0,591450573979452 )
Wiley Interdiscip Rev Syst Biol Med - Using a systems biology approach to understand and study the mechanisms of metastasis. ( 0,589937852564767 )
Brief. Bioinformatics - Lessons from a decade of integrating cancer copy number alterations with gene expression profiles. ( 0,589354155162319 )
Brief. Bioinformatics - Causes, consequences and solutions of phylogenetic incongruence. ( 0,589085150376348 )
Brief. Bioinformatics - Application of second-generation sequencing to cancer genomics. ( 0,587401156075604 )
Wiley Interdiscip Rev Syst Biol Med - Genome network medicine: innovation to overcome huge challenges in cancer therapy. ( 0,585750475535181 )
Comput Biol Chem - Structural characteristics of genomic islands associated with GMP synthases as integration hotspot among sequenced microbial genomes. ( 0,585581411687649 )
Artif Intell Med - Hybrid genetic algorithm-neural network: feature extraction for unpreprocessed microarray data. ( 0,584628567877758 )
Comput Methods Programs Biomed - Discriminating protein structure classes by incorporating Pseudo Average Chemical Shift to Chou's general PseAAC and Support Vector Machine. ( 0,583466774437268 )
J Integr Bioinform - Construction of coffee transcriptome networks based on gene annotation semantics. ( 0,581842418153535 )
Comput Biol Chem - Computational intelligence techniques in bioinformatics. ( 0,581744345969959 )
Methods Inf Med - Pathway based microarray analysis, utilising enzyme compounds and cascade events. ( 0,580187741414659 )
J Biomed Inform - The inference of breast cancer metastasis through gene regulatory networks. ( 0,577675095554777 )
Comput Biol Chem - Large replication skew domains delimit GC-poor gene deserts in human. ( 0,577600539333455 )
Artif Intell Med - Predicting malaria interactome classifications from time-course transcriptomic data along the intraerythrocytic developmental cycle. ( 0,577588089560957 )
J Integr Bioinform - Efficient online transcription factor binding site adjustment by integrating transitive graph projection with MoRAine 2.0. ( 0,577352879478727 )
J Biomed Inform - SAGA: a hybrid search algorithm for Bayesian Network structure learning of transcriptional regulatory networks. ( 0,576088736572667 )
J Integr Bioinform - The topology of the growing human interactome data. ( 0,575993515721684 )
J Am Med Inform Assoc - Utility of gene-specific algorithms for predicting pathogenicity of uncertain gene variants. ( 0,575361404894201 )
Comput. Biol. Med. - Sparse Manifold Clustering and Embedding to discriminate gene expression profiles of glioblastoma and meningioma tumors. ( 0,574726360696075 )
Comput Biol Chem - GPEC: a Cytoscape plug-in for random walk-based gene prioritization and biomedical evidence collection. ( 0,573868722824283 )
J. Comput. Biol. - Prediction of rare single-nucleotide causative mutations for muscular diseases in pooled next-generation sequencing experiments. ( 0,573629204823911 )
J Am Med Inform Assoc - Network models of genome-wide association studies uncover the topological centrality of protein interactions in complex diseases. ( 0,573408231794841 )
J Integr Bioinform - Analysis and construction of pathogenicity island regulatory pathways in Salmonella enterica serovar Typhi. ( 0,573339684713991 )
J Biomed Inform - Hemojuvelin-hepcidin axis modeled and analyzed using Petri nets. ( 0,572703963073084 )
Comput Biol Chem - lncRNAMap: a map of putative regulatory functions in the long non-coding transcriptome. ( 0,572173173465443 )
J Integr Bioinform - Probabilistic latent semantic analysis applied to whole bacterial genomes identifies common genomic features. ( 0,571668061522219 )
Comput Biol Chem - Exploring the complexity of pathway-drug relationships using latent Dirichlet allocation. ( 0,571589272353202 )
J Biomed Inform - The detection of risk pathways, regulated by miRNAs, via the integration of sample-matched miRNA-mRNA profiles and pathway structure. ( 0,571020976786553 )
J. Comput. Biol. - Reconstructing Boolean models of signaling. ( 0,570944624269435 )
IEEE Trans Pattern Anal Mach Intell - The Effect of Model Misspecification on Semi-Supervised Classification. ( 0,567372095773442 )
Comput Biol Chem - Classification of splice-junction sequences via weighted position specific scoring approach. ( 0,566599360332528 )
J Biomed Inform - Comparative analysis of a novel disease phenotype network based on clinical manifestations. ( 0,565905022563657 )
Comput Biol Chem - Support vector machine with a Pearson VII function kernel for discriminating halophilic and non-halophilic proteins. ( 0,564573881506963 )
Comput Biol Chem - In silico characterization and evolutionary analyses of CCAAT binding proteins in the lycophyte plant Selaginella moellendorffii genome: a growing comparative genomics resource. ( 0,564548705227015 )
J Integr Bioinform - A study of the short and long-term regulation of E. coli metabolic pathways. ( 0,563916920552814 )
Comput. Biol. Med. - Impact of TGF-β on breast cancer from a quantitative proteomic analysis. ( 0,563893411663237 )
Brief. Bioinformatics - Next generation sequencing in functional genomics. ( 0,562763108263253 )
Int J Neural Syst - Unorganized machines for seasonal streamflow series forecasting. ( 0,562711603362897 )
Wiley Interdiscip Rev Syst Biol Med - The zebrafish: scalable in vivo modeling for systems biology. ( 0,56268541708003 )
J Integr Bioinform - Profiling of genetic switches using boolean implications in expression data. ( 0,561301236170963 )
J. Comput. Biol. - Integration of 198 ChIP-seq datasets reveals human cis-regulatory regions. ( 0,561067376878713 )
Brief. Bioinformatics - Evolution of gene regulation--on the road towards computational inferences. ( 0,56098960021227 )
Methods Inf Med - Probability machines: consistent probability estimation using nonparametric learning machines. ( 0,560690197777825 )
Wiley Interdiscip Rev Syst Biol Med - Establishing the stem cell state: insights from regulatory network analysis of blood stem cell development. ( 0,559506879003008 )
J Integr Bioinform - Network expansion and pathway enrichment analysis towards biologically significant findings from microarrays. ( 0,558625172078202 )
J. Comput. Biol. - AREM: aligning short reads from ChIP-sequencing by expectation maximization. ( 0,558259769959851 )
Wiley Interdiscip Rev Syst Biol Med - Noncoding RNAs in gene regulation. ( 0,557783783908097 )
Comput Biol Chem - Using gene expression programming to infer gene regulatory networks from time-series data. ( 0,557761403185239 )
Comput Biol Chem - Analysis of the relationships between evolvability, thermodynamics, and the functions of intrinsically disordered proteins/regions. ( 0,557357195513131 )
Comput Math Methods Med - Correlation kernels for support vector machines classification with applications in cancer data. ( 0,556791219189772 )
Wiley Interdiscip Rev Syst Biol Med - Mediators and dynamics of DNA methylation. ( 0,556708120365435 )
J. Comput. Biol. - Computing the probability of RNA hairpin and multiloop formation. ( 0,555439283343849 )
Wiley Interdiscip Rev Syst Biol Med - Genome-wide approaches in the study of microRNA biology. ( 0,555078071738609 )
Comput Biol Chem - Molecular phylogenetic study and expression analysis of ATP-binding cassette transporter gene family in Oryza sativa in response to salt stress. ( 0,555027936399304 )
J. Comput. Biol. - Biological network querying techniques: analysis and comparison. ( 0,554660208959238 )
Brief. Bioinformatics - Positional orthology: putting genomic evolutionary relationships into context. ( 0,554209999425615 )
Brief. Bioinformatics - Semiparametric prognosis models in genomic studies. ( 0,553284592861855 )
Comput. Biol. Med. - Support vector machine algorithms in the search of KIR gene associations with disease. ( 0,552456414373621 )
J. Comput. Biol. - Markov logic networks in the analysis of genetic data. ( 0,551021660708687 )
Comput. Biol. Med. - Mathematical modeling and sensitivity analysis of the integrated TNFα-mediated apoptotic pathway for identifying key regulators. ( 0,550918301200932 )
Wiley Interdiscip Rev Syst Biol Med - Recent advances in prostate development and links to prostatic diseases. ( 0,550673294658441 )
Comput Biol Chem - Identification of novel splice variants and exons of human endothelial cell-specific chemotaxic regulator (ECSCR) by bioinformatics analysis. ( 0,550640974413556 )
Wiley Interdiscip Rev Syst Biol Med - Signaling networks in palate development. ( 0,549785266974399 )
IEEE Trans Pattern Anal Mach Intell - Scaling Multidimensional Inference for Structured Gaussian Processes. ( 0,549012061023348 )