J. Comput. Biol. - Characterizing the empirical distribution of prokaryotic genome n-mers in the presence of nullomers.

Topics

{ model(3404) distribut(989) bayesian(671) }
{ gene(2352) biolog(1181) express(1162) }
{ can(774) often(719) complex(702) }
{ studi(1119) effect(1106) posit(819) }
{ research(1085) discuss(1038) issu(1018) }
{ activ(1138) subject(705) human(624) }
{ signal(2180) analysi(812) frequenc(800) }
{ method(1219) similar(1157) match(930) }
{ framework(1458) process(801) describ(734) }
{ error(1145) method(1030) estim(1020) }
{ clinic(1479) use(1117) guidelin(835) }
{ algorithm(1844) comput(1787) effici(935) }
{ data(1714) softwar(1251) tool(1186) }
{ risk(3053) factor(974) diseas(938) }
{ perform(999) metric(946) measur(919) }
{ age(1611) year(1155) adult(843) }
{ time(1939) patient(1703) rate(768) }
{ cancer(2502) breast(956) screen(824) }
{ measur(2081) correl(1212) valu(896) }
{ motion(1329) object(1292) video(1091) }
{ extract(1171) text(1153) clinic(932) }
{ method(1557) propos(1049) approach(1037) }
{ control(1307) perform(991) simul(935) }
{ method(984) reconstruct(947) comput(926) }
{ howev(809) still(633) remain(590) }
{ monitor(1329) mobil(1314) devic(1160) }
{ state(1844) use(1261) util(961) }
{ model(2656) set(1616) predict(1553) }
{ data(2317) use(1299) case(1017) }
{ group(2977) signific(1463) compar(1072) }
{ sampl(1606) size(1419) use(1276) }
{ use(1733) differ(960) four(931) }
{ estim(2440) model(1874) function(577) }
{ imag(1947) propos(1133) code(1026) }
{ data(1737) use(1416) pattern(1282) }
{ inform(2794) health(2639) internet(1427) }
{ system(1976) rule(880) can(841) }
{ imag(1057) registr(996) error(939) }
{ bind(1733) structur(1185) ligand(1036) }
{ sequenc(1873) structur(1644) protein(1328) }
{ featur(3375) classif(2383) classifi(1994) }
{ imag(2830) propos(1344) filter(1198) }
{ network(2748) neural(1063) input(814) }
{ imag(2675) segment(2577) method(1081) }
{ patient(2315) diseas(1263) diabet(1191) }
{ take(945) account(800) differ(722) }
{ studi(2440) review(1878) systemat(933) }
{ assess(1506) score(1403) qualiti(1306) }
{ treatment(1704) effect(941) patient(846) }
{ surgeri(1148) surgic(1085) robot(1054) }
{ problem(2511) optim(1539) algorithm(950) }
{ chang(1828) time(1643) increas(1301) }
{ learn(2355) train(1041) set(1003) }
{ concept(1167) ontolog(924) domain(897) }
{ design(1359) user(1324) use(1319) }
{ model(2220) cell(1177) simul(1124) }
{ care(1570) inform(1187) nurs(1089) }
{ general(901) number(790) one(736) }
{ search(2224) databas(1162) retriev(909) }
{ featur(1941) imag(1645) propos(1176) }
{ case(1353) use(1143) diagnosi(1136) }
{ data(3963) clinic(1234) research(1004) }
{ studi(1410) differ(1259) use(1210) }
{ system(1050) medic(1026) inform(1018) }
{ import(1318) role(1303) understand(862) }
{ model(2341) predict(2261) use(1141) }
{ visual(1396) interact(850) tool(830) }
{ compound(1573) activ(1297) structur(1058) }
{ perform(1367) use(1326) method(1137) }
{ blood(1257) pressur(1144) flow(957) }
{ spatial(1525) area(1432) region(1030) }
{ record(1888) medic(1808) patient(1693) }
{ health(3367) inform(1360) care(1135) }
{ model(3480) simul(1196) paramet(876) }
{ ehr(2073) health(1662) electron(1139) }
{ research(1218) medic(880) student(794) }
{ patient(2837) hospit(1953) medic(668) }
{ medic(1828) order(1363) alert(1069) }
{ cost(1906) reduc(1198) effect(832) }
{ data(3008) multipl(1320) sourc(1022) }
{ first(2504) two(1366) second(1323) }
{ intervent(3218) particip(2042) group(1664) }
{ patient(1821) servic(1111) care(1106) }
{ use(2086) technolog(871) perceiv(783) }
{ can(981) present(881) function(850) }
{ analysi(2126) use(1163) compon(1037) }
{ health(1844) social(1437) communiti(874) }
{ structur(1116) can(940) graph(676) }
{ high(1669) rate(1365) level(1280) }
{ use(976) code(926) identifi(902) }
{ drug(1928) target(777) effect(648) }
{ result(1111) use(1088) new(759) }
{ implement(1333) system(1263) develop(1122) }
{ survey(1388) particip(1329) question(1065) }
{ decis(3086) make(1611) patient(1517) }
{ process(1125) use(805) approach(778) }
{ activ(1452) weight(1219) physic(1104) }
{ method(1969) cluster(1462) data(1082) }
{ method(2212) result(1239) propos(1039) }
{ detect(2391) sensit(1101) algorithm(908) }

Abstract

Characterizing the empirical distribution of n-mer frequencies is a vital step in understanding the genome as a whole. It allows researchers to examine how complex the genome really is and to move beyond simple, traditional modeling frameworks that are often biased in the presence of abundant and/or extremely rare words. We hypothesize that models based on the negative binomial distribution and its zero-inflated counterpart will characterize the n-mer distributions of genomes better than the Poisson. Our study examined the empirical distribution of the frequency of n-mers (6 ≤ n ≤ 11) in 2,199 genomes. We considered four distributions: Poisson, negative binomial, zero-inflated Poisson, and zero-inflated negative binomial (ZINB). The number of genomes with nullomers among the 6-, 7-, and 8-mers was 150, 602, and 2,012, respectively, whereas all of the genomes had nullomers among the 9-, 10-, and 11-mers. For each n considered, the negative binomial model performed best for at least 93% of the 2,199 genomes; however, a small percentage (<7%) of the genomes were better fit by the ZINB. The negative binomial and zero-inflated distributions extend the traditional Poisson setting and are more flexible in handling overdispersion, which can be caused by an increase in nullomers. When characterizing the distribution of n-mer frequencies, researchers should also consider other discrete distributions that are more flexible and adjust for possible overdispersion.
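
The comparison described in the abstract can be illustrated with a minimal sketch. The abstract does not say how the models were fitted or ranked, so everything below is an assumption made for illustration: the function names (`nmer_counts`, `compare_fits`) are hypothetical, the negative binomial is fitted by the method of moments rather than by maximum likelihood, AIC is used as the selection criterion, and the input is a randomly generated toy sequence with skewed base composition, not a real genome. The zero-inflated variants are omitted to keep the example short.

```python
import math
import random
from collections import Counter
from itertools import product


def nmer_counts(seq, n):
    """Count every possible DNA n-mer in seq, keeping zeros (nullomers)."""
    observed = Counter(seq[i:i + n] for i in range(len(seq) - n + 1))
    return [observed.get("".join(w), 0) for w in product("ACGT", repeat=n)]


def poisson_loglik(counts, lam):
    """Log-likelihood of the counts under Poisson(lam)."""
    return sum(k * math.log(lam) - lam - math.lgamma(k + 1) for k in counts)


def negbin_loglik(counts, r, p):
    """Log-likelihood under NB with pmf C(k+r-1, k) * p^r * (1-p)^k."""
    return sum(
        math.lgamma(k + r) - math.lgamma(k + 1) - math.lgamma(r)
        + r * math.log(p) + k * math.log(1 - p)
        for k in counts
    )


def compare_fits(counts):
    """Fit Poisson (MLE) and NB (method of moments, an assumption) and report AIC."""
    m = sum(counts) / len(counts)
    v = sum((k - m) ** 2 for k in counts) / (len(counts) - 1)
    result = {
        "mean": m,
        "variance": v,
        "nullomers": counts.count(0),
        "aic_poisson": 2 * 1 - 2 * poisson_loglik(counts, m),
    }
    if v > m:  # the moment-based NB fit only exists under overdispersion
        r = m * m / (v - m)
        p = r / (r + m)
        result["aic_negbin"] = 2 * 2 - 2 * negbin_loglik(counts, r, p)
    return result


# Toy input (assumption): AT-rich random sequence, not a real genome.
random.seed(0)
seq = "".join(random.choices("ACGT", weights=[0.4, 0.1, 0.1, 0.4], k=20000))
fit = compare_fits(nmer_counts(seq, 6))
for key, value in fit.items():
    print(f"{key}: {value:.2f}")
```

A variance far above the mean signals overdispersion, which is the situation in which the abstract expects the negative binomial to outperform the Poisson. A lower AIC indicates the preferred model in this sketch.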

Cleaned Abstract

character empir distribut frequenc nmer vital step understand entir genom will allow research examin complex genom realli move beyond simpl tradit model framework often bias presenc abund andor extrem rare word hypothes model base negat binomi distribut zeroinfl counterpart will character nmer distribut genom better poisson studi examin empir distribut frequenc nmer n genom consid four distribut poisson negat binomi zeroinfl poisson zeroinfl negat binomi zinb number genom nullom mer respect wherea genom mer nullom nmer consid negat binomi model perform best least genom howev small percentag ie genom prefer zinb negat binomi zeroinfl distribut extend tradit poisson set flexibl handl overdispers can caus increas nullom effort character distribut frequenc nmer research also consid discret distribut flexibl adjust possibl overdispers

Similar Abstracts

J. Comput. Biol. - Expectation-maximization algorithm for determining natural selection of Y-linked genes through two-sex branching processes. ( 0,752412579182688 )
Neural Comput - A semiparametric Bayesian model for detecting synchrony among multiple neurons. ( 0,752049669498617 )
Res Synth Methods - A Bayesian nonparametric meta-analysis model. ( 0,743042849407985 )
Brief. Bioinformatics - The dilemma of choosing the ideal permutation strategy while estimating statistical significance of genome-wide enrichment. ( 0,736561216478306 )
Res Synth Methods - Critical interpretation of Cochran's Q test depends on power and prior assumptions about heterogeneity. ( 0,734801253962581 )
IEEE Trans Pattern Anal Mach Intell - Causal Inference on Discrete Data using Additive Noise Models. ( 0,730672553763587 )
Comput Math Methods Med - Inference for ecological dynamical systems: a case study of two endemic diseases. ( 0,730058254313059 )
IEEE Trans Pattern Anal Mach Intell - Modeling Natural Images Using Gated MRFs. ( 0,72402816966527 )
Comput Math Methods Med - An empirical Bayes optimal discovery procedure based on semiparametric hierarchical mixture models. ( 0,719491946536906 )
Spat Spatiotemporal Epidemiol - Goodness-of-fit measures for individual-level models of infectious disease in a Bayesian framework. ( 0,715641541375422 )
IEEE Trans Pattern Anal Mach Intell - Are Gibbs-Type Priors the Most Natural Generalization of the Dirichlet Process? ( 0,713130359337421 )
Med Decis Making - Bayesian calibration of a natural history model with application to a population model for colorectal cancer. ( 0,709418233528109 )
Lifetime Data Anal - Bayesian semiparametric modeling for stochastic precedence, with applications in epidemiology and survival analysis. ( 0,707734186693894 )
IEEE Trans Image Process - Computationally tractable stochastic image modeling based on symmetric Markov mesh random fields. ( 0,703088429624682 )
Comput. Biol. Med. - Expectation-maximization technique for fibro-glandular discs detection in mammography images. ( 0,702277893762341 )
Lifetime Data Anal - Bayesian nonparametric models for ranked set sampling. ( 0,699730952832827 )
Med Decis Making - Calibration of complex models through Bayesian evidence synthesis: a demonstration and tutorial. ( 0,693046060663111 )
Brief. Bioinformatics - Semiparametric prognosis models in genomic studies. ( 0,689471139263954 )
Artif Intell Med - On the interplay of machine learning and background knowledge in image interpretation by Bayesian networks. ( 0,688500259332514 )
J. Comput. Biol. - Computational methods for a class of network models. ( 0,686540692858299 )
Spat Spatiotemporal Epidemiol - Bayesian hierarchical modeling of the dynamics of spatio-temporal influenza season outbreaks. ( 0,685331442608963 )
IEEE Trans Image Process - Probabilistic image modeling with an extended chain graph for human activity recognition and image segmentation. ( 0,68175901244772 )
Brief. Bioinformatics - CaliBayes and BASIS: integrated tools for the calibration, simulation and storage of biological simulation models. ( 0,677723228876723 )
J. Comput. Biol. - An efficient data assimilation schema for restoration and extension of gene regulatory networks using time-course observation data. ( 0,677450154800904 )
Neural Comput - Impact of spike train autostructure on probability distribution of joint spike events. ( 0,674941626427482 )
J. Comput. Biol. - A spatial haplotype copying model with applications to genotype imputation. ( 0,674331984090531 )
Spat Spatiotemporal Epidemiol - A Bayesian space-time model for discrete spread processes on a lattice. ( 0,6735544240407 )
Comput Methods Programs Biomed - Multivariate Bayesian modeling of known and unknown causes of events--an application to biosurveillance. ( 0,670141788634211 )
IEEE Trans Image Process - Bayesian robust principal component analysis. ( 0,670033875867335 )
Comput Math Methods Med - Bayesian inference of the Weibull model based on interval-censored survival data. ( 0,668433959398248 )
IEEE Trans Image Process - Variational Bayesian method for Retinex. ( 0,662423546023689 )
Neural Comput - Bayesian sparse partial least squares. ( 0,660901139003884 )
Spat Spatiotemporal Epidemiol - Inference from ecological models: estimating the relative risk of stroke from air pollution exposure using small area data. ( 0,659833795936476 )
Comput Math Methods Med - Applications of Bayesian gene selection and classification with mixtures of generalized singular g-priors. ( 0,656180284548203 )
Comput Methods Programs Biomed - Identification of an integrated mathematical model of standard oral glucose tolerance test for characterization of insulin potentiation in health. ( 0,6536308508923 )
Lifetime Data Anal - Bayesian local influence for survival models. ( 0,653030630839776 )
Lifetime Data Anal - A new threshold regression model for survival data with a cure fraction. ( 0,650180737571746 )
Neural Comput - Efficient Markov chain Monte Carlo methods for decoding neural spike trains. ( 0,64979207473052 )
Comput Math Methods Med - Bayesian hierarchical modeling for categorical longitudinal data from sedation measurements. ( 0,649236640958011 )
J. Comput. Biol. - Bayesian blind source separation for data with network structure. ( 0,646856529040515 )
IEEE Trans Neural Netw Learn Syst - Variational Bayesian Inference Algorithms for Infinite Relational Model of Network Data. ( 0,641174829433619 )
IEEE Trans Neural Netw Learn Syst - Incorporating Wind Power Forecast Uncertainties Into Stochastic Unit Commitment Using Neural Network-Based Prediction Intervals. ( 0,639509462505696 )
IEEE Trans Pattern Anal Mach Intell - Negative Binomial Process Count and Mixture Modeling. ( 0,638378279417824 )
Med Decis Making - Not simply more of the same: distinguishing between patient heterogeneity and parameter uncertainty. ( 0,637598352223603 )
Comput Methods Programs Biomed - The exponentiated exponential mixture and non-mixture cure rate model in the presence of covariates. ( 0,636532355253333 )
Med Decis Making - Linear regression metamodeling as a tool to summarize and present simulation model results. ( 0,633614467214245 )
Lifetime Data Anal - Diagnostic tools for bivariate accelerated life regression models. ( 0,628790863097798 )
Med Biol Eng Comput - A poisson process model for hip fracture risk. ( 0,628640990116498 )
Res Synth Methods - Bayesian model selection for meta-analysis of diagnostic test accuracy data: Application to Ddimer for deep vein thrombosis. ( 0,625858978903979 )
IEEE Trans Pattern Anal Mach Intell - Temporal Analysis of Motif Mixtures using Dirichlet Processes. ( 0,625392617742802 )
AMIA Annu Symp Proc - Testing the calibration of classification models from first principles. ( 0,622973317745248 )
IEEE Trans Image Process - Bayesian estimation of linear mixtures using the normal compositional model. Application to hyperspectral imagery. ( 0,618157417097525 )
J Biomed Inform - Link-topic model for biomedical abbreviation disambiguation. ( 0,617042695655798 )
IEEE Trans Image Process - A Bayesian framework for image segmentation with spatially varying mixtures. ( 0,616589654411187 )
Res Synth Methods - Random-effects meta-analysis of time-to-event data using the expectation-maximisation algorithm and shrinkage estimators. ( 0,614488553726277 )
Neural Comput - Attention as reward-driven optimization of sensory processing. ( 0,611838071369298 )
Spat Spatiotemporal Epidemiol - Foot and mouth disease revisited: re-analysis using Bayesian spatial susceptible-infectious-removed models. ( 0,61063852752611 )
Med Decis Making - Assessing uncertainties surrounding combined endpoints for use in economic models. ( 0,60995375418814 )
Methods Inf Med - Evaluating strategies for marker ranking in genome-wide association studies of complex traits. ( 0,609098233396551 )
J. Comput. Biol. - NP-MuScL: unsupervised global prediction of interaction networks from multiple data sources. ( 0,6089680399815 )
J Integr Bioinform - Analyzing phylogenetic trees with timed and probabilistic model checking: the lactose persistence case study. ( 0,607007849167938 )
J. Comput. Biol. - Markov logic networks in the analysis of genetic data. ( 0,603722315143988 )
IEEE Trans Image Process - Generative Bayesian image super resolution with natural image prior. ( 0,603345251381935 )
J Biomed Inform - Hemojuvelin-hepcidin axis modeled and analyzed using Petri nets. ( 0,602852041135307 )
Comput Math Methods Med - A general framework for modeling sub- and ultraharmonics of ultrasound contrast agent signals with MISO volterra series. ( 0,601999375679694 )
Med Decis Making - Accounting for methodological, structural, and parameter uncertainty in decision-analytic models: a practical guide. ( 0,601673853500078 )
Spat Spatiotemporal Epidemiol - Mapping gender variation in the spatial pattern of alcohol-related mortality: a Bayesian analysis using data from South Yorkshire, United Kingdom. ( 0,59908243053685 )
IEEE J Biomed Health Inform - Sparsity-inspired nonparametric probability characterization for radio propagation in body area networks. ( 0,597788633514638 )
IEEE Trans Image Process - Studentized dynamical system for robust object tracking. ( 0,596279310791909 )
Brief. Bioinformatics - Validation of gene regulatory networks: scientific and inferential. ( 0,595355530953364 )
BMC Med Inform Decis Mak - A simulation model of colorectal cancer surveillance and recurrence. ( 0,594123179099236 )
AMIA Annu Symp Proc - Learning to predict post-hospitalization VTE risk from EHR data. ( 0,592427300478633 )
J. Comput. Biol. - Exploiting genome structure in association analysis. ( 0,591830798815447 )
Wiley Interdiscip Rev Syst Biol Med - Integrative modeling of the cardiac ventricular myocyte. ( 0,591687148207717 )
IEEE Trans Image Process - A study of multiplicative watermark detection in the contourlet domain using alpha-stable distributions. ( 0,590292448292599 )
Comput. Biol. Med. - A pattern-oriented specification of gene network inference processes. ( 0,587230075942293 )
Comput Methods Programs Biomed - NIMROD: a program for inference via a normal approximation of the posterior in models with random effects based on ordinary differential equations. ( 0,586358527347469 )
Med Decis Making - Comparing Bayesian and frequentist approaches for multiple outcome mixed treatment comparisons. ( 0,585842836680849 )
Res Synth Methods - A basic introduction to fixed-effect and random-effects models for meta-analysis. ( 0,584511662585298 )
Comput Methods Programs Biomed - On the prediction of glucose concentration under intra-patient variability in type 1 diabetes: a monotone systems approach. ( 0,583427375669895 )
Comput Math Methods Med - A generalized gamma mixture model for ultrasonic tissue characterization. ( 0,583288511408058 )
IEEE Trans Image Process - Blind image quality assessment: a natural scene statistics approach in the DCT domain. ( 0,583279354006808 )
Spat Spatiotemporal Epidemiol - Spatial correlation in Bayesian logistic regression with misclassification. ( 0,581605412777446 )
IEEE Trans Image Process - Bayesian inference of models and hyperparameters for robust optical-flow estimation. ( 0,580250136546059 )
Med Decis Making - A systematic comparison of microsimulation models of colorectal cancer: the role of assumptions about adenoma progression. ( 0,580200666502015 )
IEEE Trans Vis Comput Graph - The Perception of Visual Uncertainty Representation by Non-Experts. ( 0,579316316330032 )
IEEE Trans Image Process - Statistical modeling of 3-D natural scenes with application to Bayesian stereopsis. ( 0,57508020789512 )
J. Comput. Biol. - Identifying contributors of DNA mixtures by means of quantitative information of STR typing. ( 0,574169608800228 )
IEEE Trans Image Process - On random field Completely Automated Public Turing Test to Tell Computers and Humans Apart generation. ( 0,573598331506325 )
Neural Comput - Causal discovery via reproducing kernel Hilbert space embeddings. ( 0,573398467089585 )
IEEE Trans Image Process - Posterior-mean super-resolution with a causal Gaussian Markov random field prior. ( 0,572458652789965 )
Res Synth Methods - Automating network meta-analysis. ( 0,572177280329023 )
IEEE Trans Vis Comput Graph - Visualizing the Variability of Gradients in Uncertain 2D Scalar Fields. ( 0,57200524495464 )
Neural Comput - Learning coefficient of generalization error in Bayesian estimation and vandermonde matrix-type singularity. ( 0,570534029281157 )
J. Comput. Biol. - On the inference of dirichlet mixture priors for protein sequence comparison. ( 0,568157611274517 )
Comput Math Methods Med - A simulation study of the radiation-induced bystander effect: modeling with stochastically defined signal reemission. ( 0,568110957190684 )
Comput Methods Programs Biomed - A Bayesian multilevel model for fMRI data analysis. ( 0,567997051376876 )
IEEE Trans Image Process - Blind separation of time/position varying mixtures. ( 0,566923469325862 )
IEEE Trans Image Process - Wavelet variance analysis for random fields on a regular lattice. ( 0,563369291799054 )
IEEE Trans Pattern Anal Mach Intell - Articulated Human Detection with Flexible Mixtures-of-Parts. ( 0,562130199829692 )