J. Comput. Biol. - Inconsistent Denoising and Clustering Algorithms for Amplicon Sequence Data.

Topics

{ method(1969) cluster(1462) data(1082) }
{ method(2212) result(1239) propos(1039) }
{ howev(809) still(633) remain(590) }
{ studi(1410) differ(1259) use(1210) }
{ sequenc(1873) structur(1644) protein(1328) }
{ model(2656) set(1616) predict(1553) }
{ health(1844) social(1437) communiti(874) }
{ imag(2830) propos(1344) filter(1198) }
{ general(901) number(790) one(736) }
{ model(2341) predict(2261) use(1141) }
{ group(2977) signific(1463) compar(1072) }
{ studi(2440) review(1878) systemat(933) }
{ data(1714) softwar(1251) tool(1186) }
{ import(1318) role(1303) understand(862) }
{ use(2086) technolog(871) perceiv(783) }
{ can(774) often(719) complex(702) }
{ framework(1458) process(801) describ(734) }
{ featur(1941) imag(1645) propos(1176) }
{ data(2317) use(1299) case(1017) }
{ cost(1906) reduc(1198) effect(832) }
{ model(3404) distribut(989) bayesian(671) }
{ imag(1057) registr(996) error(939) }
{ patient(2315) diseas(1263) diabet(1191) }
{ assess(1506) score(1403) qualiti(1306) }
{ learn(2355) train(1041) set(1003) }
{ clinic(1479) use(1117) guidelin(835) }
{ algorithm(1844) comput(1787) effici(935) }
{ extract(1171) text(1153) clinic(932) }
{ control(1307) perform(991) simul(935) }
{ risk(3053) factor(974) diseas(938) }
{ perform(999) metric(946) measur(919) }
{ perform(1367) use(1326) method(1137) }
{ model(3480) simul(1196) paramet(876) }
{ state(1844) use(1261) util(961) }
{ patient(2837) hospit(1953) medic(668) }
{ gene(2352) biolog(1181) express(1162) }
{ analysi(2126) use(1163) compon(1037) }
{ decis(3086) make(1611) patient(1517) }
{ imag(1947) propos(1133) code(1026) }
{ data(1737) use(1416) pattern(1282) }
{ inform(2794) health(2639) internet(1427) }
{ system(1976) rule(880) can(841) }
{ measur(2081) correl(1212) valu(896) }
{ bind(1733) structur(1185) ligand(1036) }
{ method(1219) similar(1157) match(930) }
{ featur(3375) classif(2383) classifi(1994) }
{ network(2748) neural(1063) input(814) }
{ imag(2675) segment(2577) method(1081) }
{ take(945) account(800) differ(722) }
{ motion(1329) object(1292) video(1091) }
{ treatment(1704) effect(941) patient(846) }
{ surgeri(1148) surgic(1085) robot(1054) }
{ problem(2511) optim(1539) algorithm(950) }
{ error(1145) method(1030) estim(1020) }
{ chang(1828) time(1643) increas(1301) }
{ concept(1167) ontolog(924) domain(897) }
{ method(1557) propos(1049) approach(1037) }
{ design(1359) user(1324) use(1319) }
{ model(2220) cell(1177) simul(1124) }
{ care(1570) inform(1187) nurs(1089) }
{ method(984) reconstruct(947) comput(926) }
{ search(2224) databas(1162) retriev(909) }
{ case(1353) use(1143) diagnosi(1136) }
{ data(3963) clinic(1234) research(1004) }
{ research(1085) discuss(1038) issu(1018) }
{ system(1050) medic(1026) inform(1018) }
{ visual(1396) interact(850) tool(830) }
{ compound(1573) activ(1297) structur(1058) }
{ studi(1119) effect(1106) posit(819) }
{ blood(1257) pressur(1144) flow(957) }
{ spatial(1525) area(1432) region(1030) }
{ record(1888) medic(1808) patient(1693) }
{ health(3367) inform(1360) care(1135) }
{ monitor(1329) mobil(1314) devic(1160) }
{ ehr(2073) health(1662) electron(1139) }
{ research(1218) medic(880) student(794) }
{ age(1611) year(1155) adult(843) }
{ medic(1828) order(1363) alert(1069) }
{ signal(2180) analysi(812) frequenc(800) }
{ sampl(1606) size(1419) use(1276) }
{ data(3008) multipl(1320) sourc(1022) }
{ first(2504) two(1366) second(1323) }
{ intervent(3218) particip(2042) group(1664) }
{ activ(1138) subject(705) human(624) }
{ time(1939) patient(1703) rate(768) }
{ patient(1821) servic(1111) care(1106) }
{ can(981) present(881) function(850) }
{ structur(1116) can(940) graph(676) }
{ high(1669) rate(1365) level(1280) }
{ cancer(2502) breast(956) screen(824) }
{ use(976) code(926) identifi(902) }
{ use(1733) differ(960) four(931) }
{ drug(1928) target(777) effect(648) }
{ result(1111) use(1088) new(759) }
{ implement(1333) system(1263) develop(1122) }
{ survey(1388) particip(1329) question(1065) }
{ estim(2440) model(1874) function(577) }
{ process(1125) use(805) approach(778) }
{ activ(1452) weight(1219) physic(1104) }
{ detect(2391) sensit(1101) algorithm(908) }

Abstract

Natural microbial communities have been studied for decades using the 16S rRNA gene as a marker. In recent years, the application of second-generation sequencing technologies has revolutionized our understanding of the structure and function of microbial communities in complex environments. Using these highly parallel techniques, a detailed description of community characteristics can be constructed, and even the rare biosphere can be detected. The new approaches carry numerous advantages and lack many of the features that skewed results obtained with traditional techniques, but serious bias remains and results are still not reliably comparable. Here, we contrasted publicly available algorithms for amplicon sequence data analysis using two different data sets: one with a defined clone-based structure, and one from a well-studied food spoilage community. We aimed to assess which software and parameters produce results that best resemble the benchmark community, how large the differences between methods can be, and whether these differences are statistically significant. The results suggest that commonly accepted denoising and clustering methods, used in different combinations, produce significantly different outcomes: the clustering method greatly affects the number of operational taxonomic units (OTUs), whereas the denoising algorithm has more influence on taxonomic affiliations. The number of OTUs differed by up to 40-fold, and the disparity between results seemed highly dependent on community structure and diversity. Statistically significant differences in taxonomies between methods were seen even at the phylum level. However, applying an effective denoising method seemed to even out the differences produced by clustering.

Cleaned Abstract

natur microbi communiti studi decad use s rrna gene marker recent year applic secondgener sequenc technolog revolution understand structur function microbi communiti complex environ use high parallel techniqu detail descript communiti characterist construct even rare biospher can detect new approach carri numer advantag lack mani featur skew result use tradit techniqu still face serious bias lack reliabl compar produc result contrast public avail amplicon sequenc data analysi algorithm use two differ data set one defin clonebas structur one food spoilag communiti wellstudi communiti aim assess softwar paramet produc result resembl benchmark communiti best larg differ can detect method whether differ statist signific result suggest common accept denois cluster method use differ combin produc signific differ outcom cluster method impact great number oper taxonom unit otus denois algorithm influenc taxonom affili magnitud otu number differ fold dispar result seem high depend communiti structur divers statist signific differ taxonomi method seen even phylum level howev applic effect denois method seem even differ produc cluster
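
The cleaned text above looks like the result of lowercasing, removing digits and punctuation, dropping stop words, and stemming. The following is a minimal sketch of the first three steps only, under stated assumptions: the function name `clean_text` and the tiny `STOP_WORDS` set are hypothetical, the page's actual stop-word list and stemmer are not known, and the stemming step (which would turn "magnitude" into "magnitud") is deliberately left out.

```python
import re

# Hypothetical, deliberately small stop-word list; the list used by the
# page that produced the cleaned abstract is unknown.
STOP_WORDS = {
    "a", "an", "and", "as", "by", "for", "have", "been", "in",
    "of", "the", "to", "up", "was", "with",
}


def clean_text(text: str) -> str:
    """Lowercase, strip everything except letters and whitespace, drop stop words."""
    text = text.lower()
    # Removing non-letters also removes digits and hyphens, so
    # "16S" becomes "s" and "40-fold" becomes "fold".
    text = re.sub(r"[^a-z\s]", "", text)
    tokens = [tok for tok in text.split() if tok not in STOP_WORDS]
    return " ".join(tokens)


if __name__ == "__main__":
    sentence = "The magnitude of the OTU number difference was up to 40-fold"
    print(clean_text(sentence))  # magnitude otu number difference fold
```

Because no stemmer is applied, the output keeps full word forms; a Porter-style stemmer from an external library would be one way to approximate the truncated forms seen in the cleaned abstract, but that choice is an assumption.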

Similar Abstracts

AMIA Annu Symp Proc - Using hierarchical mixture of experts model for fusion of outbreak detection methods. ( 0,663342450887841 )
J. Comput. Biol. - EDAR: an efficient error detection and removal algorithm for next generation sequencing data. ( 0,656624554706595 )
Int J Health Geogr - Detecting activity locations from raw GPS data: a novel kernel-based algorithm. ( 0,653334283912083 )
IEEE Trans Pattern Anal Mach Intell - A Link-Based Approach to the Cluster Ensemble Problem. ( 0,648192652514434 )
Comput Math Methods Med - A wavelet relational fuzzy C-means algorithm for 2D gel image segmentation. ( 0,644151079788606 )
Int J Health Geogr - A binary-based approach for detecting irregularly shaped clusters. ( 0,642830739327454 )
J Biomed Inform - Quantifying the determinants of outbreak detection performance through simulation and machine learning. ( 0,635290115542788 )
Comput Methods Programs Biomed - Improvements on a privacy-protection algorithm for DNA sequences with generalization lattices. ( 0,631057445927863 )
IEEE Trans Image Process - Maximum a posteriori video super-resolution using a new multichannel image prior. ( 0,629912552711175 )
Spat Spatiotemporal Epidemiol - Optimal selection of the spatial scan parameters for cluster detection: a simulation study. ( 0,623429874094182 )
IEEE Trans Vis Comput Graph - GPU-based Multilevel Clustering. ( 0,61151649333249 )
Comput Methods Programs Biomed - Fuzzy and hard clustering analysis for thyroid disease. ( 0,611390661175121 )
Int J Health Geogr - Detection of arbitrarily-shaped clusters using a neighbor-expanding approach: a case study on murine typhus in south Texas. ( 0,609541427240589 )
J Chem Inf Model - Metabolism site prediction based on xenobiotic structural formulas and PASS prediction algorithm. ( 0,608199142244392 )
J Chem Inf Model - String kernels and high-quality data set for improved prediction of kinked helices in α-helical membrane proteins. ( 0,604547555019941 )
J Integr Bioinform - An evolutionary and visual framework for clustering of DNA microarray data. ( 0,603280550892296 )
J Integr Bioinform - Parallel Niche Pareto AlineaGA--an evolutionary multiobjective approach on multiple sequence alignment. ( 0,597955558623949 )
Brief. Bioinformatics - Data construction for phosphorylation site prediction. ( 0,597543969364885 )
IEEE Trans Neural Netw Learn Syst - Improved Fault Classification in Series Compensated Transmission Line: Comparative Evaluation of Chebyshev Neural Network Training Algorithms. ( 0,596743350810306 )
Int J Health Geogr - Detection of clusters of a rare disease over a large territory: performance of cluster detection methods. ( 0,594196645128075 )
J Biomed Inform - A kinetic model-based algorithm to classify NGS short reads by their allele origin. ( 0,588838574220767 )
Med Decis Making - Developing appropriate methods for cost-effectiveness analysis of cluster randomized trials. ( 0,58711114613735 )
Comput Math Methods Med - Decimative spectral estimation with unconstrained model order. ( 0,583423328233161 )
Brief. Bioinformatics - A comprehensive evaluation of normalization methods for Illumina high-throughput RNA sequencing data analysis. ( 0,581180379439569 )
IEEE Trans Image Process - Performance comparisons of contour-based corner detectors. ( 0,572265729714594 )
IEEE Trans Image Process - Efficient hybrid tree-based stereo matching with applications to postcapture image refocusing. ( 0,57190403020381 )
Comput. Biol. Med. - Finding multivariate outliers in fMRI time-series data. ( 0,564571398388409 )
J Chem Inf Model - Comparison of combinatorial clustering methods on pharmacological data sets represented by machine learning-selected real molecular descriptors. ( 0,563043451065185 )
Comput Methods Programs Biomed - OLYMPUS: an automated hybrid clustering method in time series gene expression. Case study: host response after Influenza A (H1N1) infection. ( 0,562809744915198 )
J. Med. Internet Res. - Security analysis and improvements to the PsychoPass method. ( 0,559922514875273 )
Comput Methods Programs Biomed - Generalized rough fuzzy c-means algorithm for brain MR image segmentation. ( 0,555835750101164 )
Med Decis Making - Multiple imputation methods for handling missing data in cost-effectiveness analyses that use data from hierarchical studies: an application to cluster randomized trials. ( 0,555273085643836 )
Comput. Biol. Med. - A methodology to identify consensus classes from clustering algorithms applied to immunohistochemical data from breast cancer patients. ( 0,554505556018749 )
IEEE Trans Image Process - Multiscale semilocal interpolation with antialiasing. ( 0,552466532564333 )
J Am Med Inform Assoc - Privacy-preserving heterogeneous health data sharing. ( 0,549538532696574 )
IEEE Trans Image Process - Edge detecting for range data using Laplacian operators. ( 0,548019404813367 )
Spat Spatiotemporal Epidemiol - Performance of cancer cluster Q-statistics for case-control residential histories. ( 0,547905437958989 )
IEEE Trans Pattern Anal Mach Intell - Semi-Supervised Kernel Mean Shift Clustering. ( 0,547113002500525 )
Brief. Bioinformatics - Review of tandem repeat search tools: a systematic approach to evaluating algorithmic performance. ( 0,546971197708981 )
IEEE Trans Image Process - Robust reversible watermarking via clustering and enhanced pixel-wise masking. ( 0,543181494597063 )
Comput Math Methods Med - A robust rerank approach for feature selection and its application to pooling-based GWA studies. ( 0,539583219036118 )
Artif Intell Med - Missing data imputation using statistical and machine learning methods in a real breast cancer problem. ( 0,535699134561206 )
Neural Comput - Spontaneous clustering via minimum gamma-divergence. ( 0,535664793296155 )
J Integr Bioinform - Clustering of gene expression profiles: creating initialization-independent clusterings by eliminating unstable genes. ( 0,534080561908195 )
J. Comput. Biol. - A geometric clustering algorithm with applications to structural data. ( 0,533742078765151 )
Comput Math Methods Med - Identification of DNA-binding proteins using support vector machine with sequence information. ( 0,532526247176098 )
Brief. Bioinformatics - Ultrafast clustering algorithms for metagenomic sequence analysis. ( 0,528560018082913 )
J Chem Inf Model - Investigation of the use of spectral clustering for the analysis of molecular data. ( 0,527867684852443 )
IEEE Trans Image Process - A comparative review of component tree computation algorithms. ( 0,526417896752252 )
Med Decis Making - Cost-saving tree-structured survival analysis for hip fracture of study of osteoporotic fractures data. ( 0,525589096802193 )
J Chem Inf Model - Cavities tell more than sequences: exploring functional relationships of proteases via binding pockets. ( 0,525135571158088 )
J Biomed Inform - Average correlation clustering algorithm (ACCA) for grouping of co-regulated genes with similar pattern of variation in their expression values. ( 0,523693797822827 )
J Biomed Inform - A semantic framework to protect the privacy of electronic health records with non-numerical attributes. ( 0,523119072959389 )
Neural Comput - A nonparametric clustering algorithm with a quantile-based likelihood estimator. ( 0,522528758460238 )
Brief. Bioinformatics - Detecting miRNAs in deep-sequencing data: a software performance comparison and evaluation. ( 0,522073667931146 )
AMIA Annu Symp Proc - Patient clustering with uncoded text in electronic medical records. ( 0,521138990253347 )
Int J Health Geogr - Using statistical methods and genotyping to detect tuberculosis outbreaks. ( 0,520850324408741 )
BMC Med Inform Decis Mak - Efficient algorithms for fast integration on large data sets from multiple sources. ( 0,520447943449704 )
Comput. Biol. Med. - Combined prediction of transmembrane topology and signal peptide of beta-barrel proteins: using a hidden Markov model and genetic algorithms. ( 0,520255408035603 )
Artif Intell Med - Weighted spherical 1-mean with phase shift and its application in electrocardiogram discord detection. ( 0,519886994546972 )
Comput Methods Programs Biomed - A new approach based on Machine Learning for predicting corneal curvature (K1) and astigmatism in patients with keratoconus after intracorneal ring implantation. ( 0,519849010344736 )
J Med Syst - Application of attribute weighting method based on clustering centers to discrimination of linearly non-separable medical datasets. ( 0,518789127217532 )
IEEE Trans Pattern Anal Mach Intell - Multi-Exemplar Affinity Propagation. ( 0,518170584398603 )
Res Synth Methods - Less is less: a systematic review of graph use in meta-analyses. ( 0,518122260280876 )
J Chem Inf Model - Consensus methods for combining multiple clusterings of chemical structures. ( 0,517557892180267 )
J Med Syst - Improved fuzzy clustering algorithms in segmentation of DC-enhanced breast MRI. ( 0,513491131971284 )
Health Info Libr J - A bibliometric approach demonstrates the impact of a social care data set on research and policy. ( 0,510657365082928 )
J. Comput. Biol. - Biological cluster evaluation for gene function prediction. ( 0,510350533076317 )
Comput Math Methods Med - Liver segmentation based on Snakes Model and improved GrowCut algorithm in abdominal CT image. ( 0,50978533577506 )
Brief. Bioinformatics - Accounting for noise when clustering biological data. ( 0,508811678402502 )
AMIA Annu Symp Proc - Survival prediction and treatment recommendation with Bayesian techniques in lung cancer. ( 0,508657255040262 )
Brief. Bioinformatics - Comparative analysis of methods for identifying somatic copy number alterations from deep sequencing data. ( 0,505815360387327 )
Comput. Biol. Med. - Evaluation of automatic feature detection algorithms in EEG: application to interburst intervals. ( 0,505125350335597 )
Comput Math Methods Med - A time-domain hybrid analysis method for detecting and quantifying T-wave alternans. ( 0,504655755814206 )
IEEE J Biomed Health Inform - Optimization of heartbeat detection in fiber-optic unobtrusive measurements by using maximum a posteriori probability estimation. ( 0,502242370128072 )
Int J Comput Assist Radiol Surg - CT dataset anisotropy management for oral implantology planning software. ( 0,500454901874663 )
Int J Comput Assist Radiol Surg - Preclinical feasibility of a technology framework for MRI-guided iliac angioplasty. ( 0,49836644909161 )
Sci Data - Cross-platform ultradeep transcriptomic profiling of human reference RNA samples by RNA-Seq. ( 0,498194992601709 )
Comput Methods Programs Biomed - Wrapper feature selection for small sample size data driven by complete error estimates. ( 0,497400832470155 )
Int J Neural Syst - A genetic graph-based approach for partitional clustering. ( 0,495308476369474 )
Lifetime Data Anal - Efficiency improvement in a class of survival models through model-free covariate incorporation. ( 0,494789786443484 )
Res Synth Methods - Assessing baseline imbalance in randomised trials: implications for the Cochrane risk of bias tool. ( 0,493364159563664 )
J. Comput. Biol. - Enhancing Gibbs sampling method for motif finding in DNA with initial graph representation of sequences. ( 0,492361180364032 )
Artif Intell Med - Vicinal support vector classifier using supervised kernel-based clustering. ( 0,49230864702858 )
Neural Comput - Feature selection for ordinal text classification. ( 0,491490367826386 )
Comput Biol Chem - piClust: a density based piRNA clustering algorithm. ( 0,490664210137261 )
J Biomed Inform - Clustering clinical models from local electronic health records based on semantic similarity. ( 0,488995381864635 )
IEEE J Biomed Health Inform - Red blood cell cluster separation from digital images for use in sickle cell disease. ( 0,487514074737407 )
J Chem Inf Model - Hydride dissociation energies of six-membered heterocyclic organic hydrides predicted by ONIOM-G4Method. ( 0,487455158820726 )
Neural Comput - System identification of mGluR-dependent long-term depression. ( 0,487172671426466 )
IEEE Trans Image Process - Entropy-functional-based online adaptive decision fusion framework with application to wildfire detection in video. ( 0,485430648697188 )
Int J Neural Syst - Adaptive k-means algorithm for overlapped graph clustering. ( 0,485365089893202 )
Med Biol Eng Comput - Detection of swallows with silent aspiration using swallowing and breath sound analysis. ( 0,485081261734025 )
Int J Health Geogr - Interactive web-based mapping: bridging technology and data for health. ( 0,484601128129978 )
Comput Methods Programs Biomed - Multiscaled combination of MR and SPECT images in neuroimaging: A simplex method based variable-weight fusion. ( 0,484409741143895 )
Comput Math Methods Med - Novel harmonic regularization approach for variable selection in Cox's proportional hazards model. ( 0,483797134941648 )
Comput Methods Programs Biomed - Dynamic, location-based channel selection for power consumption reduction in EEG analysis. ( 0,483604751828809 )
Int J Comput Assist Radiol Surg - Fast lung nodule detection in chest CT images using cylindrical nodule-enhancement filter. ( 0,483093996548428 )
J Chem Inf Model - Algorithm for reaction classification. ( 0,482714908406549 )
IEEE Trans Image Process - Fractal dimension of color fractal images. ( 0,481662594098727 )