Brief. Bioinformatics - A large-scale benchmark study of existing algorithms for taxonomy-independent microbial community analysis.

Topics

{ method(1969) cluster(1462) data(1082) }
{ clinic(1479) use(1117) guidelin(835) }
{ use(1733) differ(960) four(931) }
{ take(945) account(800) differ(722) }
{ method(1557) propos(1049) approach(1037) }
{ research(1085) discuss(1038) issu(1018) }
{ drug(1928) target(777) effect(648) }
{ data(1737) use(1416) pattern(1282) }
{ algorithm(1844) comput(1787) effici(935) }
{ data(1714) softwar(1251) tool(1186) }
{ first(2504) two(1366) second(1323) }
{ measur(2081) correl(1212) valu(896) }
{ sequenc(1873) structur(1644) protein(1328) }
{ featur(1941) imag(1645) propos(1176) }
{ howev(809) still(633) remain(590) }
{ perform(999) metric(946) measur(919) }
{ spatial(1525) area(1432) region(1030) }
{ health(3367) inform(1360) care(1135) }
{ research(1218) medic(880) student(794) }
{ can(981) present(881) function(850) }
{ survey(1388) particip(1329) question(1065) }
{ can(774) often(719) complex(702) }
{ featur(3375) classif(2383) classifi(1994) }
{ network(2748) neural(1063) input(814) }
{ studi(2440) review(1878) systemat(933) }
{ motion(1329) object(1292) video(1091) }
{ concept(1167) ontolog(924) domain(897) }
{ control(1307) perform(991) simul(935) }
{ data(2317) use(1299) case(1017) }
{ cost(1906) reduc(1198) effect(832) }
{ gene(2352) biolog(1181) express(1162) }
{ use(2086) technolog(871) perceiv(783) }
{ analysi(2126) use(1163) compon(1037) }
{ health(1844) social(1437) communiti(874) }
{ structur(1116) can(940) graph(676) }
{ model(3404) distribut(989) bayesian(671) }
{ imag(1947) propos(1133) code(1026) }
{ inform(2794) health(2639) internet(1427) }
{ system(1976) rule(880) can(841) }
{ imag(1057) registr(996) error(939) }
{ bind(1733) structur(1185) ligand(1036) }
{ method(1219) similar(1157) match(930) }
{ imag(2830) propos(1344) filter(1198) }
{ imag(2675) segment(2577) method(1081) }
{ patient(2315) diseas(1263) diabet(1191) }
{ assess(1506) score(1403) qualiti(1306) }
{ treatment(1704) effect(941) patient(846) }
{ surgeri(1148) surgic(1085) robot(1054) }
{ framework(1458) process(801) describ(734) }
{ problem(2511) optim(1539) algorithm(950) }
{ error(1145) method(1030) estim(1020) }
{ chang(1828) time(1643) increas(1301) }
{ learn(2355) train(1041) set(1003) }
{ extract(1171) text(1153) clinic(932) }
{ design(1359) user(1324) use(1319) }
{ model(2220) cell(1177) simul(1124) }
{ care(1570) inform(1187) nurs(1089) }
{ general(901) number(790) one(736) }
{ method(984) reconstruct(947) comput(926) }
{ search(2224) databas(1162) retriev(909) }
{ case(1353) use(1143) diagnosi(1136) }
{ data(3963) clinic(1234) research(1004) }
{ studi(1410) differ(1259) use(1210) }
{ risk(3053) factor(974) diseas(938) }
{ system(1050) medic(1026) inform(1018) }
{ import(1318) role(1303) understand(862) }
{ model(2341) predict(2261) use(1141) }
{ visual(1396) interact(850) tool(830) }
{ compound(1573) activ(1297) structur(1058) }
{ perform(1367) use(1326) method(1137) }
{ studi(1119) effect(1106) posit(819) }
{ blood(1257) pressur(1144) flow(957) }
{ record(1888) medic(1808) patient(1693) }
{ model(3480) simul(1196) paramet(876) }
{ monitor(1329) mobil(1314) devic(1160) }
{ ehr(2073) health(1662) electron(1139) }
{ state(1844) use(1261) util(961) }
{ patient(2837) hospit(1953) medic(668) }
{ model(2656) set(1616) predict(1553) }
{ age(1611) year(1155) adult(843) }
{ medic(1828) order(1363) alert(1069) }
{ signal(2180) analysi(812) frequenc(800) }
{ group(2977) signific(1463) compar(1072) }
{ sampl(1606) size(1419) use(1276) }
{ data(3008) multipl(1320) sourc(1022) }
{ intervent(3218) particip(2042) group(1664) }
{ activ(1138) subject(705) human(624) }
{ time(1939) patient(1703) rate(768) }
{ patient(1821) servic(1111) care(1106) }
{ high(1669) rate(1365) level(1280) }
{ cancer(2502) breast(956) screen(824) }
{ use(976) code(926) identifi(902) }
{ result(1111) use(1088) new(759) }
{ implement(1333) system(1263) develop(1122) }
{ estim(2440) model(1874) function(577) }
{ decis(3086) make(1611) patient(1517) }
{ process(1125) use(805) approach(778) }
{ activ(1452) weight(1219) physic(1104) }
{ method(2212) result(1239) propos(1039) }
{ detect(2391) sensit(1101) algorithm(908) }

Abstract

Recent advances in massively parallel sequencing technology have created new opportunities to probe the hidden world of microbes. Taxonomy-independent clustering of the 16S rRNA gene is usually the first step in analyzing microbial communities. Dozens of algorithms have been developed in the last decade, but a comprehensive benchmark study is lacking. Here, we survey algorithms currently used by microbiologists, and compare seven representative methods in a large-scale benchmark study that addresses several issues of concern. A new experimental protocol was developed that allows different algorithms to be compared using the same platform, and several criteria were introduced to facilitate a quantitative evaluation of the clustering performance of each algorithm. We found that existing methods vary widely in their outputs, and that inappropriate use of distance levels for taxonomic assignments likely resulted in substantial overestimates of biodiversity in many studies. The benchmark study identified our recently developed ESPRIT-Tree, a fast implementation of the average linkage-based hierarchical clustering algorithm, as one of the best algorithms available in terms of computational efficiency and clustering accuracy.
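
The abstract's point about distance levels can be illustrated with a toy example. The following is a minimal sketch of average-linkage hierarchical clustering with a distance cutoff, written in plain Python. It is not ESPRIT-Tree or any of the benchmarked tools; the sequences, the mismatch-fraction distance, and the function names are all hypothetical and chosen only to show how the number of clusters depends on the cutoff.

```python
from itertools import combinations

# Hypothetical, pre-aligned toy sequences (not real 16S rRNA reads).
SEQS = [
    "ACGTACGTACGTACGTACGT",
    "ACGTACGTACGTACGTACGA",
    "ACGTACGTACGTACGTACCA",
    "TGCATGCATGCATGCATGCA",
    "TGCATGCATGCATGCATGCT",
]


def mismatch_fraction(a, b):
    """Fraction of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be pre-aligned to equal length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def average_linkage_clusters(seqs, cutoff):
    """Merge the closest pair of clusters (by average pairwise distance)
    until the closest pair is farther apart than `cutoff`.

    Returns a sorted list of clusters, each a sorted list of sequence indices.
    """
    n = len(seqs)
    dist = {
        (i, j): mismatch_fraction(seqs[i], seqs[j])
        for i, j in combinations(range(n), 2)
    }
    clusters = [[i] for i in range(n)]

    def avg(c1, c2):
        total = sum(dist[min(i, j), max(i, j)] for i in c1 for j in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > 1:
        # Find the pair of clusters with the smallest average distance.
        (a, b), d = min(
            (
                ((p, q), avg(clusters[p], clusters[q]))
                for p, q in combinations(range(len(clusters)), 2)
            ),
            key=lambda item: item[1],
        )
        if d > cutoff:
            break
        merged = clusters[a] + clusters[b]
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)]
        clusters.append(merged)

    return sorted(sorted(c) for c in clusters)


if __name__ == "__main__":
    for cutoff in (0.03, 0.06, 0.08):
        result = average_linkage_clusters(SEQS, cutoff)
        print(cutoff, len(result), result)
```

On this toy input, a stricter cutoff splits the same five sequences into more clusters, which is the mechanism by which a poorly chosen distance level can inflate a diversity estimate. This naive version recomputes all cluster distances at every step and is only suitable for very small inputs.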

Cleaned Abstract

recent advanc massiv parallel sequenc technolog creat new opportun probe hidden world microb taxonomyindepend cluster s rrna gene usual first step analyz microbi communiti dozen algorithm develop last decad comprehens benchmark studi lack survey algorithm current use microbiologist compar seven repres method largescal benchmark studi address sever issu concern new experiment protocol develop allow differ algorithm compar use platform sever criteria introduc facilit quantit evalu cluster perform algorithm found exist method vari wide output inappropri use distanc level taxonom assign like result substanti overestim biodivers mani studi benchmark studi identifi recent develop esprittre fast implement averag linkagebas hierarch cluster algorithm one best algorithm avail term comput effici cluster accuraci
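
The cleaned text above looks like the output of a conventional text-normalization pass: lowercasing, removal of digits and punctuation (so "16S rRNA" becomes "s rrna" and hyphenated words fuse), stop-word removal, and stemming. The page does not document its pipeline, so the sketch below is an assumption that reproduces only the first three steps; the stop-word list is a tiny hypothetical one, and stemming is deliberately left out.

```python
import re

# Tiny illustrative stop-word list; the list actually used is not given.
STOP_WORDS = {
    "a", "an", "and", "but", "by", "have", "here", "in", "is",
    "of", "that", "the", "to", "we",
}


def clean(text):
    """Lowercase, strip every non-letter character, and drop stop words.

    Removing characters without inserting a space fuses hyphenated words
    and leaves the letters of mixed tokens such as "16S" behind.
    """
    lowered = text.lower()
    letters_only = re.sub(r"[^a-z\s]", "", lowered)
    return [tok for tok in letters_only.split() if tok not in STOP_WORDS]


if __name__ == "__main__":
    sample = "Taxonomy-independent clustering of the 16S rRNA gene"
    print(" ".join(clean(sample)))
```

Applying a stemmer to each surviving token would then shorten forms such as "clustering" toward the "cluster" seen in the cleaned text.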

Similar Abstracts

Int J Health Geogr - Detection of arbitrarily-shaped clusters using a neighbor-expanding approach: a case study on murine typhus in south Texas. ( 0,658555126306454 )
IEEE Trans Pattern Anal Mach Intell - A Link-Based Approach to the Cluster Ensemble Problem. ( 0,633133986420792 )
IEEE Trans Vis Comput Graph - GPU-based Multilevel Clustering. ( 0,630444330183128 )
IEEE Trans Neural Netw Learn Syst - Improved Fault Classification in Series Compensated Transmission Line: Comparative Evaluation of Chebyshev Neural Network Training Algorithms. ( 0,6200646979183 )
Int J Health Geogr - A binary-based approach for detecting irregularly shaped clusters. ( 0,615526818648301 )
Comput. Biol. Med. - Evaluation of automatic feature detection algorithms in EEG: application to interburst intervals. ( 0,610509091875071 )
Int J Health Geogr - Detecting activity locations from raw GPS data: a novel kernel-based algorithm. ( 0,605194787675352 )
J Chem Inf Model - Comparison of combinatorial clustering methods on pharmacological data sets represented by machine learning-selected real molecular descriptors. ( 0,60451024842124 )
Brief. Bioinformatics - Comparative analysis of methods for identifying somatic copy number alterations from deep sequencing data. ( 0,597588087439113 )
Med Decis Making - Developing appropriate methods for cost-effectiveness analysis of cluster randomized trials. ( 0,596903979361016 )
Int J Health Geogr - Using statistical methods and genotyping to detect tuberculosis outbreaks. ( 0,593938973049336 )
J Chem Inf Model - Metabolism site prediction based on xenobiotic structural formulas and PASS prediction algorithm. ( 0,592748978363819 )
Spat Spatiotemporal Epidemiol - Optimal selection of the spatial scan parameters for cluster detection: a simulation study. ( 0,592557318432139 )
Brief. Bioinformatics - A survey of error-correction methods for next-generation sequencing. ( 0,576766605547774 )
J Med Syst - Application of attribute weighting method based on clustering centers to discrimination of linearly non-separable medical datasets. ( 0,572913546398755 )
Int J Health Geogr - Detection of clusters of a rare disease over a large territory: performance of cluster detection methods. ( 0,572259607832168 )
Res Synth Methods - Use of quality control charts for detection of outliers and temporal trends in cumulative meta-analysis. ( 0,568870733606159 )
J Biomed Inform - Clustering clinical models from local electronic health records based on semantic similarity. ( 0,566366976318349 )
Comput Biol Chem - Fast detection of high-order epistatic interactions in genome-wide association studies using information theoretic measure. ( 0,565045786046911 )
IEEE J Biomed Health Inform - Red blood cell cluster separation from digital images for use in sickle cell disease. ( 0,561308086483157 )
AMIA Annu Symp Proc - Using hierarchical mixture of experts model for fusion of outbreak detection methods. ( 0,561204329708106 )
J Integr Bioinform - Clustering of gene expression profiles: creating initialization-independent clusterings by eliminating unstable genes. ( 0,560913404911746 )
Int J Neural Syst - A genetic graph-based approach for partitional clustering. ( 0,560757580821753 )
J Biomed Inform - Average correlation clustering algorithm (ACCA) for grouping of co-regulated genes with similar pattern of variation in their expression values. ( 0,556162321689026 )
Spat Spatiotemporal Epidemiol - Performance of cancer cluster Q-statistics for case-control residential histories. ( 0,5557862079058 )
J Biomed Inform - Mitigation of adverse interactions in pairs of clinical practice guidelines using constraint logic programming. ( 0,553182324818587 )
J Chem Inf Model - Investigation of the use of spectral clustering for the analysis of molecular data. ( 0,5467486526597 )
J. Comput. Biol. - A geometric clustering algorithm with applications to structural data. ( 0,544989895679492 )
Neural Comput - Spontaneous clustering via minimum γ-divergence. ( 0,543141799107678 )
Brief. Bioinformatics - GO-function: deriving biologically relevant functions from statistically significant functions. ( 0,541653739384095 )
J Integr Bioinform - Parallel Niche Pareto AlineaGA--an evolutionary multiobjective approach on multiple sequence alignment. ( 0,537965184195538 )
J Am Med Inform Assoc - The feasibility of automating audit and feedback for ART guideline adherence in Malawi. ( 0,533696288561 )
Comput Math Methods Med - A wavelet relational fuzzy C-means algorithm for 2D gel image segmentation. ( 0,532369752972675 )
Comput Methods Programs Biomed - Improvements on a privacy-protection algorithm for DNA sequences with generalization lattices. ( 0,531312377319447 )
J. Comput. Biol. - EDAR: an efficient error detection and removal algorithm for next generation sequencing data. ( 0,531196799707943 )
Comput Methods Programs Biomed - Fuzzy and hard clustering analysis for thyroid disease. ( 0,530139901775494 )
Med Biol Eng Comput - Effective identification and localization of immature precursors in bone marrow biopsy. ( 0,528500027731579 )
J. Med. Internet Res. - Security analysis and improvements to the PsychoPass method. ( 0,528296865292674 )
J Am Med Inform Assoc - Building better guidelines with BRIDGE-Wiz: development and evaluation of a software assistant to promote clarity, transparency, and implementability. ( 0,52806095049189 )
J Chem Inf Model - Benchmark data sets for structure-based computational target prediction. ( 0,52674940011125 )
BMC Med Inform Decis Mak - Efficient algorithms for fast integration on large data sets from multiple sources. ( 0,52654178232442 )
Int J Comput Assist Radiol Surg - CT dataset anisotropy management for oral implantology planning software. ( 0,526228928475343 )
Comput Biol Chem - Mode of action classification of chemicals using multi-concentration time-dependent cellular response profiles. ( 0,526092382930837 )
IEEE Trans Pattern Anal Mach Intell - Semi-Supervised Kernel Mean Shift Clustering. ( 0,525326827763704 )
IEEE Trans Image Process - A Geometric Framework for Rectangular Shape Detection. ( 0,525174314453234 )
IEEE J Biomed Health Inform - Manikin-integrated digital measuring system for assessment of infant cardiopulmonary resuscitation techniques. ( 0,524853949904253 )
Med Decis Making - Multiple imputation methods for handling missing data in cost-effectiveness analyses that use data from hierarchical studies: an application to cluster randomized trials. ( 0,523564341264147 )
Int J Health Geogr - Interactive web-based mapping: bridging technology and data for health. ( 0,522611877021783 )
Int J Neural Syst - Adaptive k-means algorithm for overlapped graph clustering. ( 0,522283665475133 )
AMIA Annu Symp Proc - Dissimilarities in the Logical Modeling of Apparently Similar Concepts in SNOMED CT. ( 0,519604969225062 )
Comput. Aided Surg. - The Equidistant Method - a novel hip joint simulation algorithm for detection of femoroacetabular impingement. ( 0,519406180294004 )
IEEE Trans Image Process - Efficiently learning a detection cascade with sparse eigenvectors. ( 0,51807337361536 )
Int J Med Inform - USB-based Personal Health Records: an analysis of features and functionality. ( 0,517903391255729 )
Int J Med Inform - Reducing unnecessary lab testing in the ICU with artificial intelligence. ( 0,515188572679233 )
Brief. Bioinformatics - A comprehensive evaluation of normalization methods for Illumina high-throughput RNA sequencing data analysis. ( 0,514228276461994 )
Brief. Bioinformatics - Data construction for phosphorylation site prediction. ( 0,514186630761208 )
Neural Comput - Markov chain Monte Carlo methods for state-space models with point process observations. ( 0,513802465248367 )
J Biomed Inform - Extension of the survival dimensionality reduction algorithm to detect epistasis in competing risks models (SDR-CR). ( 0,512906612655512 )
Comput Methods Programs Biomed - fMRI analysis on the GPU-possibilities and challenges. ( 0,511008491956051 )
BMC Med Inform Decis Mak - Studying the potential impact of automated document classification on scheduling a systematic review update. ( 0,51056814162587 )
Neural Comput - A nonparametric clustering algorithm with a quantile-based likelihood estimator. ( 0,510121917874812 )
Comput. Biol. Med. - Predicting cardiac autonomic neuropathy category for diabetic data with missing values. ( 0,509577887622168 )
Artif Intell Med - Vicinal support vector classifier using supervised kernel-based clustering. ( 0,509527813717687 )
IEEE Trans Pattern Anal Mach Intell - KNN Matting. ( 0,507346289431841 )
Int J Health Geogr - Voronoi distance based prospective space-time scans for point data sets: a dengue fever cluster analysis in a southeast Brazilian town. ( 0,504558981800634 )
J Biomed Inform - Design patterns for the development of electronic health record-driven phenotype extraction algorithms. ( 0,504191070843562 )
Brief. Bioinformatics - Correcting Illumina data. ( 0,502675127624477 )
Comput Methods Programs Biomed - OLYMPUS: an automated hybrid clustering method in time series gene expression. Case study: host response after Influenza A (H1N1) infection. ( 0,502300426220294 )
Comput Math Methods Med - Decimative spectral estimation with unconstrained model order. ( 0,49948872337932 )
BMC Med Inform Decis Mak - The R0 package: a toolbox to estimate reproduction numbers for epidemic outbreaks. ( 0,499240205985252 )
Comput Methods Programs Biomed - Development and application of efficient pathway enumeration algorithms for metabolic engineering applications. ( 0,497948874169241 )
J Am Med Inform Assoc - Comparison and validation of genomic predictors for anticancer drug sensitivity. ( 0,497268026841394 )
Health Info Libr J - A bibliometric approach demonstrates the impact of a social care data set on research and policy. ( 0,496173306071559 )
Int J Med Inform - From clinical practice guidelines, to clinical guidance in practice - impacts for computerization. ( 0,496164967421703 )
J Biomed Inform - A semantic framework to protect the privacy of electronic health records with non-numerical attributes. ( 0,495865553069212 )
Methods Inf Med - From clinical practice guidelines to computer-interpretable guidelines. A literature overview. ( 0,495737312900469 )
Int J Med Inform - Towards an understanding of the information dynamics of the handover process in aged care settings--a prerequisite for the safe and effective use of ICT. ( 0,493993300261897 )
J Clin Monit Comput - Tight reservoir bag: the bag itself may be the culprit. ( 0,492164463616722 )
Med Biol Eng Comput - Detection of swallows with silent aspiration using swallowing and breath sound analysis. ( 0,491212642981843 )
J Integr Bioinform - An evolutionary and visual framework for clustering of DNA microarray data. ( 0,491203931266956 )
J Chem Inf Model - Consensus methods for combining multiple clusterings of chemical structures. ( 0,491017689896648 )
AMIA Annu Symp Proc - An Algorithm Using Twelve Properties of Antibiotics to Find the Recommended Antibiotics, as in CPGs. ( 0,490104310387949 )
IEEE Trans Image Process - Enhancing Low-Rank Subspace Clustering by Manifold Regularization. ( 0,490067668966562 )
Health Info Libr J - Evidence-based medicine: is the evidence out there for primary care clinicians? ( 0,48913793247424 )
IEEE Trans Image Process - A novel video dataset for change detection benchmarking. ( 0,488722816456567 )
AMIA Annu Symp Proc - A fast algorithm for learning epistatic genomic relationships. ( 0,487912651048718 )
Artif Intell Med - Weighted spherical 1-mean with phase shift and its application in electrocardiogram discord detection. ( 0,487732146528654 )
IEEE Trans Image Process - Video keyframe analysis using a segment-based statistical metric in a visually sensitive parametric space. ( 0,48635813609339 )
Brief. Bioinformatics - Protein inference: a review. ( 0,486082833521955 )
J. Med. Internet Res. - Feasibility of a wiki as a participatory tool for patients in clinical guideline development. ( 0,485776761554131 )
Comput Math Methods Med - White blood cell segmentation by circle detection using electromagnetism-like optimization. ( 0,485745103009018 )
J Am Med Inform Assoc - Privacy-preserving heterogeneous health data sharing. ( 0,485605535195735 )
Int J Med Inform - Complexity and the science of implementation in health IT--knowledge gaps and future visions. ( 0,484636486808627 )
J Integr Bioinform - Data partitioning enables the use of standard SOAP Web Services in genome-scale workflows. ( 0,48388388180088 )
Int J Health Geogr - Penalized likelihood and multi-objective spatial scans for the detection and inference of irregular clusters. ( 0,483165258127758 )
Res Synth Methods - A tool to assess the quality of a meta-analysis. ( 0,482830191498315 )
AMIA Annu Symp Proc - Extending the GuideLine Implementability Appraisal (GLIA) instrument to identify problems in control flow. ( 0,48209798275517 )
Comput Math Methods Med - Feature selection for better identification of subtypes of Guillain-Barré syndrome. ( 0,481582273352817 )
Comput Methods Programs Biomed - A new approach based on Machine Learning for predicting corneal curvature (K1) and astigmatism in patients with keratoconus after intracorneal ring implantation. ( 0,481170688555768 )
J. Comput. Biol. - Inconsistent Denoising and Clustering Algorithms for Amplicon Sequence Data. ( 0,481012337980202 )