J Chem Inf Model - String kernels and high-quality data set for improved prediction of kinked helices in α-helical membrane proteins.

Topics

{ method(1969) cluster(1462) data(1082) }
{ sequenc(1873) structur(1644) protein(1328) }
{ model(2341) predict(2261) use(1141) }
{ learn(2355) train(1041) set(1003) }
{ howev(809) still(633) remain(590) }
{ data(2317) use(1299) case(1017) }
{ drug(1928) target(777) effect(648) }
{ process(1125) use(805) approach(778) }
{ data(1737) use(1416) pattern(1282) }
{ import(1318) role(1303) understand(862) }
{ compound(1573) activ(1297) structur(1058) }
{ perform(1367) use(1326) method(1137) }
{ age(1611) year(1155) adult(843) }
{ model(3404) distribut(989) bayesian(671) }
{ imag(1947) propos(1133) code(1026) }
{ imag(1057) registr(996) error(939) }
{ method(1219) similar(1157) match(930) }
{ imag(2675) segment(2577) method(1081) }
{ surgeri(1148) surgic(1085) robot(1054) }
{ error(1145) method(1030) estim(1020) }
{ extract(1171) text(1153) clinic(932) }
{ design(1359) user(1324) use(1319) }
{ control(1307) perform(991) simul(935) }
{ model(2220) cell(1177) simul(1124) }
{ method(984) reconstruct(947) comput(926) }
{ search(2224) databas(1162) retriev(909) }
{ studi(1410) differ(1259) use(1210) }
{ perform(999) metric(946) measur(919) }
{ studi(1119) effect(1106) posit(819) }
{ health(3367) inform(1360) care(1135) }
{ state(1844) use(1261) util(961) }
{ research(1218) medic(880) student(794) }
{ medic(1828) order(1363) alert(1069) }
{ time(1939) patient(1703) rate(768) }
{ use(2086) technolog(871) perceiv(783) }
{ analysi(2126) use(1163) compon(1037) }
{ health(1844) social(1437) communiti(874) }
{ high(1669) rate(1365) level(1280) }
{ estim(2440) model(1874) function(577) }
{ activ(1452) weight(1219) physic(1104) }
{ can(774) often(719) complex(702) }
{ inform(2794) health(2639) internet(1427) }
{ system(1976) rule(880) can(841) }
{ measur(2081) correl(1212) valu(896) }
{ bind(1733) structur(1185) ligand(1036) }
{ featur(3375) classif(2383) classifi(1994) }
{ imag(2830) propos(1344) filter(1198) }
{ network(2748) neural(1063) input(814) }
{ patient(2315) diseas(1263) diabet(1191) }
{ take(945) account(800) differ(722) }
{ studi(2440) review(1878) systemat(933) }
{ motion(1329) object(1292) video(1091) }
{ assess(1506) score(1403) qualiti(1306) }
{ treatment(1704) effect(941) patient(846) }
{ framework(1458) process(801) describ(734) }
{ problem(2511) optim(1539) algorithm(950) }
{ chang(1828) time(1643) increas(1301) }
{ concept(1167) ontolog(924) domain(897) }
{ clinic(1479) use(1117) guidelin(835) }
{ algorithm(1844) comput(1787) effici(935) }
{ method(1557) propos(1049) approach(1037) }
{ data(1714) softwar(1251) tool(1186) }
{ care(1570) inform(1187) nurs(1089) }
{ general(901) number(790) one(736) }
{ featur(1941) imag(1645) propos(1176) }
{ case(1353) use(1143) diagnosi(1136) }
{ data(3963) clinic(1234) research(1004) }
{ risk(3053) factor(974) diseas(938) }
{ research(1085) discuss(1038) issu(1018) }
{ system(1050) medic(1026) inform(1018) }
{ visual(1396) interact(850) tool(830) }
{ blood(1257) pressur(1144) flow(957) }
{ spatial(1525) area(1432) region(1030) }
{ record(1888) medic(1808) patient(1693) }
{ model(3480) simul(1196) paramet(876) }
{ monitor(1329) mobil(1314) devic(1160) }
{ ehr(2073) health(1662) electron(1139) }
{ patient(2837) hospit(1953) medic(668) }
{ model(2656) set(1616) predict(1553) }
{ signal(2180) analysi(812) frequenc(800) }
{ cost(1906) reduc(1198) effect(832) }
{ group(2977) signific(1463) compar(1072) }
{ sampl(1606) size(1419) use(1276) }
{ gene(2352) biolog(1181) express(1162) }
{ data(3008) multipl(1320) sourc(1022) }
{ first(2504) two(1366) second(1323) }
{ intervent(3218) particip(2042) group(1664) }
{ activ(1138) subject(705) human(624) }
{ patient(1821) servic(1111) care(1106) }
{ can(981) present(881) function(850) }
{ structur(1116) can(940) graph(676) }
{ cancer(2502) breast(956) screen(824) }
{ use(976) code(926) identifi(902) }
{ use(1733) differ(960) four(931) }
{ result(1111) use(1088) new(759) }
{ implement(1333) system(1263) develop(1122) }
{ survey(1388) particip(1329) question(1065) }
{ decis(3086) make(1611) patient(1517) }
{ method(2212) result(1239) propos(1039) }
{ detect(2391) sensit(1101) algorithm(908) }

Abstract

The reasons for distortions from optimal α-helical geometry are largely unknown, but their influence on structural changes in proteins is significant. Hence, their prediction is a crucial problem in structural bioinformatics. For the particular case of kink prediction, we generated a data set of 132 membrane proteins containing 1014 manually labeled helices and examined the environment of kinks. Our sequence analysis confirms the great relevance of proline and reveals disproportionately high occurrences of glycine and serine at kink positions. The structural analysis shows significantly different mean solvent-accessible surface area values for kinked and nonkinked helices. More importantly, we used this data set to validate string kernels for support vector machines as a new kink prediction method. Applying the new predictor, about 80% of all helices could be correctly predicted as kinked or nonkinked, even when focusing on small helical fragments. The results exceed recently reported accuracies of alternative approaches and are a consequence of both the method and the data set.
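
The abstract names string kernels for support vector machines but does not say which kernel was used. As a minimal sketch, assuming a k-mer spectrum kernel (one common string kernel, chosen here only for illustration), the similarity between two helix fragments can be computed from their shared substrings. The function names, the value of `k`, and the example fragments below are all hypothetical and are not taken from the paper.

```python
from collections import Counter
from math import sqrt


def kmer_counts(seq, k):
    """Count every overlapping substring of length k in seq."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def spectrum_kernel(a, b, k=2):
    """Inner product of the k-mer count vectors of two sequences."""
    ca, cb = kmer_counts(a, k), kmer_counts(b, k)
    # Counter returns 0 for k-mers that are absent from b.
    return sum(n * cb[kmer] for kmer, n in ca.items())


def normalized_kernel(a, b, k=2):
    """Cosine-normalized spectrum kernel, so that K(x, x) is 1."""
    denom = sqrt(spectrum_kernel(a, a, k) * spectrum_kernel(b, b, k))
    return spectrum_kernel(a, b, k) / denom if denom else 0.0


def gram_matrix(seqs, k=2):
    """Symmetric kernel matrix over a list of sequences."""
    return [[normalized_kernel(s, t, k) for t in seqs] for s in seqs]


# Hypothetical helix fragments (one-letter amino acid codes), for illustration only.
fragments = ["LLAVPGLLA", "LLAVAGLLA", "GSPGSPGSP"]
K = gram_matrix(fragments, k=2)
```

A matrix like `K` could then be passed to any SVM implementation that accepts a precomputed kernel (for example, scikit-learn's `SVC(kernel="precomputed")`), together with kinked/nonkinked labels for each fragment.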

Cleaned Abstract

reason distort optim ahel geometri wide unknown influenc structur chang protein signific henc predict crucial problem structur bioinformat particular case kink predict generat data set membran protein contain manual label helic examin environ kink sequenc analysi confirm great relev prolin reveal disproportion high occurr glycin serin kink posit structur analysi show signific differ solvent access surfac area mean valu kink nonkink helic import use data set valid string kernel support vector machin new kink predict method appli new predictor helic correct predict kink nonkink even focus small helic fragment result exceed recent report accuraci altern approach consequ method data set

Similar Abstracts

Comput Methods Programs Biomed - Discriminating protein structure classes by incorporating Pseudo Average Chemical Shift to Chou's general PseAAC and Support Vector Machine. ( 0,722653798198433 )
Artif Intell Med - Missing data imputation using statistical and machine learning methods in a real breast cancer problem. ( 0,64258043795755 )
J Biomed Inform - Quantifying the determinants of outbreak detection performance through simulation and machine learning. ( 0,627737172797235 )
J Biomed Inform - A kinetic model-based algorithm to classify NGS short reads by their allele origin. ( 0,626558666267976 )
J. Comput. Biol. - Enhancing Gibbs sampling method for motif finding in DNA with initial graph representation of sequences. ( 0,618611249569281 )
Brief. Bioinformatics - Ultrafast clustering algorithms for metagenomic sequence analysis. ( 0,614235870472254 )
J. Comput. Biol. - Evaluating, comparing, and interpreting protein domain hierarchies. ( 0,612012502142175 )
J. Comput. Biol. - EDAR: an efficient error detection and removal algorithm for next generation sequencing data. ( 0,606341640000873 )
J. Comput. Biol. - Inconsistent Denoising and Clustering Algorithms for Amplicon Sequence Data. ( 0,604547555019941 )
Comput. Biol. Med. - Prediction of protein functions based on function-function correlation relations. ( 0,601351660719618 )
J Chem Inf Model - Comparison of combinatorial clustering methods on pharmacological data sets represented by machine learning-selected real molecular descriptors. ( 0,59474868744644 )
IEEE Trans Image Process - Subspaces indexing model on Grassmann manifold for image search. ( 0,593592930373176 )
J Biomed Inform - Learning Bayesian networks from survival data using weighting censored instances. ( 0,593142204067017 )
Comput. Biol. Med. - StackTIS: a stacked generalization approach for effective prediction of translation initiation sites. ( 0,589692416219389 )
Comput. Biol. Med. - A learning method for the class imbalance problem with medical data sets. ( 0,589249286136051 )
Comput Methods Programs Biomed - Improvements on a privacy-protection algorithm for DNA sequences with generalization lattices. ( 0,586132143807661 )
J Chem Inf Model - Cavities tell more than sequences: exploring functional relationships of proteases via binding pockets. ( 0,583103502257604 )
J Am Med Inform Assoc - Evaluating the utility of syndromic surveillance algorithms for screening to detect potentially clonal hospital infection outbreaks. ( 0,579711881923169 )
Comput Biol Chem - A novel empirical mutual information approach to identify co-evolving amino acid positions of influenza A viruses. ( 0,577784122543497 )
AMIA Annu Symp Proc - Survival prediction and treatment recommendation with Bayesian techniques in lung cancer. ( 0,576131870063372 )
J Integr Bioinform - BacillusRegNet: a transcriptional regulation database and analysis platform for Bacillus species. ( 0,573804970297504 )
J Chem Inf Model - Building a knowledge-based statistical potential by capturing high-order inter-residue interactions and its applications in protein secondary structure assessment. ( 0,572888521751597 )
Brief. Bioinformatics - Benchmarking of viral haplotype reconstruction programmes: an overview of the capacities and limitations of currently available programmes. ( 0,572613179937423 )
J Biomed Inform - Statistical process control for validating a classification tree model for predicting mortality--a novel approach towards temporal validation. ( 0,571572075924662 )
J Integr Bioinform - Parallel Niche Pareto AlineaGA--an evolutionary multiobjective approach on multiple sequence alignment. ( 0,569715038501174 )
J Chem Inf Model - Benchmark data sets for structure-based computational target prediction. ( 0,565999237670766 )
Comput Biol Chem - PPM-Dom: a novel method for domain position prediction. ( 0,565006454850733 )
IEEE Trans Pattern Anal Mach Intell - Semi-Supervised Kernel Mean Shift Clustering. ( 0,564005918346742 )
J Am Med Inform Assoc - Predicting complications of percutaneous coronary intervention using a novel support vector method. ( 0,558372110520819 )
Comput Biol Chem - Support vector machine with a Pearson VII function kernel for discriminating halophilic and non-halophilic proteins. ( 0,554369391454518 )
J Chem Inf Model - Pragmatic approaches to using computational methods to predict xenobiotic metabolism. ( 0,553831107671871 )
J Chem Inf Model - Metabolism site prediction based on xenobiotic structural formulas and PASS prediction algorithm. ( 0,551102583470616 )
AMIA Annu Symp Proc - Using hierarchical mixture of experts model for fusion of outbreak detection methods. ( 0,550286508249308 )
BMC Med Inform Decis Mak - Efficient algorithms for fast integration on large data sets from multiple sources. ( 0,549573298213955 )
J. Comput. Biol. - Detection of structural variants involving repetitive regions in the reference genome. ( 0,54681548513073 )
Artif Intell Med - Machine learning of clinical performance in a pancreatic cancer database. ( 0,546566750963954 )
Med Biol Eng Comput - The influence of alignment-free sequence representations on the semi-supervised classification of class C G protein-coupled receptors: semi-supervised classification of class C GPCRs. ( 0,54466815246413 )
J. Comput. Biol. - The irredundant class method for remote homology detection of protein sequences. ( 0,543467161587923 )
Artif Intell Med - A classifier ensemble approach for the missing feature problem. ( 0,543327934805041 )
J Chem Inf Model - Prospects for tertiary structure prediction of RNA based on secondary structure information. ( 0,543050424669867 )
Sci Data - Genomes of diverse isolates of the marine cyanobacterium Prochlorococcus. ( 0,542968577621693 )
Comput. Biol. Med. - A methodology to identify consensus classes from clustering algorithms applied to immunohistochemical data from breast cancer patients. ( 0,542888970697317 )
Artif Intell Med - Improved modeling of clinical data with kernel methods. ( 0,542632941572107 )
J. Comput. Biol. - Determining a singleton attractor of a boolean network with nested canalyzing functions. ( 0,5426282546612 )
Comput. Biol. Med. - ExonSuite: algorithmically optimizing alternative gene splicing for the PUF proteins. ( 0,542152849720504 )
J. Comput. Biol. - Optimization of profile-to-profile alignment parameters for one-dimensional threading. ( 0,541586897150376 )
Artif Intell Med - Weighted spherical 1-mean with phase shift and its application in electrocardiogram discord detection. ( 0,539383742101318 )
Comput Math Methods Med - Correlation kernels for support vector machines classification with applications in cancer data. ( 0,538733051832525 )
J Chem Inf Model - Improved helix and kink characterization in membrane proteins allows evaluation of kink sequence predictors. ( 0,538593481214042 )
J Am Med Inform Assoc - Supervised embedding of textual predictors with applications in clinical diagnostics for pediatric cardiology. ( 0,538326165410823 )
Comput Methods Programs Biomed - OLYMPUS: an automated hybrid clustering method in time series gene expression. Case study: host response after Influenza A (H1N1) infection. ( 0,537767851841118 )
Artif Intell Med - Prediction of human major histocompatibility complex class II binding peptides by continuous kernel discrimination method. ( 0,5367735097313 )
Comput. Biol. Med. - Signal peptide discrimination and cleavage site identification using SVM and NN. ( 0,53612616652826 )
Int J Health Geogr - Assessing the effects of variables and background selection on the capture of the tick climate niche. ( 0,53612178570017 )
Comput Math Methods Med - A robust rerank approach for feature selection and its application to pooling-based GWA studies. ( 0,534817833966141 )
Neural Comput - Feature selection for ordinal text classification. ( 0,534153135228007 )
J Chem Inf Model - Are bigger data sets better for machine learning? Fusing single-point and dual-event dose response data for Mycobacterium tuberculosis. ( 0,532642592650757 )
J. Comput. Biol. - Nonparametric combinatorial sequence models. ( 0,531874378172848 )
Int J Health Geogr - Detecting activity locations from raw GPS data: a novel kernel-based algorithm. ( 0,531356932078385 )
J Am Med Inform Assoc - HUGO: Hierarchical mUlti-reference Genome cOmpression for aligned reads. ( 0,531306320559151 )
Int J Health Geogr - Detection of clusters of a rare disease over a large territory: performance of cluster detection methods. ( 0,530663619207328 )
J Chem Inf Model - Investigation of the use of spectral clustering for the analysis of molecular data. ( 0,530057914145533 )
J. Med. Internet Res. - Security analysis and improvements to the PsychoPass method. ( 0,53003996277571 )
Comput. Biol. Med. - New layers in understanding and predicting α-linolenic acid content in plants using amino acid characteristics of omega-3 fatty acid desaturase. ( 0,529753800689678 )
Comput Biol Chem - piClust: a density based piRNA clustering algorithm. ( 0,529744364992076 )
Comput Math Methods Med - Novel harmonic regularization approach for variable selection in Cox's proportional hazards model. ( 0,529483857526834 )
IEEE Trans Neural Netw Learn Syst - Learning Stable Multilevel Dictionaries for Sparse Representations. ( 0,52926442138244 )
Int J Health Geogr - Identifying malaria vector breeding habitats with remote sensing data and terrain-based landscape indices in Zambia. ( 0,52784209326501 )
J. Comput. Biol. - Simultaneous alignment and folding of protein sequences. ( 0,527708176940121 )
J. Comput. Biol. - A theoretical model for whole genome alignment. ( 0,527055855417406 )
Comput. Biol. Med. - In silico identification of Gram-negative bacterial secreted proteins from primary sequence. ( 0,526438274366287 )
Brief. Bioinformatics - Alpha shape and Delaunay triangulation in studies of protein-related interactions. ( 0,526409603981551 )
J Chem Inf Model - Deep architectures and deep learning in chemoinformatics: the prediction of aqueous solubility for drug-like molecules. ( 0,525840253738803 )
J Am Med Inform Assoc - Structural models used in real-time biosurveillance outbreak detection and outbreak curve isolation from noisy background morbidity levels. ( 0,525338845380928 )
J Chem Inf Model - Training based on ligand efficiency improves prediction of bioactivities of ligands and drug target proteins in a machine learning approach. ( 0,524677550011032 )
J Chem Inf Model - New benchmark for chemical nomenclature software. ( 0,524210538747619 )
Int J Health Geogr - Detection of arbitrarily-shaped clusters using a neighbor-expanding approach: a case study on murine typhus in south Texas. ( 0,523425748640888 )
IEEE Trans Pattern Anal Mach Intell - A Link-Based Approach to the Cluster Ensemble Problem. ( 0,52324457863756 )
Comput. Biol. Med. - Keratin protein property based classification of mammals and non-mammals using machine learning techniques. ( 0,522252330413999 )
J Chem Inf Model - Toward a better pharmacophore description of P-glycoprotein modulators, based on macrocyclic diterpenes from Euphorbia species. ( 0,522155955607054 )
Med Decis Making - The Impact of Oversampling with SMOTE on the Performance of 3 Classifiers in Prediction of Type 2 Diabetes. ( 0,521539068558712 )
Comput Math Methods Med - Comparison of semiparametric, parametric, and nonparametric ROC analysis for continuous diagnostic tests using a simulation study and acute coronary syndrome data. ( 0,521102241789146 )
Comput Biol Chem - An efficient similarity search based on indexing in large DNA databases. ( 0,519584498535331 )
Int J Med Robot - Coordinated control and experimentation of the dental arch generator of the tooth-arrangement robot. ( 0,519302621014005 )
Comput Methods Programs Biomed - Classifier ensemble construction with rotation forest to improve medical diagnosis performance of machine learning algorithms. ( 0,518731365403863 )
Artif Intell Med - Vicinal support vector classifier using supervised kernel-based clustering. ( 0,518049336267782 )
J Med Syst - Application of attribute weighting method based on clustering centers to discrimination of linearly non-separable medical datasets. ( 0,516981826945196 )
Comput. Biol. Med. - An evolutionary approach for searching metabolic pathways. ( 0,51682111225463 )
Comput. Biol. Med. - Application of 2D graphic representation of protein sequence based on Huffman tree method. ( 0,51606303502724 )
Comput. Biol. Med. - An insight into the molecular basis for convergent evolution in fish antifreeze Proteins. ( 0,515023070369645 )
Brief. Bioinformatics - Challenges of sequencing human genomes. ( 0,514952152000397 )
Curr Protoc Bioinformatics - Using the RNAstructure Software Package to Predict Conserved RNA Structures. ( 0,514771062490056 )
Int J Health Geogr - A binary-based approach for detecting irregularly shaped clusters. ( 0,514708340810089 )
Comput Methods Programs Biomed - Single stage and multistage classification models for the prediction of liver fibrosis degree in patients with chronic hepatitis C infection. ( 0,514543587385251 )
Med Decis Making - Developing appropriate methods for cost-effectiveness analysis of cluster randomized trials. ( 0,513437612916305 )
Brief. Bioinformatics - Phylogenetic-based propagation of functional annotations within the Gene Ontology consortium. ( 0,513398748304262 )
J Chem Inf Model - Kink characterization and modeling in transmembrane protein structures. ( 0,51186674646519 )
Sci Data - Long-read, whole-genome shotgun sequence data for five model organisms. ( 0,511765056059556 )
Comput Biol Chem - Relationship between global structural parameters and Enzyme Commission hierarchy: implications for function prediction. ( 0,511716875914359 )
Brief. Bioinformatics - A survey on prediction of specificity-determining sites in proteins. ( 0,510792247928426 )