J Chem Inf Model - Dihedral-based segment identification and classification of biopolymers II: polynucleotides.

Topics

{ sequenc(1873) structur(1644) protein(1328) }
{ featur(3375) classif(2383) classifi(1994) }
{ imag(2675) segment(2577) method(1081) }
{ result(1111) use(1088) new(759) }
{ framework(1458) process(801) describ(734) }
{ method(1969) cluster(1462) data(1082) }
{ research(1218) medic(880) student(794) }
{ use(976) code(926) identifi(902) }
{ can(774) often(719) complex(702) }
{ imag(1057) registr(996) error(939) }
{ motion(1329) object(1292) video(1091) }
{ care(1570) inform(1187) nurs(1089) }
{ data(3963) clinic(1234) research(1004) }
{ first(2504) two(1366) second(1323) }
{ inform(2794) health(2639) internet(1427) }
{ method(1219) similar(1157) match(930) }
{ take(945) account(800) differ(722) }
{ assess(1506) score(1403) qualiti(1306) }
{ case(1353) use(1143) diagnosi(1136) }
{ howev(809) still(633) remain(590) }
{ research(1085) discuss(1038) issu(1018) }
{ system(1050) medic(1026) inform(1018) }
{ import(1318) role(1303) understand(862) }
{ visual(1396) interact(850) tool(830) }
{ perform(1367) use(1326) method(1137) }
{ model(2656) set(1616) predict(1553) }
{ data(2317) use(1299) case(1017) }
{ group(2977) signific(1463) compar(1072) }
{ activ(1138) subject(705) human(624) }
{ time(1939) patient(1703) rate(768) }
{ use(1733) differ(960) four(931) }
{ process(1125) use(805) approach(778) }
{ detect(2391) sensit(1101) algorithm(908) }
{ model(3404) distribut(989) bayesian(671) }
{ imag(1947) propos(1133) code(1026) }
{ data(1737) use(1416) pattern(1282) }
{ system(1976) rule(880) can(841) }
{ measur(2081) correl(1212) valu(896) }
{ bind(1733) structur(1185) ligand(1036) }
{ imag(2830) propos(1344) filter(1198) }
{ network(2748) neural(1063) input(814) }
{ patient(2315) diseas(1263) diabet(1191) }
{ studi(2440) review(1878) systemat(933) }
{ treatment(1704) effect(941) patient(846) }
{ surgeri(1148) surgic(1085) robot(1054) }
{ problem(2511) optim(1539) algorithm(950) }
{ error(1145) method(1030) estim(1020) }
{ chang(1828) time(1643) increas(1301) }
{ learn(2355) train(1041) set(1003) }
{ concept(1167) ontolog(924) domain(897) }
{ clinic(1479) use(1117) guidelin(835) }
{ algorithm(1844) comput(1787) effici(935) }
{ extract(1171) text(1153) clinic(932) }
{ method(1557) propos(1049) approach(1037) }
{ data(1714) softwar(1251) tool(1186) }
{ design(1359) user(1324) use(1319) }
{ control(1307) perform(991) simul(935) }
{ model(2220) cell(1177) simul(1124) }
{ general(901) number(790) one(736) }
{ method(984) reconstruct(947) comput(926) }
{ search(2224) databas(1162) retriev(909) }
{ featur(1941) imag(1645) propos(1176) }
{ studi(1410) differ(1259) use(1210) }
{ risk(3053) factor(974) diseas(938) }
{ perform(999) metric(946) measur(919) }
{ model(2341) predict(2261) use(1141) }
{ compound(1573) activ(1297) structur(1058) }
{ studi(1119) effect(1106) posit(819) }
{ blood(1257) pressur(1144) flow(957) }
{ spatial(1525) area(1432) region(1030) }
{ record(1888) medic(1808) patient(1693) }
{ health(3367) inform(1360) care(1135) }
{ model(3480) simul(1196) paramet(876) }
{ monitor(1329) mobil(1314) devic(1160) }
{ ehr(2073) health(1662) electron(1139) }
{ state(1844) use(1261) util(961) }
{ patient(2837) hospit(1953) medic(668) }
{ age(1611) year(1155) adult(843) }
{ medic(1828) order(1363) alert(1069) }
{ signal(2180) analysi(812) frequenc(800) }
{ cost(1906) reduc(1198) effect(832) }
{ sampl(1606) size(1419) use(1276) }
{ gene(2352) biolog(1181) express(1162) }
{ data(3008) multipl(1320) sourc(1022) }
{ intervent(3218) particip(2042) group(1664) }
{ patient(1821) servic(1111) care(1106) }
{ use(2086) technolog(871) perceiv(783) }
{ can(981) present(881) function(850) }
{ analysi(2126) use(1163) compon(1037) }
{ health(1844) social(1437) communiti(874) }
{ structur(1116) can(940) graph(676) }
{ high(1669) rate(1365) level(1280) }
{ cancer(2502) breast(956) screen(824) }
{ drug(1928) target(777) effect(648) }
{ implement(1333) system(1263) develop(1122) }
{ survey(1388) particip(1329) question(1065) }
{ estim(2440) model(1874) function(577) }
{ decis(3086) make(1611) patient(1517) }
{ activ(1452) weight(1219) physic(1104) }
{ method(2212) result(1239) propos(1039) }

Abstract

In an accompanying paper (Nagy, G.; Oostenbrink, C. Dihedral-based segment identification and classification of biopolymers I: Proteins. J. Chem. Inf. Model. 2013, DOI: 10.1021/ci400541d), we introduce a new algorithm for structure classification of biopolymeric structures based on main-chain dihedral angles. The DISICL algorithm (short for DIhedral-based Segment Identification and CLassification) classifies segments of structures containing two central residues. Here, we introduce the DISICL library for polynucleotides, which is based on the dihedral angles ε, ζ, and χ for the two central residues of a three-nucleotide segment of a single strand. Seventeen distinct structural classes are defined for nucleotide structures, some of which, to our knowledge, were not described previously in other structure classification algorithms. In particular, DISICL also classifies noncanonical single-stranded structural elements. DISICL is applied to databases of DNA and RNA structures containing 80,000 and 180,000 segments, respectively. The classifications according to DISICL are compared to those of another popular classification scheme in terms of the amount of classified nucleotides, average occurrence and length of structural elements, and pairwise matches of the classifications. While the detailed classification of DISICL adds sensitivity to a structure analysis, it can be readily reduced to eight simplified classes providing a more general overview of the secondary structure in polynucleotides.
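
The abstract describes classification driven by dihedral angles of the central residues. As a minimal illustrative sketch (not the DISICL implementation), the Python below computes a signed dihedral angle from four atom positions and tests whether an angle falls inside a periodic interval, which is the basic operation any region-based dihedral classifier needs. The function names `dihedral` and `in_periodic_range` and the example coordinates are assumptions made for illustration; the actual DISICL region boundaries and class definitions are not given here and are not reproduced.

```python
import math


def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a, b):
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def dihedral(p0, p1, p2, p3):
    """Signed dihedral angle in degrees for four 3D points.

    Positive when, looking along the p1 -> p2 bond, the far bond is
    rotated clockwise relative to the near bond.
    """
    b0 = _sub(p1, p0)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    n1 = _cross(b0, b1)  # normal of the plane through p0, p1, p2
    n2 = _cross(b1, b2)  # normal of the plane through p1, p2, p3
    b1_len = math.sqrt(_dot(b1, b1))
    b1_hat = tuple(c / b1_len for c in b1)
    m1 = _cross(b1_hat, n1)
    return math.degrees(math.atan2(_dot(m1, n2), _dot(n1, n2)))


def in_periodic_range(angle, lo, hi):
    """True if angle lies in [lo, hi] on the circle (-180, 180].

    If lo > hi, the interval is taken to wrap through +/-180 degrees.
    """
    if lo <= hi:
        return lo <= angle <= hi
    return angle >= lo or angle <= hi


# Hypothetical coordinates, chosen only to exercise the functions.
angle = dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1))
print(round(angle, 1))
print(in_periodic_range(170.0, 150.0, -150.0))
```

In a classifier of this kind, each residue's angles would be tested against a set of such intervals, and the pair of region labels for the two central residues would then be mapped to a structural class; those mappings are specific to the published library and are left out of this sketch.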

Cleaned Abstract

accompani paper nagi g oostenbrink c dihedralbas segment identif classif biopolym protein j chem inf model doi cid introduc new algorithm structur classif biopolymer structur base mainchain dihedr angl disicl algorithm short dihedralbas segment identif classif classifi segment structur contain two central residu introduc disicl librari polynucleotid base dihedr angl e two central residu threenucleotid segment singl strand seventeen distinct structur class defin nucleotid structur whichto knowledgewer describ previous structur classif algorithm particular disicl also classifi noncanon singlestrand structur element disicl appli databas dna rna structur contain segment respect classif accord disicl compar anoth popular classif scheme term amount classifi nucleotid averag occurr length structur element pairwis match classif detail classif disicl add sensit structur analysi can readili reduc eight simplifi class provid general overview secondari structur polynucleotid

Similar Abstracts

J Integr Bioinform - A hierarchical approach to protein fold prediction. ( 0,890608120476409 )
Comput. Biol. Med. - Predicting protein-binding RNA nucleotides using the feature-based removal of data redundancy and the interaction propensity of nucleotide triplets. ( 0,865017152836157 )
J Chem Inf Model - Context-based features enhance protein secondary structure prediction accuracy. ( 0,862013754975078 )
Comput Biol Chem - Identification and characterization of lysine-methylated sites on histones and non-histone proteins. ( 0,825250376322241 )
Comput. Biol. Med. - Signal peptide discrimination and cleavage site identification using SVM and NN. ( 0,824560307538876 )
J Chem Inf Model - Protein secondary structure prediction with SPARROW. ( 0,806292730908374 )
Comput Biol Chem - Inferring biological basis about psychrophilicity by interpreting the rules generated from the correctly classified input instances by a classifier. ( 0,794573889103376 )
J Chem Inf Model - Dihedral-based segment identification and classification of biopolymers I: proteins. ( 0,793719789677443 )
Comput Biol Chem - Characterizing regions in the human genome unmappable by next-generation-sequencing at the read length of 1000 bases. ( 0,791757057909151 )
Comput. Biol. Med. - Remote protein homology detection and fold recognition using two-layer support vector machine classifiers. ( 0,78734240538548 )
Comput Biol Chem - Statistical analysis and exposure status classification of transmembrane beta barrel residues. ( 0,784236665669676 )
J. Comput. Biol. - Evaluating, comparing, and interpreting protein domain hierarchies. ( 0,77575900220381 )
Comput Methods Programs Biomed - Protein secondary structure prediction using modular reciprocal bidirectional recurrent neural networks. ( 0,767800489067059 )
Comput Biol Chem - Bacterial protein structures reveal phylum dependent divergence. ( 0,76485307053554 )
Comput Biol Chem - A protein fold classifier formed by fusing different modes of pseudo amino acid composition via PSSM. ( 0,761150090081728 )
Comput Biol Chem - ProSTRIP: A method to find similar structural repeats in three-dimensional protein structures. ( 0,759761023864173 )
Comput Biol Chem - Analysis of sequence repeats of proteins in the PDB. ( 0,756258294609054 )
Comput. Biol. Med. - Improving protein secondary structure prediction using a multi-modal BP method. ( 0,752827109995209 )
Brief. Bioinformatics - De novo assembly of short sequence reads. ( 0,751495766387635 )
Comput Methods Programs Biomed - Sequence-based prediction of protein-binding sites in DNA: comparative study of two SVM models. ( 0,75137715974801 )
Comput. Biol. Med. - Intron identification approaches based on weighted features and fuzzy decision trees. ( 0,751224190750912 )
Comput. Biol. Med. - Remote homology detection incorporating the context of physicochemical properties. ( 0,751203313012201 )
BMC Med Inform Decis Mak - Efficient protein structure search using indexing methods. ( 0,749798436387445 )
Comput Math Methods Med - Naïve Bayes classifier with feature selection to identify phage virion proteins. ( 0,748631252523339 )
J Chem Inf Model - Protein secondary structure classification revisited: processing DSSP information with PSSC. ( 0,74676138666103 )
Comput. Biol. Med. - New layers in understanding and predicting α-linolenic acid content in plants using amino acid characteristics of omega-3 fatty acid desaturase. ( 0,74450487705376 )
Comput Biol Chem - Protein fold recognition based on functional domain composition. ( 0,743631324744325 )
Sci Data - Genomes of diverse isolates of the marine cyanobacterium Prochlorococcus. ( 0,743289873029601 )
Comput Biol Chem - The frequency of poly(G) tracts in the human genome and their use as a sensor of DNA damage. ( 0,741058257568647 )
Curr Protoc Bioinformatics - Using the RNAstructure Software Package to Predict Conserved RNA Structures. ( 0,738787112957462 )
Comput Math Methods Med - Quad-PRE: a hybrid method to predict protein quaternary structure attributes. ( 0,736674547989029 )
Comput. Biol. Med. - An insight into the molecular basis for convergent evolution in fish antifreeze Proteins. ( 0,736481006477266 )
J. Comput. Biol. - Simultaneous alignment and folding of protein sequences. ( 0,734984697578088 )
J Chem Inf Model - Tertiary structure prediction of RNA-RNA complexes using a secondary structure and fragment-based method. ( 0,733235620397576 )
Comput Biol Chem - Identification of putative and potential cross-reactive chickpea (Cicer arietinum) allergens through an in silico approach. ( 0,73179154884027 )
J Chem Inf Model - Protein structural statistics with PSS. ( 0,731353760241345 )
J. Comput. Biol. - Efficient traversal of beta-sheet protein folding pathways using ensemble models. ( 0,729568344432346 )
Comput Biol Chem - ProCoCoA: A quantitative approach for analyzing protein core composition. ( 0,728575080725082 )
J Chem Inf Model - ProBiS-database: precalculated binding site similarities and local pairwise alignments of PDB structures. ( 0,72833703547722 )
J Chem Inf Model - Parallel and antiparallel β-strands differ in amino acid composition and availability of short constituent sequences. ( 0,727916338994992 )
Comput Biol Chem - Human-chimpanzee alignment: ortholog exponentials and paralog power laws. ( 0,726037284321007 )
Brief. Bioinformatics - New developments of alignment-free sequence comparison: measures, statistics and next-generation sequencing. ( 0,724791403095231 )
Sci Data - Long-read, whole-genome shotgun sequence data for five model organisms. ( 0,722657529801835 )
Comput Biol Chem - Support vector machine with a Pearson VII function kernel for discriminating halophilic and non-halophilic proteins. ( 0,718388268923099 )
Brief. Bioinformatics - Taxonomic binning of metagenome samples generated by next-generation sequencing technologies. ( 0,716192437062631 )
J Chem Inf Model - Improved helix and kink characterization in membrane proteins allows evaluation of kink sequence predictors. ( 0,715782950248987 )
Med Biol Eng Comput - The influence of alignment-free sequence representations on the semi-supervised classification of class C G protein-coupled receptors: semi-supervised classification of class C GPCRs. ( 0,712883866309977 )
Comput Math Methods Med - Identification of DNA-binding proteins using support vector machine with sequence information. ( 0,711642689780734 )
Comput Biol Chem - The challenge of annotating protein sequences: The tale of eight domains of unknown function in Pfam. ( 0,710930150563242 )
J Chem Inf Model - Modules identification in protein structures: the topological and geometrical solutions. ( 0,70819963826162 )
Comput Methods Programs Biomed - Discriminating protein structure classes by incorporating Pseudo Average Chemical Shift to Chou's general PseAAC and Support Vector Machine. ( 0,707121596185224 )
J Chem Inf Model - Comparative analysis of threshold and tessellation methods for determining protein contacts. ( 0,703243403892564 )
J. Comput. Biol. - ComB: SNP calling and mapping analysis for color and nucleotide space platforms. ( 0,700688396553318 )
Brief. Bioinformatics - Systematic identification of Class I HDAC substrates. ( 0,700091599193846 )
Brief. Bioinformatics - Phylogenetic-based propagation of functional annotations within the Gene Ontology consortium. ( 0,699114523627976 )
Comput Biol Chem - Computational insight into nitration of human myoglobin. ( 0,698785250526135 )
Comput Biol Chem - A local average connectivity-based method for identifying essential proteins from the network level. ( 0,695746897240599 )
Comput Math Methods Med - DV-curve representation of protein sequences and its application. ( 0,695375321885868 )
Comput. Biol. Med. - miRClassify: an advanced web server for miRNA family classification and annotation. ( 0,695172205033182 )
Brief. Bioinformatics - DRISEE overestimates errors in metagenomic sequencing data. ( 0,694027695293066 )
Comput. Biol. Med. - Improving protein complex classification accuracy using amino acid composition profile. ( 0,693180709884642 )
J Chem Inf Model - Building a knowledge-based statistical potential by capturing high-order inter-residue interactions and its applications in protein secondary structure assessment. ( 0,692289639033207 )
Comput. Biol. Med. - Application of 2D graphic representation of protein sequence based on Huffman tree method. ( 0,691984459123574 )
J. Comput. Biol. - Combinatorics of γ-structures. ( 0,69068485269579 )
J Integr Bioinform - Predicting genes involved in human cancer using network contextual information. ( 0,690375757964093 )
J. Comput. Biol. - Mapping reads on a genomic sequence: an algorithmic overview and a practical comparative analysis. ( 0,69016203624404 )
Comput Biol Chem - Prediction of protein modification sites of gamma-carboxylation using position specific scoring matrices based evolutionary information. ( 0,689555891916652 )
J. Comput. Biol. - Ray: simultaneous assembly of reads from a mix of high-throughput sequencing technologies. ( 0,68783430305211 )
J. Comput. Biol. - A novel technique for detecting putative horizontal gene transfer in the sequence space. ( 0,687126501924932 )
Comput Math Methods Med - Uses of phage display in agriculture: sequence analysis and comparative modeling of late embryogenesis abundant client proteins suggest protein-nucleic acid binding functionality. ( 0,684380802394256 )
J Biomed Inform - A similarity network approach for the analysis and comparison of protein sequence/structure sets. ( 0,682284639131115 )
Brief. Bioinformatics - Base-calling for next-generation sequencing platforms. ( 0,680251149273737 )
Comput Biol Chem - Multi-nucleation and vectorial folding pathways of large helix protein. ( 0,679691986677274 )
J. Comput. Biol. - Nonparametric combinatorial sequence models. ( 0,679101903908103 )
Brief. Bioinformatics - Ortholog identification in the presence of domain architecture rearrangement. ( 0,678152915056213 )
Comput. Biol. Med. - Keratin protein property based classification of mammals and non-mammals using machine learning techniques. ( 0,677640590554544 )
J. Comput. Biol. - Optimization of profile-to-profile alignment parameters for one-dimensional threading. ( 0,676891831499832 )
Curr Protoc Bioinformatics - Comparative Protein Structure Modeling Using MODELLER. ( 0,676524850114841 )
Comput Biol Chem - Computational determination of the orientation of a heat repeat-like domain of DNA-PKcs. ( 0,676340830492071 )
Comput. Biol. Med. - HIV-1 CRF01_AE coreceptor usage prediction using kernel methods based logistic model trees. ( 0,676034463363007 )
Brief. Bioinformatics - A practical guide for the computational selection of residues to be experimentally characterized in protein families. ( 0,673437986192614 )
J. Comput. Biol. - Statistical significance of threading scores. ( 0,673263656803448 )
Comput Math Methods Med - Identification of antioxidants from sequence information using naïve Bayes. ( 0,67255559381371 )
J Integr Bioinform - Predicting protein distance maps according to physicochemical properties. ( 0,671417517519322 )
J. Comput. Biol. - Computational techniques for human genome resequencing using mated gapped reads. ( 0,670951749184377 )
Comput. Biol. Med. - A content and structural assessment of oxidative motifs across a diverse set of life forms. ( 0,667520531218269 )
Brief. Bioinformatics - BamView: visualizing and interpretation of next-generation sequencing read alignments. ( 0,666106411444457 )
Comput Biol Chem - Identical sequence patterns in the ends of exons and introns of human protein-coding genes. ( 0,666031464544154 )
Comput. Biol. Med. - Gene comparison based on the repetition of single-nucleotide structure patterns. ( 0,663797058562379 )
Comput Biol Chem - A novel empirical mutual information approach to identify co-evolving amino acid positions of influenza A viruses. ( 0,66210183324834 )
Comput. Biol. Med. - Structural alphabet motif discovery and a structural motif database. ( 0,661141972669292 )
Sci Data - Comprehensive analysis of the venom gland transcriptome of the spider Dolomedes fimbriatus. ( 0,661096592342736 )
J. Comput. Biol. - IDBA-MTP: A Hybrid Metatranscriptomic Assembler Based on Protein Information. ( 0,659644143451538 )
J. Comput. Biol. - Sequence alignment of viral channel proteins with cellular ion channels. ( 0,655238919341887 )
Sci Data - A draft genome for the African crocodilian trypanosome Trypanosoma grayi. ( 0,65404010452756 )
J Integr Bioinform - Exceptional single strand DNA word symmetry: analysis of evolutionary potentialities. ( 0,653337251172957 )
Comput Biol Chem - Systematic analysis of an amidase domain CHAP in 12 Staphylococcus aureus genomes and 44 staphylococcal phage genomes. ( 0,653026531378793 )
Comput. Biol. Med. - Haemophilus influenzae Genome Database (HIGDB): a single point web resource for Haemophilus influenzae. ( 0,652787593557716 )
Comput Biol Chem - newDNA-Prot: Prediction of DNA-binding proteins by employing support vector machine and a comprehensive sequence representation. ( 0,652709319326142 )
J Biomed Inform - Protein contact map prediction using multi-stage hybrid intelligence inference systems. ( 0,651917040163578 )