Comput Biol Chem - Identical sequence patterns in the ends of exons and introns of human protein-coding genes.

Topics

{ sequenc(1873) structur(1644) protein(1328) }
{ gene(2352) biolog(1181) express(1162) }
{ health(1844) social(1437) communiti(874) }
{ take(945) account(800) differ(722) }
{ howev(809) still(633) remain(590) }
{ first(2504) two(1366) second(1323) }
{ time(1939) patient(1703) rate(768) }
{ studi(2440) review(1878) systemat(933) }
{ chang(1828) time(1643) increas(1301) }
{ featur(1941) imag(1645) propos(1176) }
{ studi(1119) effect(1106) posit(819) }
{ high(1669) rate(1365) level(1280) }
{ drug(1928) target(777) effect(648) }
{ inform(2794) health(2639) internet(1427) }
{ care(1570) inform(1187) nurs(1089) }
{ general(901) number(790) one(736) }
{ perform(999) metric(946) measur(919) }
{ import(1318) role(1303) understand(862) }
{ visual(1396) interact(850) tool(830) }
{ state(1844) use(1261) util(961) }
{ group(2977) signific(1463) compar(1072) }
{ activ(1138) subject(705) human(624) }
{ implement(1333) system(1263) develop(1122) }
{ model(3404) distribut(989) bayesian(671) }
{ can(774) often(719) complex(702) }
{ imag(1947) propos(1133) code(1026) }
{ data(1737) use(1416) pattern(1282) }
{ system(1976) rule(880) can(841) }
{ measur(2081) correl(1212) valu(896) }
{ imag(1057) registr(996) error(939) }
{ bind(1733) structur(1185) ligand(1036) }
{ method(1219) similar(1157) match(930) }
{ featur(3375) classif(2383) classifi(1994) }
{ imag(2830) propos(1344) filter(1198) }
{ network(2748) neural(1063) input(814) }
{ imag(2675) segment(2577) method(1081) }
{ patient(2315) diseas(1263) diabet(1191) }
{ motion(1329) object(1292) video(1091) }
{ assess(1506) score(1403) qualiti(1306) }
{ treatment(1704) effect(941) patient(846) }
{ surgeri(1148) surgic(1085) robot(1054) }
{ framework(1458) process(801) describ(734) }
{ problem(2511) optim(1539) algorithm(950) }
{ error(1145) method(1030) estim(1020) }
{ learn(2355) train(1041) set(1003) }
{ concept(1167) ontolog(924) domain(897) }
{ clinic(1479) use(1117) guidelin(835) }
{ algorithm(1844) comput(1787) effici(935) }
{ extract(1171) text(1153) clinic(932) }
{ method(1557) propos(1049) approach(1037) }
{ data(1714) softwar(1251) tool(1186) }
{ design(1359) user(1324) use(1319) }
{ control(1307) perform(991) simul(935) }
{ model(2220) cell(1177) simul(1124) }
{ method(984) reconstruct(947) comput(926) }
{ search(2224) databas(1162) retriev(909) }
{ case(1353) use(1143) diagnosi(1136) }
{ data(3963) clinic(1234) research(1004) }
{ studi(1410) differ(1259) use(1210) }
{ risk(3053) factor(974) diseas(938) }
{ research(1085) discuss(1038) issu(1018) }
{ system(1050) medic(1026) inform(1018) }
{ model(2341) predict(2261) use(1141) }
{ compound(1573) activ(1297) structur(1058) }
{ perform(1367) use(1326) method(1137) }
{ blood(1257) pressur(1144) flow(957) }
{ spatial(1525) area(1432) region(1030) }
{ record(1888) medic(1808) patient(1693) }
{ health(3367) inform(1360) care(1135) }
{ model(3480) simul(1196) paramet(876) }
{ monitor(1329) mobil(1314) devic(1160) }
{ ehr(2073) health(1662) electron(1139) }
{ research(1218) medic(880) student(794) }
{ patient(2837) hospit(1953) medic(668) }
{ model(2656) set(1616) predict(1553) }
{ data(2317) use(1299) case(1017) }
{ age(1611) year(1155) adult(843) }
{ medic(1828) order(1363) alert(1069) }
{ signal(2180) analysi(812) frequenc(800) }
{ cost(1906) reduc(1198) effect(832) }
{ sampl(1606) size(1419) use(1276) }
{ data(3008) multipl(1320) sourc(1022) }
{ intervent(3218) particip(2042) group(1664) }
{ patient(1821) servic(1111) care(1106) }
{ use(2086) technolog(871) perceiv(783) }
{ can(981) present(881) function(850) }
{ analysi(2126) use(1163) compon(1037) }
{ structur(1116) can(940) graph(676) }
{ cancer(2502) breast(956) screen(824) }
{ use(976) code(926) identifi(902) }
{ use(1733) differ(960) four(931) }
{ result(1111) use(1088) new(759) }
{ survey(1388) particip(1329) question(1065) }
{ estim(2440) model(1874) function(577) }
{ decis(3086) make(1611) patient(1517) }
{ process(1125) use(805) approach(778) }
{ activ(1452) weight(1219) physic(1104) }
{ method(1969) cluster(1462) data(1082) }
{ method(2212) result(1239) propos(1039) }
{ detect(2391) sensit(1101) algorithm(908) }

Abstract

Intron splicing is one of the most important steps in the maturation of a pre-mRNA. Although the sequence profiles around the splice sites have been studied extensively, the levels of sequence identity between the exonic sequences preceding the donor sites and the intronic sequences preceding the acceptor sites have not been examined as thoroughly. In this study we investigated identity patterns between the last 15 nucleotides of the exonic sequence preceding the 5' splice site and of the intronic sequence preceding the 3' splice site in a set of human protein-coding genes that do not exhibit intron retention. We found that almost 60% of consecutive exons and introns in human protein-coding genes share at least two identical nucleotides at their 3' ends and that, on average, the sequence identity length is 2.47 nucleotides. Based on these findings, we conclude that the 3' ends of exons and introns tend to share longer identical sequences when they come from the same gene than when they are taken from different genes. These results hold even when the exon-intron pairs are non-consecutive in transcription order.
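
The comparison described in the abstract can be illustrated with a minimal sketch. It assumes that "sequence identity length" means the number of consecutive matching nucleotides counted backwards from the 3' end of each 15-nucleotide window; the abstract does not spell out the exact definition, so this reading is an assumption. The function name `shared_3prime_length` and the toy sequences are hypothetical and are not taken from the study.

```python
def shared_3prime_length(exon: str, intron: str, window: int = 15) -> int:
    """Count identical nucleotides shared at the 3' ends of two sequences.

    Assumption: identity is measured as the longest common suffix of the
    last `window` nucleotides of each sequence (window must be positive).
    """
    exon_tail = exon[-window:].upper()
    intron_tail = intron[-window:].upper()
    length = 0
    # Walk backwards from the 3' end until the first mismatch.
    for a, b in zip(reversed(exon_tail), reversed(intron_tail)):
        if a != b:
            break
        length += 1
    return length


# Toy sequences invented for illustration (not real gene data).
exon_end = "ATGGCCTTGACCAAG"    # last 15 nt of an exon, before the 5' splice site
intron_end = "TTTCTCTTTTTACAG"  # last 15 nt of an intron, before the 3' splice site
print(shared_3prime_length(exon_end, intron_end))  # prints 2 (shared "AG")
```

Under this reading, the reported average of 2.47 nucleotides would correspond to the mean of this value over all exon-intron pairs considered.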

Cleaned Abstract

intron splice one import step involv matur process premrna although sequenc profil around splice site studi extens level sequenc ident exon sequenc preced donor site intron sequenc preced acceptor site examin thorough studi investig ident pattern last nucleotid exon sequenc preced splice site intron sequenc preced splice site set human proteincod gene exhibit intron retent found almost consecut exon intron human proteincod gene share least two ident nucleotid end averag sequenc ident length nucleotid base find conclud end exon intron tend longer ident sequenc within gene taken differ gene result hold even pair nonconsecut transcript order

Similar Abstracts

Brief. Bioinformatics - Systematic identification of Class I HDAC substrates. ( 0,89689772254997 )
Comput Biol Chem - Large replication skew domains delimit GC-poor gene deserts in human. ( 0,849943317270939 )
Comput Biol Chem - Computational insight into nitration of human myoglobin. ( 0,827912019727577 )
Brief. Bioinformatics - New developments of alignment-free sequence comparison: measures, statistics and next-generation sequencing. ( 0,827517788541639 )
Comput Biol Chem - Multi-nucleation and vectorial folding pathways of large helix protein. ( 0,821140983137458 )
Comput Biol Chem - Bacterial protein structures reveal phylum dependent divergence. ( 0,817993325116909 )
Comput Biol Chem - The frequency of poly(G) tracts in the human genome and their use as a sensor of DNA damage. ( 0,808215758116498 )
Comput Biol Chem - Gene expression regulation of the PF00480 or PF14340 domain proteins suggests their involvement in sulfur metabolism. ( 0,807071080015376 )
Comput Biol Chem - Characterizing regions in the human genome unmappable by next-generation-sequencing at the read length of 1000 bases. ( 0,806575534314731 )
Comput Biol Chem - Protein fold recognition based on functional domain composition. ( 0,806399220147405 )
Brief. Bioinformatics - De novo assembly of short sequence reads. ( 0,806362084089137 )
Comput Biol Chem - ProSTRIP: A method to find similar structural repeats in three-dimensional protein structures. ( 0,805826022751646 )
Brief. Bioinformatics - Taxonomic binning of metagenome samples generated by next-generation sequencing technologies. ( 0,805550174537313 )
Comput Biol Chem - Analysis of sequence repeats of proteins in the PDB. ( 0,804867314763424 )
J Chem Inf Model - Tertiary structure prediction of RNA-RNA complexes using a secondary structure and fragment-based method. ( 0,802393209519041 )
J Chem Inf Model - ProBiS-database: precalculated binding site similarities and local pairwise alignments of PDB structures. ( 0,800475882957055 )
Comput Biol Chem - In silico characterization and evolutionary analyses of CCAAT binding proteins in the lycophyte plant Selaginella moellendorffii genome: a growing comparative genomics resource. ( 0,794242157327822 )
Comput Biol Chem - Identification and characterization of lysine-methylated sites on histones and non-histone proteins. ( 0,792350999297077 )
Comput Biol Chem - Statistical analysis and exposure status classification of transmembrane beta barrel residues. ( 0,78687359853023 )
J Chem Inf Model - Building a knowledge-based statistical potential by capturing high-order inter-residue interactions and its applications in protein secondary structure assessment. ( 0,783220590991585 )
Comput Biol Chem - Identification of potential drug targets by subtractive genome analysis of Bacillus anthracis A0248: An in silico approach. ( 0,783134174393439 )
Comput Biol Chem - Identification of putative and potential cross-reactive chickpea (Cicer arietinum) allergens through an in silico approach. ( 0,777151202498618 )
Comput Biol Chem - Human-chimpanzee alignment: ortholog exponentials and paralog power laws. ( 0,776749826054421 )
J Chem Inf Model - Improved helix and kink characterization in membrane proteins allows evaluation of kink sequence predictors. ( 0,774231780243638 )
J. Comput. Biol. - A probabilistic model of neutral and selective dynamics of protein network evolution. ( 0,773685271705954 )
Comput Biol Chem - Gene cloning, homology comparison and analysis of the main functional structure domains of beta estrogen receptor in Jining Gray goat. ( 0,772272692950225 )
J Chem Inf Model - Protein secondary structure prediction with SPARROW. ( 0,772231707837903 )
Comput. Biol. Med. - An insight into the molecular basis for convergent evolution in fish antifreeze Proteins. ( 0,772143540489313 )
J. Comput. Biol. - IDBA-MTP: A Hybrid Metatranscriptomic Assembler Based on Protein Information. ( 0,769404340250191 )
J. Comput. Biol. - Nonparametric combinatorial sequence models. ( 0,769265363849022 )
BMC Med Inform Decis Mak - Efficient protein structure search using indexing methods. ( 0,768727974337373 )
J Chem Inf Model - Parallel and antiparallel β-strands differ in amino acid composition and availability of short constituent sequences. ( 0,767956386653215 )
Brief. Bioinformatics - Application of second-generation sequencing to cancer genomics. ( 0,767218792386373 )
Comput. Biol. Med. - A content and structural assessment of oxidative motifs across a diverse set of life forms. ( 0,764716812479333 )
Comput Biol Chem - ProCoCoA: A quantitative approach for analyzing protein core composition. ( 0,762594217027772 )
J. Comput. Biol. - Simultaneous alignment and folding of protein sequences. ( 0,762489213155795 )
Comput. Biol. Med. - Improving protein secondary structure prediction using a multi-modal BP method. ( 0,762379005618148 )
J. Comput. Biol. - Evaluating, comparing, and interpreting protein domain hierarchies. ( 0,759824409657006 )
J. Comput. Biol. - Efficient traversal of beta-sheet protein folding pathways using ensemble models. ( 0,759014212007072 )
J. Comput. Biol. - ComB: SNP calling and mapping analysis for color and nucleotide space platforms. ( 0,757144218043311 )
Comput. Biol. Med. - miRClassify: an advanced web server for miRNA family classification and annotation. ( 0,755816273185088 )
Comput Biol Chem - A local average connectivity-based method for identifying essential proteins from the network level. ( 0,754844827845765 )
J Chem Inf Model - Protein secondary structure classification revisited: processing DSSP information with PSSC. ( 0,753277888605174 )
Comput Methods Programs Biomed - Protein secondary structure prediction using modular reciprocal bidirectional recurrent neural networks. ( 0,753041133055184 )
Comput Math Methods Med - DV-curve representation of protein sequences and its application. ( 0,747674516908462 )
J Chem Inf Model - Comparative analysis of threshold and tessellation methods for determining protein contacts. ( 0,747272262943315 )
Sci Data - Genomes of diverse isolates of the marine cyanobacterium Prochlorococcus. ( 0,745551990728404 )
Comput Biol Chem - Genome-wide analysis and evolutionary study of sucrose non-fermenting 1-related protein kinase 2 (SnRK2) gene family members in Arabidopsis and Oryza. ( 0,744286702062211 )
Brief. Bioinformatics - Phylogenetic-based propagation of functional annotations within the Gene Ontology consortium. ( 0,740731088660114 )
J. Comput. Biol. - Learning cellular sorting pathways using protein interactions and sequence motifs. ( 0,739128130390566 )
Comput Math Methods Med - Uses of phage display in agriculture: sequence analysis and comparative modeling of late embryogenesis abundant client proteins suggest protein-nucleic acid binding functionality. ( 0,738794945731556 )
Comput Biol Chem - The challenge of annotating protein sequences: The tale of eight domains of unknown function in Pfam. ( 0,738561420332434 )
J. Comput. Biol. - Mapping reads on a genomic sequence: an algorithmic overview and a practical comparative analysis. ( 0,738307445949288 )
Med Biol Eng Comput - The influence of alignment-free sequence representations on the semi-supervised classification of class C G protein-coupled receptors: semi-supervised classification of class C GPCRs. ( 0,734916042550271 )
J. Comput. Biol. - AREM: aligning short reads from ChIP-sequencing by expectation maximization. ( 0,734060966739799 )
J Chem Inf Model - Modules identification in protein structures: the topological and geometrical solutions. ( 0,729524961842426 )
Sci Data - Comprehensive analysis of the venom gland transcriptome of the spider Dolomedes fimbriatus. ( 0,729170131679923 )
J. Comput. Biol. - Combinatorics of γ-structures. ( 0,72692053283576 )
Brief. Bioinformatics - Sequencing technologies and tools for short tandem repeat variation detection. ( 0,726862639238561 )
J Chem Inf Model - Proteins as sponges: a statistical journey along protein structure organization principles. ( 0,726812397412757 )
Sci Data - A draft genome for the African crocodilian trypanosome Trypanosoma grayi. ( 0,726233137379101 )
Curr Protoc Bioinformatics - Using the RNAstructure Software Package to Predict Conserved RNA Structures. ( 0,723934707166752 )
Sci Data - Long-read, whole-genome shotgun sequence data for five model organisms. ( 0,721709213892982 )
J. Comput. Biol. - Reconstructing the history of large-scale genomic changes: biological questions and computational challenges. ( 0,71915574826818 )
Comput Math Methods Med - Quad-PRE: a hybrid method to predict protein quaternary structure attributes. ( 0,718592698797458 )
Comput Biol Chem - Computational determination of the orientation of a heat repeat-like domain of DNA-PKcs. ( 0,718585494219651 )
Comput Biol Chem - A novel empirical mutual information approach to identify co-evolving amino acid positions of influenza A viruses. ( 0,716580817117514 )
J. Comput. Biol. - Statistical significance of threading scores. ( 0,715831417187356 )
Comput. Biol. Med. - New layers in understanding and predicting α-linolenic acid content in plants using amino acid characteristics of omega-3 fatty acid desaturase. ( 0,714619566427334 )
Comput. Biol. Med. - Remote homology detection incorporating the context of physicochemical properties. ( 0,71251240603066 )
J Integr Bioinform - Exceptional single strand DNA word symmetry: analysis of evolutionary potentialities. ( 0,712310443866744 )
Brief. Bioinformatics - A practical guide for the computational selection of residues to be experimentally characterized in protein families. ( 0,712063454681043 )
J Chem Inf Model - Protein structural statistics with PSS. ( 0,709000855424928 )
Artif Intell Med - Predicting malaria interactome classifications from time-course transcriptomic data along the intraerythrocytic developmental cycle. ( 0,708826513863829 )
J. Comput. Biol. - A novel technique for detecting putative horizontal gene transfer in the sequence space. ( 0,707887246584752 )
J Integr Bioinform - A hierarchical approach to protein fold prediction. ( 0,703467196450813 )
Brief. Bioinformatics - Comparative genomics approach to detecting split-coding regions in a low-coverage genome: lessons from the chimaera Callorhinchus milii (Holocephali, Chondrichthyes). ( 0,70284495075456 )
Brief. Bioinformatics - Ortholog identification in the presence of domain architecture rearrangement. ( 0,700986774374859 )
Comput Methods Programs Biomed - Pinda: a web service for detection and analysis of intraspecies gene duplication events. ( 0,700734389186051 )
Comput. Biol. Med. - Signal peptide discrimination and cleavage site identification using SVM and NN. ( 0,699692979521546 )
BMC Med Inform Decis Mak - Improved method for protein complex detection using bottleneck proteins. ( 0,697704829845347 )
Comput. Biol. Med. - Structural alphabet motif discovery and a structural motif database. ( 0,6976933577678 )
Comput Biol Chem - Error compensation of tRNA misacylation by codon-anticodon mismatch prevents translational amino acid misinsertion. ( 0,697310292206232 )
Brief. Bioinformatics - BamView: visualizing and interpretation of next-generation sequencing read alignments. ( 0,697162227996726 )
J. Comput. Biol. - An automaton approach for waiting times in DNA evolution. ( 0,696996035150379 )
Comput Biol Chem - Genes under positive selection in Mycobacterium tuberculosis. ( 0,696267236787072 )
Comput. Biol. Med. - Intron identification approaches based on weighted features and fuzzy decision trees. ( 0,694245588340667 )
Comput. Biol. Med. - A context evaluation approach for structural comparison of proteins using cross entropy over n-gram modelling. ( 0,692326640988824 )
Comput Biol Chem - Predicting protein-protein interactions using graph invariants and a neural network. ( 0,691667946951595 )
Comput Biol Chem - Global expression analysis of miRNA gene cluster and family based on isomiRs from deep sequencing data. ( 0,689813563930594 )
Med Biol Eng Comput - Enhanced spatio-temporal alignment of plantar pressure image sequences using B-splines. ( 0,688194900786581 )
Comput. Biol. Med. - The possible role of HSPs on Behçet's disease: a bioinformatic approach. ( 0,68757125326312 )
Brief. Bioinformatics - Computational challenges of sequence classification in microbiomic data. ( 0,687351824084705 )
J Integr Bioinform - Prediction of thioredoxin and glutaredoxin target proteins by identifying reversibly oxidized cysteinyl residues. ( 0,686852190542187 )
J. Comput. Biol. - Sequence alignment of viral channel proteins with cellular ion channels. ( 0,685851320577965 )
J Chem Inf Model - Kink characterization and modeling in transmembrane protein structures. ( 0,685679677309884 )
Comput Biol Chem - Practical halving; the Nelumbo nucifera evidence on early eudicot evolution. ( 0,685367493577431 )
Wiley Interdiscip Rev Syst Biol Med - Mass spectrometry-based proteomics: qualitative identification to activity-based protein profiling. ( 0,6849646869191 )
Brief. Bioinformatics - Applications of alignment-free methods in epigenomics. ( 0,683905773710495 )
Comput Biol Chem - Analysis of the relationships between evolvability, thermodynamics, and the functions of intrinsically disordered proteins/regions. ( 0,683119941903015 )