J Biomed Inform - Gene-disease association with literature based enrichment.

Topics

{ gene(2352) biolog(1181) express(1162) }
{ model(2341) predict(2261) use(1141) }
{ search(2224) databas(1162) retriev(909) }
{ extract(1171) text(1153) clinic(932) }
{ system(1976) rule(880) can(841) }
{ compound(1573) activ(1297) structur(1058) }
{ sampl(1606) size(1419) use(1276) }
{ perform(999) metric(946) measur(919) }
{ patient(2315) diseas(1263) diabet(1191) }
{ structur(1116) can(940) graph(676) }
{ model(2656) set(1616) predict(1553) }
{ can(774) often(719) complex(702) }
{ research(1218) medic(880) student(794) }
{ sequenc(1873) structur(1644) protein(1328) }
{ method(1219) similar(1157) match(930) }
{ imag(2830) propos(1344) filter(1198) }
{ clinic(1479) use(1117) guidelin(835) }
{ general(901) number(790) one(736) }
{ method(984) reconstruct(947) comput(926) }
{ case(1353) use(1143) diagnosi(1136) }
{ howev(809) still(633) remain(590) }
{ data(3963) clinic(1234) research(1004) }
{ risk(3053) factor(974) diseas(938) }
{ perform(1367) use(1326) method(1137) }
{ spatial(1525) area(1432) region(1030) }
{ ehr(2073) health(1662) electron(1139) }
{ state(1844) use(1261) util(961) }
{ analysi(2126) use(1163) compon(1037) }
{ cancer(2502) breast(956) screen(824) }
{ implement(1333) system(1263) develop(1122) }
{ method(2212) result(1239) propos(1039) }
{ model(3404) distribut(989) bayesian(671) }
{ imag(1947) propos(1133) code(1026) }
{ data(1737) use(1416) pattern(1282) }
{ inform(2794) health(2639) internet(1427) }
{ measur(2081) correl(1212) valu(896) }
{ imag(1057) registr(996) error(939) }
{ bind(1733) structur(1185) ligand(1036) }
{ featur(3375) classif(2383) classifi(1994) }
{ network(2748) neural(1063) input(814) }
{ imag(2675) segment(2577) method(1081) }
{ take(945) account(800) differ(722) }
{ studi(2440) review(1878) systemat(933) }
{ motion(1329) object(1292) video(1091) }
{ assess(1506) score(1403) qualiti(1306) }
{ treatment(1704) effect(941) patient(846) }
{ surgeri(1148) surgic(1085) robot(1054) }
{ framework(1458) process(801) describ(734) }
{ problem(2511) optim(1539) algorithm(950) }
{ error(1145) method(1030) estim(1020) }
{ chang(1828) time(1643) increas(1301) }
{ learn(2355) train(1041) set(1003) }
{ concept(1167) ontolog(924) domain(897) }
{ algorithm(1844) comput(1787) effici(935) }
{ method(1557) propos(1049) approach(1037) }
{ data(1714) softwar(1251) tool(1186) }
{ design(1359) user(1324) use(1319) }
{ control(1307) perform(991) simul(935) }
{ model(2220) cell(1177) simul(1124) }
{ care(1570) inform(1187) nurs(1089) }
{ featur(1941) imag(1645) propos(1176) }
{ studi(1410) differ(1259) use(1210) }
{ research(1085) discuss(1038) issu(1018) }
{ system(1050) medic(1026) inform(1018) }
{ import(1318) role(1303) understand(862) }
{ visual(1396) interact(850) tool(830) }
{ studi(1119) effect(1106) posit(819) }
{ blood(1257) pressur(1144) flow(957) }
{ record(1888) medic(1808) patient(1693) }
{ health(3367) inform(1360) care(1135) }
{ model(3480) simul(1196) paramet(876) }
{ monitor(1329) mobil(1314) devic(1160) }
{ patient(2837) hospit(1953) medic(668) }
{ data(2317) use(1299) case(1017) }
{ age(1611) year(1155) adult(843) }
{ medic(1828) order(1363) alert(1069) }
{ signal(2180) analysi(812) frequenc(800) }
{ cost(1906) reduc(1198) effect(832) }
{ group(2977) signific(1463) compar(1072) }
{ data(3008) multipl(1320) sourc(1022) }
{ first(2504) two(1366) second(1323) }
{ intervent(3218) particip(2042) group(1664) }
{ activ(1138) subject(705) human(624) }
{ time(1939) patient(1703) rate(768) }
{ patient(1821) servic(1111) care(1106) }
{ use(2086) technolog(871) perceiv(783) }
{ can(981) present(881) function(850) }
{ health(1844) social(1437) communiti(874) }
{ high(1669) rate(1365) level(1280) }
{ use(976) code(926) identifi(902) }
{ use(1733) differ(960) four(931) }
{ drug(1928) target(777) effect(648) }
{ result(1111) use(1088) new(759) }
{ survey(1388) particip(1329) question(1065) }
{ estim(2440) model(1874) function(577) }
{ decis(3086) make(1611) patient(1517) }
{ process(1125) use(805) approach(778) }
{ activ(1452) weight(1219) physic(1104) }
{ method(1969) cluster(1462) data(1082) }
{ detect(2391) sensit(1101) algorithm(908) }

Abstract

MOTIVATION: Gene set enrichment analysis (GSEA) annotates gene microarray data with functional information from the biomedical literature to improve gene-disease association prediction. We hypothesize that supplementing GSEA with comprehensive gene function catalogs built automatically using information extracted from the scientific literature will significantly enhance GSEA prediction quality.

METHODS: Gold standard gene sets for breast cancer (BrCa) and colorectal cancer (CRC) were derived from the literature. Two gene function catalogs (CMeSH and CUMLS) were automatically generated in two steps: (1) Entrez Gene was used to associate all recorded human genes with PubMed article IDs; (2) the genes mentioned in each PubMed article were associated with the article's MeSH terms (in CMeSH) and extracted UMLS concepts (in CUMLS). Microarray data from the Gene Expression Omnibus for BrCa and CRC were then annotated using CMeSH and CUMLS and, for comparison, also with several pre-existing catalogs (C2, C4 and C5 from the Molecular Signatures Database). Ranking was done using a standard GSEA implementation (GSEA-p). Gene function predictions for enriched array data were evaluated against the gold standard by measuring area under the receiver operating characteristic curve (AUC).

RESULTS: Comparison of ranking using the literature enrichment catalogs, the pre-existing catalogs, and five randomly generated catalogs shows that the literature-derived enrichment catalogs are more effective. The AUC for BrCa using the unenriched gene expression dataset was 0.43, increasing to 0.89 after gene set enrichment with CUMLS. The AUC for CRC using the unenriched gene expression dataset was 0.54, increasing to 0.9 after enrichment with CMeSH. C2 increased AUC (BrCa 0.76, CRC 0.71), but C4 and C5 performed poorly (between 0.35 and 0.5). The randomly generated catalogs also performed poorly, equivalent to random guessing.

DISCUSSION: Gene set enrichment significantly improved prediction of gene-disease association. Selection of enrichment catalog had a substantial effect on prediction accuracy. The literature-based catalogs performed better than the MSigDB catalogs, possibly because they are more recent. Catalogs generated automatically from the literature can be kept up to date.

CONCLUSION: Prediction of gene-disease association is a fundamental task in biomedical research. GSEA provides a promising method when using literature-based enrichment catalogs.

AVAILABILITY: The literature-based catalogs generated and used in this study are available from http://www2.chi.unsw.edu.au/literature-enrichment.
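
The two catalog-building steps and the AUC evaluation described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the function names `build_catalog` and `auc`, the toy gene-to-article and article-to-term mappings, and the example scores are all assumptions made for illustration. A real pipeline would take these mappings from Entrez Gene and PubMed and the ranking from GSEA-p.

```python
from collections import defaultdict


def build_catalog(gene_to_pmids, pmid_to_terms):
    """Invert gene -> articles -> terms into term -> set of genes.

    Each term (e.g. a MeSH heading) becomes one gene set containing
    every gene linked to at least one article annotated with that term.
    """
    catalog = defaultdict(set)
    for gene, pmids in gene_to_pmids.items():
        for pmid in pmids:
            for term in pmid_to_terms.get(pmid, ()):
                catalog[term].add(gene)
    return dict(catalog)


def auc(scores, positives):
    """Area under the ROC curve for a gene ranking.

    Computed as the fraction of (positive, negative) gene pairs in which
    the positive gene has the higher score; ties count as half a win.
    """
    pos = [s for g, s in scores.items() if g in positives]
    neg = [s for g, s in scores.items() if g not in positives]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative gene")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy inputs (placeholders, not real Entrez Gene or PubMed records).
gene_to_pmids = {"BRCA1": ["p1", "p2"], "TP53": ["p2"], "APC": ["p3"]}
pmid_to_terms = {
    "p1": ["Breast Neoplasms"],
    "p2": ["Breast Neoplasms", "DNA Repair"],
    "p3": ["Colorectal Neoplasms"],
}
catalog = build_catalog(gene_to_pmids, pmid_to_terms)

# Hypothetical gene scores and gold standard set for the evaluation step.
scores = {"BRCA1": 0.9, "TP53": 0.4, "APC": 0.6, "GAPDH": 0.1}
gold_standard = {"BRCA1", "TP53"}
toy_auc = auc(scores, gold_standard)
```

The pairwise formulation of AUC is quadratic in the number of genes, which is acceptable for a toy example; for full microarray rankings a rank-sum formulation or a library routine would be the more practical choice.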

Cleaned Abstract

tivat gene set enrich analysi gsea annot gene microarray data function inform biomed literatur improv genediseas associ predict hypothes supplement gsea comprehens gene function catalog built automat use inform extract scientif literatur will signific enhanc gsea predict qualitymethod gold standard gene set breast cancer brca colorect cancer crc deriv literatur two gene function catalog cmesh cuml automat generat use entrez gene associ record human gene pubm articl id use gene mention pubm articl associ articl mesh term cmesh extract uml concept cuml microarray data gene express omnibus brca crc annot use cmesh cuml comparison also sever preexist catalog c c c molecular signatur databas rank done use standard gsea implement gseap gene function predict enrich array data evalu gold standard measur area receiv oper characterist curv aucresult comparison rank use literatur enrich catalog preexist catalog well five random generat catalog show literatur deriv enrich catalog effect auc brca use unenrich gene express dataset increas gene set enrich cuml auc crc use unenrich gene express dataset increas enrich cmesh c increas auc brca crc c c perform poor random generat catalog also perform poor equival random guessingdiscuss gene set enrich signific improv predict genediseas associ select enrich catalog substanti effect predict accuraci literatur base catalog perform better msigdb catalog possibl recent catalog generat automat literatur can kept dateconclus predict genediseas associ fundament task biomed research gsea provid promis method use literaturebas enrich catalogsavail literatur base catalog generat use studi avail httpwwwchiunsweduauliteratureenrich

Similar Abstracts

Int J Med Inform - Translating genome wide association study results to associations among common diseases: in silico study with an electronic medical record. ( 0,785944890112401 )
J Biomed Inform - Partial least squares and logistic regression random-effects estimates for gene selection in supervised classification of gene expression data. ( 0,776086215687786 )
J Integr Bioinform - Predicting breast cancer chemotherapeutic response using a novel tool for microarray data analysis. ( 0,742203824563363 )
Comput Math Methods Med - Modified logistic regression models using gene coexpression and clinical features to predict prostate cancer progression. ( 0,73617633659292 )
Comput. Biol. Med. - Degrees of separation as a statistical tool for evaluating candidate genes. ( 0,731118107283894 )
AMIA Annu Symp Proc - Comparing the value of mammographic features and genetic variants in breast cancer risk prediction. ( 0,730331928937968 )
Brief. Bioinformatics - myMIR: a genome-wide microRNA targets identification and annotation tool. ( 0,729597188566803 )
Comput Math Methods Med - Integrating gene expression and protein interaction data for signaling pathway prediction of Alzheimer's disease. ( 0,717708526126388 )
Comput Biol Chem - An elastic network model to identify characteristic stress response genes. ( 0,716297591962108 )
J Am Med Inform Assoc - Knowledge boosting: a graph-based integration approach with multi-omics data and genomic knowledge for cancer clinical outcome prediction. ( 0,709942739591723 )
J Am Med Inform Assoc - Utility of gene-specific algorithms for predicting pathogenicity of uncertain gene variants. ( 0,708832227608018 )
Brief. Bioinformatics - Adjusting confounders in ranking biomarkers: a model-based ROC approach. ( 0,701991085458453 )
Methods Inf Med - Identification of breast cancer prognosis markers using integrative sparse boosting. ( 0,673181421222261 )
J Biomed Inform - Synergistic effect of different levels of genomic data for cancer clinical outcome prediction. ( 0,670437131140683 )
J Integr Bioinform - Classification of breast cancer subtypes by combining gene expression and DNA methylation data. ( 0,667283094599255 )
Artif Intell Med - Hybrid genetic algorithm-neural network: feature extraction for unpreprocessed microarray data. ( 0,666678337899704 )
Comput Biol Chem - Identifying novel prostate cancer associated pathways based on integrative microarray data analysis. ( 0,665067826955537 )
Comput Biol Chem - Sparse regularized discriminant analysis with application to microarrays. ( 0,660711003243743 )
J. Comput. Biol. - Vavien: an algorithm for prioritizing candidate disease genes based on topological similarity of proteins in interaction networks. ( 0,6554867932359 )
J Chem Inf Model - Are bigger data sets better for machine learning? Fusing single-point and dual-event dose response data for Mycobacterium tuberculosis. ( 0,654579395231677 )
Brief. Bioinformatics - Biological network extraction from scientific literature: state of the art and challenges. ( 0,653034472845741 )
J Integr Bioinform - Assembling cell context-specific gene sets: a case in cardiomyopathy. ( 0,651039766445276 )
J Biomed Inform - The inference of breast cancer metastasis through gene regulatory networks. ( 0,650170462169064 )
Brief. Bioinformatics - Combining multidimensional genomic measurements for predicting cancer prognosis: observations from TCGA. ( 0,64669126963253 )
Wiley Interdiscip Rev Syst Biol Med - Integrating omics into the cardiac differentiation of human pluripotent stem cells. ( 0,646133380930999 )
Brief. Bioinformatics - Revealing the architecture of genetic and epigenetic regulation: a maximum likelihood model. ( 0,645658209311077 )
J. Comput. Biol. - A topology-based score for pathway enrichment. ( 0,644167376422014 )
J Biomed Inform - Comparative analysis of a novel disease phenotype network based on clinical manifestations. ( 0,642419764806979 )
Wiley Interdiscip Rev Syst Biol Med - Engineered genetic information processing circuits. ( 0,628291594648065 )
Comput Biol Chem - GPEC: a Cytoscape plug-in for random walk-based gene prioritization and biomedical evidence collection. ( 0,627462133688374 )
J. Comput. Biol. - A new software package for predictive gene regulatory network modeling and redesign. ( 0,624947061413354 )
J Am Med Inform Assoc - A literature search tool for intelligent extraction of disease-associated genes. ( 0,62466448121734 )
J Integr Bioinform - An integrative bioinformatics framework for genome-scale multiple level network reconstruction of rice. ( 0,623260352060096 )
Brief. Bioinformatics - Identifying miRNAs, targets and functions. ( 0,623232323232323 )
Comput. Biol. Med. - A knowledge-driven probabilistic framework for the prediction of protein-protein interaction networks. ( 0,621911870081374 )
J Integr Bioinform - Towards prediction and prioritization of disease genes by the modularity of human phenome-genome assembled network. ( 0,621735108461343 )
AMIA Annu Symp Proc - Mining disease fingerprints from within genetic pathways. ( 0,621303926094279 )
Sci Data - Comprehensive RNA-Seq transcriptomic profiling across 11 organs, 4 ages, and 2 sexes of Fischer 344 rats. ( 0,621293623213121 )
Comput Biol Chem - Using volcano plots and regularized-chi statistics in genetic association studies. ( 0,620899167786974 )
J Integr Bioinform - SNPRanker: a tool for identification and scoring of SNPs associated to target genes. ( 0,620289318509101 )
J Biomed Inform - Prioritization of potential candidate disease genes by topological similarity of protein-protein interaction network and phenotype data. ( 0,619934797934491 )
Artif Intell Med - An extensive analysis of disease-gene associations using network integration and fast kernel-based gene prioritization methods. ( 0,619285972821636 )
J Am Med Inform Assoc - Extracting coordinated patterns of DNA methylation and gene expression in ovarian cancer. ( 0,619083878493851 )
Wiley Interdiscip Rev Syst Biol Med - Systems biology approaches to epidemiological studies of complex diseases. ( 0,618606711546765 )
J Biomed Inform - A comparative study of covariance selection models for the inference of gene regulatory networks. ( 0,618538280399115 )
Wiley Interdiscip Rev Syst Biol Med - miRNA regulation in the context of functional protein networks: principles and applications. ( 0,618221516542514 )
Comput. Biol. Med. - Identification and analysis of the regulatory network of Myc and microRNAs from high-throughput experimental data. ( 0,61358268855386 )
AMIA Annu Symp Proc - An ontology-neutral framework for enrichment analysis. ( 0,613276772818619 )
J Am Med Inform Assoc - An integrated approach to identify causal network modules of complex diseases with application to colorectal cancer. ( 0,613122496945685 )
J. Comput. Biol. - An algorithm for efficient identification of branched metabolic pathways. ( 0,613095665376994 )
J. Comput. Biol. - Prediction of siRNA potency using sparse logistic regression. ( 0,610988931172841 )
J. Comput. Biol. - Bioinformatics method to analyze the mechanism of pancreatic cancer disorder. ( 0,610127326116062 )
J Biomed Inform - Transcriptional networks characterize ventricular dysfunction after myocardial infarction: a proof-of-concept investigation. ( 0,610057141426997 )
J Integr Bioinform - Network expansion and pathway enrichment analysis towards biologically significant findings from microarrays. ( 0,609586015352366 )
J Am Med Inform Assoc - Identifying disease genes and module biomarkers by differential interactions. ( 0,6091439240754 )
BMC Med Inform Decis Mak - A three-step approach for the derivation and validation of high-performing predictive models using an operational dataset: congestive heart failure readmission case study. ( 0,608994320119828 )
J Integr Bioinform - A study of the short and long-term regulation of E. coli metabolic pathways. ( 0,608897270852619 )
Wiley Interdiscip Rev Syst Biol Med - Network biology: a direct approach to study biological function. ( 0,608697642933518 )
Brief. Bioinformatics - Predictive modelling of gene expression from transcriptional regulatory elements. ( 0,607953025724066 )
Comput Math Methods Med - Understanding the pathogenesis of Kawasaki disease by network and pathway analysis. ( 0,606602733099569 )
Wiley Interdiscip Rev Syst Biol Med - Reverse-engineering human regulatory networks. ( 0,606502540963109 )
Wiley Interdiscip Rev Syst Biol Med - Diverse functional networks of Tbx3 in development and disease. ( 0,604816817293426 )
J. Comput. Biol. - Biological network querying techniques: analysis and comparison. ( 0,603966178869866 )
Artif Intell Med - Identifying regulatory relationships among genomic loci, biological pathways, and disease. ( 0,603344761705863 )
Sci Data - DNA methylation temporal profiling following peripheral versus central nervous system axotomy. ( 0,60242986536137 )
Brief. Bioinformatics - Identification of aberrant pathways and network activities from high-throughput data. ( 0,602063230691778 )
J Integr Bioinform - Knowledge enrichment analysis for human tissue-specific genes uncover new biological insights. ( 0,600292127824136 )
Comput Biol Chem - Identification of all trinucleotide circular codes. ( 0,599972884457725 )
J Biomed Inform - The detection of risk pathways, regulated by miRNAs, via the integration of sample-matched miRNA-mRNA profiles and pathway structure. ( 0,599872027119683 )
Comput. Biol. Med. - Mathematical modeling and sensitivity analysis of the integrated TNFa-mediated apoptotic pathway for identifying key regulators. ( 0,599387295078685 )
Comput. Biol. Med. - A ternary model of decompression sickness in rats. ( 0,59924359950438 )
Wiley Interdiscip Rev Syst Biol Med - Using a systems biology approach to understand and study the mechanisms of metastasis. ( 0,59839613305566 )
Comput Biol Chem - Identification of novel splice variants and exons of human endothelial cell-specific chemotaxic regulator (ECSCR) by bioinformatics analysis. ( 0,597913990462575 )
J Biomed Inform - Independent component analysis: mining microarray data for fundamental human gene expression modules. ( 0,596905932788727 )
Wiley Interdiscip Rev Syst Biol Med - Using variability in gene expression as a tool for studying gene regulation. ( 0,596595278162613 )
Brief. Bioinformatics - Combining literature text mining with microarray data: advances for system biology modeling. ( 0,596102945365438 )
Wiley Interdiscip Rev Syst Biol Med - Stem cell bioengineering at the interface of systems-based models and high-throughput platforms. ( 0,595056969563231 )
Wiley Interdiscip Rev Syst Biol Med - Protein microarrays for genome-wide posttranslational modification analysis. ( 0,594804655956243 )
Comput Biol Chem - Disruption of murine Tcte3-3 induces tissue specific apoptosis via co-expression of Anxa5 and Pebp1. ( 0,594627043547335 )
J Am Med Inform Assoc - Complex-disease networks of trait-associated single-nucleotide polymorphisms (SNPs) unveiled by information theory. ( 0,594228948384587 )
Comput Biol Chem - Identification of miR159s and their target genes and expression analysis under drought stress in potato. ( 0,593437125431698 )
Comput Biol Chem - In silico analysis of cis-acting regulatory elements in 5' regulatory regions of sucrose transporter gene families in rice (Oryza sativa Japonica) and Arabidopsis thaliana. ( 0,592285476746517 )
J Biomed Inform - Gene pathways and subnetworks distinguish between major glioma subtypes and elucidate potential underlying biology. ( 0,591880702160635 )
J Clin Monit Comput - Use of genetic programming, logistic regression, and artificial neural nets to predict readmission after coronary artery bypass surgery. ( 0,589263480328115 )
Comput Math Methods Med - First comprehensive in silico analysis of the functional and structural consequences of SNPs in human GalNAc-T1 gene. ( 0,58856099332406 )
Brief. Bioinformatics - Targeted metabolic reconstruction: a novel approach for the characterization of plant-pathogen interactions. ( 0,587370997800294 )
J Am Med Inform Assoc - Network models of genome-wide association studies uncover the topological centrality of protein interactions in complex diseases. ( 0,587204314801365 )
Wiley Interdiscip Rev Syst Biol Med - Postgenomic technologies targeting the Wnt signaling network. ( 0,586486325829942 )
AMIA Annu Symp Proc - Genetic variants improve breast cancer risk prediction on mammograms. ( 0,586363102483367 )
Comput Biol Chem - Revealing weak differential gene expressions and their reproducible functions associated with breast cancer metastasis. ( 0,585854502483357 )
Brief. Bioinformatics - Pharmaco-miR: linking microRNAs and drug effects. ( 0,585633254359013 )
Wiley Interdiscip Rev Syst Biol Med - Signaling networks in palate development. ( 0,584727811346118 )
Wiley Interdiscip Rev Syst Biol Med - Systems biology of adipose tissue metabolism: regulation of growth, signaling and inflammation. ( 0,584601324079785 )
J Biomed Inform - A two step method to identify clinical outcome relevant genes with microarray data. ( 0,584351921484367 )
Wiley Interdiscip Rev Syst Biol Med - Recent advances in prostate development and links to prostatic diseases. ( 0,58402036008314 )
Comput. Biol. Med. - Exploring correlations in gene expression microarray data for maximum predictive-minimum redundancy biomarker selection and classification. ( 0,583881535734891 )
Comput Biol Chem - Expression patterns of photoperiod and temperature regulated heading date genes in Oryza sativa. ( 0,582473324397125 )
J Biomed Inform - ProNormz--an integrated approach for human proteins and protein kinases normalization. ( 0,58229674182487 )
J Biomed Inform - Systems-based biological concordance and predictive reproducibility of gene set discovery methods in cardiovascular disease. ( 0,581328436570835 )
Brief. Bioinformatics - Toward microRNA-mediated gene regulatory networks in plants. ( 0,581228558316036 )