Artif Intell Med - Detecting disease genes based on semi-supervised learning and protein-protein interaction networks.

Topics

{ gene(2352) biolog(1181) express(1162) }
{ data(2317) use(1299) case(1017) }
{ data(3008) multipl(1320) sourc(1022) }
{ method(1969) cluster(1462) data(1082) }
{ learn(2355) train(1041) set(1003) }
{ sequenc(1873) structur(1644) protein(1328) }
{ model(2656) set(1616) predict(1553) }
{ featur(3375) classif(2383) classifi(1994) }
{ research(1218) medic(880) student(794) }
{ can(981) present(881) function(850) }
{ method(2212) result(1239) propos(1039) }
{ imag(1947) propos(1133) code(1026) }
{ design(1359) user(1324) use(1319) }
{ control(1307) perform(991) simul(935) }
{ general(901) number(790) one(736) }
{ search(2224) databas(1162) retriev(909) }
{ patient(2837) hospit(1953) medic(668) }
{ detect(2391) sensit(1101) algorithm(908) }
{ inform(2794) health(2639) internet(1427) }
{ imag(2830) propos(1344) filter(1198) }
{ take(945) account(800) differ(722) }
{ chang(1828) time(1643) increas(1301) }
{ method(1557) propos(1049) approach(1037) }
{ blood(1257) pressur(1144) flow(957) }
{ activ(1138) subject(705) human(624) }
{ health(1844) social(1437) communiti(874) }
{ method(1219) similar(1157) match(930) }
{ network(2748) neural(1063) input(814) }
{ treatment(1704) effect(941) patient(846) }
{ framework(1458) process(801) describ(734) }
{ concept(1167) ontolog(924) domain(897) }
{ extract(1171) text(1153) clinic(932) }
{ data(3963) clinic(1234) research(1004) }
{ research(1085) discuss(1038) issu(1018) }
{ import(1318) role(1303) understand(862) }
{ model(2341) predict(2261) use(1141) }
{ health(3367) inform(1360) care(1135) }
{ age(1611) year(1155) adult(843) }
{ sampl(1606) size(1419) use(1276) }
{ high(1669) rate(1365) level(1280) }
{ use(976) code(926) identifi(902) }
{ use(1733) differ(960) four(931) }
{ implement(1333) system(1263) develop(1122) }
{ estim(2440) model(1874) function(577) }
{ model(3404) distribut(989) bayesian(671) }
{ can(774) often(719) complex(702) }
{ data(1737) use(1416) pattern(1282) }
{ system(1976) rule(880) can(841) }
{ measur(2081) correl(1212) valu(896) }
{ imag(1057) registr(996) error(939) }
{ bind(1733) structur(1185) ligand(1036) }
{ imag(2675) segment(2577) method(1081) }
{ patient(2315) diseas(1263) diabet(1191) }
{ studi(2440) review(1878) systemat(933) }
{ motion(1329) object(1292) video(1091) }
{ assess(1506) score(1403) qualiti(1306) }
{ surgeri(1148) surgic(1085) robot(1054) }
{ problem(2511) optim(1539) algorithm(950) }
{ error(1145) method(1030) estim(1020) }
{ clinic(1479) use(1117) guidelin(835) }
{ algorithm(1844) comput(1787) effici(935) }
{ data(1714) softwar(1251) tool(1186) }
{ model(2220) cell(1177) simul(1124) }
{ care(1570) inform(1187) nurs(1089) }
{ method(984) reconstruct(947) comput(926) }
{ featur(1941) imag(1645) propos(1176) }
{ case(1353) use(1143) diagnosi(1136) }
{ howev(809) still(633) remain(590) }
{ studi(1410) differ(1259) use(1210) }
{ risk(3053) factor(974) diseas(938) }
{ perform(999) metric(946) measur(919) }
{ system(1050) medic(1026) inform(1018) }
{ visual(1396) interact(850) tool(830) }
{ compound(1573) activ(1297) structur(1058) }
{ perform(1367) use(1326) method(1137) }
{ studi(1119) effect(1106) posit(819) }
{ spatial(1525) area(1432) region(1030) }
{ record(1888) medic(1808) patient(1693) }
{ model(3480) simul(1196) paramet(876) }
{ monitor(1329) mobil(1314) devic(1160) }
{ ehr(2073) health(1662) electron(1139) }
{ state(1844) use(1261) util(961) }
{ medic(1828) order(1363) alert(1069) }
{ signal(2180) analysi(812) frequenc(800) }
{ cost(1906) reduc(1198) effect(832) }
{ group(2977) signific(1463) compar(1072) }
{ first(2504) two(1366) second(1323) }
{ intervent(3218) particip(2042) group(1664) }
{ time(1939) patient(1703) rate(768) }
{ patient(1821) servic(1111) care(1106) }
{ use(2086) technolog(871) perceiv(783) }
{ analysi(2126) use(1163) compon(1037) }
{ structur(1116) can(940) graph(676) }
{ cancer(2502) breast(956) screen(824) }
{ drug(1928) target(777) effect(648) }
{ result(1111) use(1088) new(759) }
{ survey(1388) particip(1329) question(1065) }
{ decis(3086) make(1611) patient(1517) }
{ process(1125) use(805) approach(778) }
{ activ(1452) weight(1219) physic(1104) }

Abstract

OBJECTIVE: Predicting or prioritizing the human genes that cause disease, or "disease genes", is one of the emerging tasks in biomedical informatics. Research on network-based approaches to this problem rests on the key assumption that "the network neighbour of a disease gene is likely to cause the same or a similar disease", and mostly applies supervised learning methods to data on well-known disease genes. This work aims to find an effective method that exploits the disease gene neighbourhood and integrates several useful omics data sources, which could potentially enhance disease gene prediction.
METHODS: We present a novel method that predicts disease genes by exploiting, within a semi-supervised learning (SSL) scheme, data on both disease genes and their neighbours in a protein-protein interaction network. Multiple proteomic and genomic data were integrated from six biological databases (Universal Protein Resource, Interologous Interaction Database, Reactome, Gene Ontology, Pfam, and InterDom) and a gene expression dataset.
RESULTS: Under 10 repetitions of stratified 10-fold cross-validation, the SSL method performs better than the k-nearest neighbour and support vector machine methods, with a sensitivity of 85%, specificity of 79%, precision of 81%, accuracy of 82%, and a balanced F-function of 83%. Further comparative experiments demonstrate the advantages of the proposed method when only a small amount of labeled data is available, with an accuracy of 78%. We applied the proposed method to detect 572 putative disease genes, which were biologically validated by indirect means.
CONCLUSION: Semi-supervised learning improves the ability to study disease genes, especially for a specific disease, where the known disease genes (the labeled data) are very often limited. In addition to the computational improvement, the analysis of the predicted disease proteins indicates that the findings are beneficial in deciphering pathogenic mechanisms.
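
The abstract does not name the specific SSL algorithm, so the following is only a minimal sketch of the general idea it describes: scores spread from labeled genes to their neighbours in an interaction network. It uses plain iterative label propagation (a generic, swapped-in technique) on a hypothetical six-gene toy network. The gene names G1 to G6, the seed labels, and the function `propagate_labels` are illustrative assumptions, not the authors' method or data.

```python
def propagate_labels(edges, seeds, iterations=200):
    """Iterative label propagation on an undirected graph.

    Seed nodes keep their given score (1.0 = known disease gene,
    0.0 = known non-disease gene). Every other node repeatedly takes
    the mean score of its neighbours.
    """
    # Build an adjacency map from the undirected edge list.
    neighbours = {}
    for a, b in edges:
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)

    # Unlabeled nodes start at a neutral score of 0.5.
    scores = {node: seeds.get(node, 0.5) for node in neighbours}

    for _ in range(iterations):
        updated = {}
        for node, nbrs in neighbours.items():
            if node in seeds:
                updated[node] = seeds[node]  # labeled nodes stay clamped
            else:
                updated[node] = sum(scores[n] for n in nbrs) / len(nbrs)
        scores = updated
    return scores


# Hypothetical toy interaction network: a simple chain G1 - G2 - ... - G6.
edges = [("G1", "G2"), ("G2", "G3"), ("G3", "G4"), ("G4", "G5"), ("G5", "G6")]
# Hypothetical labels: G1 is a known disease gene, G6 a known non-disease gene.
seeds = {"G1": 1.0, "G6": 0.0}

scores = propagate_labels(edges, seeds)

# Rank the unlabeled genes by their propagated score, highest first.
candidates = sorted(
    (gene for gene in scores if gene not in seeds),
    key=lambda gene: scores[gene],
    reverse=True,
)
for gene in candidates:
    print(gene, round(scores[gene], 3))
```

In this toy case the unlabeled genes closest to the disease seed rank highest, which mirrors the neighbourhood assumption stated in the abstract. A real pipeline would also need the integrated feature data and a proper evaluation protocol.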

Cleaned Abstract

object predict priorit human gene caus diseas diseas gene one emerg task biomedicin informat research networkbas approach problem carri upon key assumpt networkneighbour diseas gene like caus similar diseas most employ data regard wellknown diseas gene use supervis learn method work aim find effect method exploit diseas gene neighbourhood integr sever use omic data sourc potenti enhanc diseas gene predict method present novel method effect predict diseas gene exploit semisupervis learn ssl scheme data regard diseas gene diseas gene neighbour via proteinprotein interact network multipl proteom genom data integr six biolog databas includ univers protein resourc interolog interact databas reactom gene ontolog pfam interdom gene express dataset result employ time stratifi fold cross valid ssl method perform better knearest neighbour method support vector machin method term sensit specif precis accuraci balanc ffunction compar experiment evalu demonstr advantag propos method given small amount label data accuraci appli propos method detect putat diseas gene biolog valid indirect way conclus semisupervis learn improv abil studi diseas gene especi specif diseas known diseas gene label data often limit addit comput improv analysi predict diseas protein indic find benefici deciph pathogen mechan
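
The cleaned text above looks like the abstract after lowercasing, deletion of punctuation and digits, stop-word removal, and stemming. The page does not document its pipeline, so the sketch below is only an assumed reconstruction of the first three steps. The stop-word list is a tiny hypothetical one, and stemming is left out because it would need a stemmer library.

```python
import re

# Hypothetical, deliberately small stop-word list; a real pipeline
# would use a much fuller one.
STOP_WORDS = {
    "a", "an", "and", "by", "have", "in", "is", "of",
    "on", "or", "that", "the", "this", "to", "we",
}


def clean(text):
    """Lowercase, delete non-letter characters, and drop stop words."""
    text = text.lower()
    # Deleting (rather than spacing out) punctuation fuses hyphenated
    # words, e.g. "protein-protein" becomes "proteinprotein".
    text = re.sub(r"[^a-z\s]", "", text)
    return [token for token in text.split() if token not in STOP_WORDS]


tokens = clean(
    "We have presented a novel method, based on "
    "protein-protein interaction networks."
)
print(" ".join(tokens))
```

Replacing punctuation with a space instead of deleting it would keep hyphenated terms as separate tokens. Which behaviour is preferable depends on the downstream model.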

Similar Abstracts

Brief. Bioinformatics - Rich annotation of DNA sequencing variants by leveraging the Ensembl Variant Effect Predictor with plugins. ( 0,794912443751188 )
J Biomed Inform - Comparative analysis of a novel disease phenotype network based on clinical manifestations. ( 0,773014046081233 )
Comput. Biol. Med. - Support vector machine algorithms in the search of KIR gene associations with disease. ( 0,741798483226285 )
Comput. Biol. Med. - Gene expression data classification using locally linear discriminant embedding. ( 0,733536805762931 )
Curr Protoc Bioinformatics - BEDTools: The Swiss-Army Tool for Genome Feature Analysis. ( 0,730799288964569 )
Brief. Bioinformatics - Building an HIV data mashup using Bio2RDF. ( 0,730658960421391 )
Comput. Biol. Med. - Revealing pathway maps of renal cell carcinoma by gene expression change. ( 0,726174389658535 )
J Biomed Inform - The detection of risk pathways, regulated by miRNAs, via the integration of sample-matched miRNA-mRNA profiles and pathway structure. ( 0,719081525785936 )
Comput Biol Chem - Using gene expression programming to infer gene regulatory networks from time-series data. ( 0,708427029935978 )
J. Comput. Biol. - NP-MuScL: unsupervised global prediction of interaction networks from multiple data sources. ( 0,707409793832969 )
J Integr Bioinform - Analysis and construction of pathogenicity island regulatory pathways in Salmonella enterica serovar Typhi. ( 0,706571218295353 )
Artif Intell Med - An extensive analysis of disease-gene associations using network integration and fast kernel-based gene prioritization methods. ( 0,704519356995317 )
J Integr Bioinform - Profiling of genetic switches using boolean implications in expression data. ( 0,704187447723783 )
Comput. Biol. Med. - Identification and analysis of the regulatory network of Myc and microRNAs from high-throughput experimental data. ( 0,701376129332732 )
J Biomed Inform - Systems-based biological concordance and predictive reproducibility of gene set discovery methods in cardiovascular disease. ( 0,701102101452018 )
J Integr Bioinform - Integrating phenotypic data for depression. ( 0,700263543571657 )
Comput Biol Chem - Meta-analysis of microarray data: The case of imatinib resistance in chronic myelogenous leukemia. ( 0,699768733875043 )
J Am Med Inform Assoc - Identifying disease genes and module biomarkers by differential interactions. ( 0,696960772670009 )
IEEE J Biomed Health Inform - Using evolutional properties of gene networks in understanding survival prognosis of glioblastoma. ( 0,691597872433852 )
Wiley Interdiscip Rev Syst Biol Med - Postgenomic technologies targeting the Wnt signaling network. ( 0,690886655435978 )
J Biomed Inform - Independent component analysis: mining microarray data for fundamental human gene expression modules. ( 0,689869647069146 )
AMIA Annu Symp Proc - Mining disease fingerprints from within genetic pathways. ( 0,689853115132696 )
Sci Data - Comprehensive RNA-Seq transcriptomic profiling across 11 organs, 4 ages, and 2 sexes of Fischer 344 rats. ( 0,689711354583131 )
Sci Data - Genomes and phenomes of a population of outbred rats and its progenitors. ( 0,689106105950614 )
Artif Intell Med - Hybrid genetic algorithm-neural network: feature extraction for unpreprocessed microarray data. ( 0,686889098494216 )
Wiley Interdiscip Rev Syst Biol Med - Noncoding RNAs in gene regulation. ( 0,684716882588947 )
Comput Biol Chem - Identifying novel prostate cancer associated pathways based on integrative microarray data analysis. ( 0,684498789300377 )
J Integr Bioinform - Towards prediction and prioritization of disease genes by the modularity of human phenome-genome assembled network. ( 0,681763069897853 )
J Biomed Inform - Prioritization of potential candidate disease genes by topological similarity of protein-protein interaction network and phenotype data. ( 0,68147822923672 )
Brief. Bioinformatics - Evidence for short-time divergence and long-time conservation of tissue-specific expression after gene duplication. ( 0,680864052693544 )
Brief. Bioinformatics - Revealing the architecture of genetic and epigenetic regulation: a maximum likelihood model. ( 0,680306838418974 )
J. Comput. Biol. - A topology-based score for pathway enrichment. ( 0,677764552520585 )
J. Comput. Biol. - Increasing power of groupwise association test with likelihood ratio test. ( 0,677309960807682 )
J Integr Bioinform - Bioinformatics tools help molecular characterization of Perkinsus olseni differentially expressed genes. ( 0,676933154598119 )
Comput Math Methods Med - First comprehensive in silico analysis of the functional and structural consequences of SNPs in human GalNAc-T1 gene. ( 0,67502373794691 )
Sci Data - Assessment of lipidomic species in hepatocyte lipid droplets from stressed mouse models. ( 0,674935159980355 )
Brief. Bioinformatics - Targeted metabolic reconstruction: a novel approach for the characterization of plant-pathogen interactions. ( 0,672648757651316 )
J. Comput. Biol. - Algorithms for MDC-based multi-locus phylogeny inference: beyond rooted binary gene trees on single alleles. ( 0,672324228794911 )
Brief. Bioinformatics - Experimental evidence validating the computational inference of functional associations from gene fusion events: a critical survey. ( 0,67068777212401 )
Sci Data - DNA methylation temporal profiling following peripheral versus central nervous system axotomy. ( 0,670164026797057 )
Spat Spatiotemporal Epidemiol - The integration of molecular tools into veterinary and spatial epidemiology. ( 0,670160718704902 )
J Am Med Inform Assoc - Extracting coordinated patterns of DNA methylation and gene expression in ovarian cancer. ( 0,668174276512649 )
J Biomed Inform - A machine-learned knowledge discovery method for associating complex phenotypes with complex genotypes. Application to pain. ( 0,665457818743706 )
J Am Med Inform Assoc - Complex-disease networks of trait-associated single-nucleotide polymorphisms (SNPs) unveiled by information theory. ( 0,664050564622352 )
J. Comput. Biol. - Stochastic simulation of notch signaling reveals novel factors that mediate the differentiation of neural stem cells. ( 0,66161741968333 )
Methods Inf Med - Identification of breast cancer prognosis markers using integrative sparse boosting. ( 0,661409458729362 )
Artif Intell Med - Identifying regulatory relationships among genomic loci, biological pathways, and disease. ( 0,660618354718567 )
Comput Biol Chem - In silico analysis of cis-acting regulatory elements in 5' regulatory regions of sucrose transporter gene families in rice (Oryza sativa Japonica) and Arabidopsis thaliana. ( 0,659451916201151 )
Comput Methods Programs Biomed - TC-VGC: a tumor classification system using variations in genes' correlation. ( 0,659401990503002 )
Int J Med Inform - Translating genome wide association study results to associations among common diseases: in silico study with an electronic medical record. ( 0,657939172319839 )
Brief. Bioinformatics - Identifying miRNAs, targets and functions. ( 0,657799277331641 )
Wiley Interdiscip Rev Syst Biol Med - Diverse functional networks of Tbx3 in development and disease. ( 0,657374501543417 )
Wiley Interdiscip Rev Syst Biol Med - Using a systems biology approach to understand and study the mechanisms of metastasis. ( 0,656748371939458 )
J Am Med Inform Assoc - Network models of genome-wide association studies uncover the topological centrality of protein interactions in complex diseases. ( 0,656555636513526 )
J. Comput. Biol. - Vavien: an algorithm for prioritizing candidate disease genes based on topological similarity of proteins in interaction networks. ( 0,656064015251594 )
Brief. Bioinformatics - Learning transcriptional regulation on a genome scale: a theoretical analysis based on gene expression data. ( 0,655870741640123 )
Comput. Biol. Med. - Exploring correlations in gene expression microarray data for maximum predictive-minimum redundancy biomarker selection and classification. ( 0,654211702845771 )
J Integr Bioinform - A study of the short and long-term regulation of E. coli metabolic pathways. ( 0,654125715532681 )
J. Comput. Biol. - An algorithm for efficient identification of branched metabolic pathways. ( 0,653659197741058 )
Comput. Biol. Med. - Meta analysis of gene expression changes upon treatment of A549 cells with anti-cancer drugs to identify universal responses. ( 0,652777756697147 )
Comput Biol Chem - A computational method of predicting regulatory interactions in Arabidopsis based on gene expression data and sequence information. ( 0,651612082131616 )
J Biomed Inform - Mining patterns in disease classification forests. ( 0,65144691381091 )
J Biomed Inform - Gene pathways and subnetworks distinguish between major glioma subtypes and elucidate potential underlying biology. ( 0,650727378202338 )
Comput Math Methods Med - Understanding the pathogenesis of Kawasaki disease by network and pathway analysis. ( 0,649498540735813 )
Comput. Biol. Med. - A molecular prospective provides new insights into implication of PDYN and OPRK1 genes in alcohol dependence. ( 0,648700140261017 )
Comput. Biol. Med. - A review on the computational approaches for gene regulatory network construction. ( 0,647972318856646 )
Wiley Interdiscip Rev Syst Biol Med - Genome network medicine: innovation to overcome huge challenges in cancer therapy. ( 0,647737333955708 )
J. Comput. Biol. - Bioinformatics method to analyze the mechanism of pancreatic cancer disorder. ( 0,647709488931959 )
J Am Med Inform Assoc - An integrated approach to identify causal network modules of complex diseases with application to colorectal cancer. ( 0,647534668936976 )
J Integr Bioinform - An integrative bioinformatics framework for genome-scale multiple level network reconstruction of rice. ( 0,647137581742555 )
J Am Med Inform Assoc - Using systems and structure biology tools to dissect cellular phenotypes. ( 0,646700174909332 )
Brief. Bioinformatics - Online tools for understanding rat physiology. ( 0,646437497589366 )
Wiley Interdiscip Rev Syst Biol Med - Systems biology approaches to epidemiological studies of complex diseases. ( 0,645927943656872 )
J Biomed Inform - miRWalk--database: prediction of possible miRNA binding sites by walking the genes of three genomes. ( 0,645733689768132 )
Brief. Bioinformatics - Exon array data analysis using Affymetrix power tools and R statistical software. ( 0,64559157587053 )
J. Comput. Biol. - Markov logic networks in the analysis of genetic data. ( 0,645124409786185 )
Med Biol Eng Comput - A method for detecting significant genomic regions associated with oral squamous cell carcinoma using aCGH. ( 0,645101376478484 )
J Integr Bioinform - Network expansion and pathway enrichment analysis towards biologically significant findings from microarrays. ( 0,644896193108209 )
IEEE J Biomed Health Inform - Integrative clustering by nonnegative matrix factorization can reveal coherent functional groups from gene profile data. ( 0,644674511978671 )
J. Comput. Biol. - eALPS: estimating abundance levels in pooled sequencing using available genotyping data. ( 0,644465420849452 )
J Integr Bioinform - BioNetLink - an architecture for working with network data. ( 0,644238795765639 )
J Integr Bioinform - Construction of coffee transcriptome networks based on gene annotation semantics. ( 0,644230816488258 )
Comput Biol Chem - Expression patterns of photoperiod and temperature regulated heading date genes in Oryza sativa. ( 0,643924299514625 )
Comput Biol Chem - Ped_Outlier software for automatic identification of within-family outliers. ( 0,643884426575523 )
Comput. Biol. Med. - Computational gene network study on antibiotic resistance genes of Acinetobacter baumannii. ( 0,643661057569988 )
J Integr Bioinform - Uncovering the expression patterns of chimeric transcripts using surveys of affymetrix GeneChips. ( 0,643601252277923 )
Comput Biol Chem - GPEC: a Cytoscape plug-in for random walk-based gene prioritization and biomedical evidence collection. ( 0,642991087552875 )
J. Comput. Biol. - Biological network querying techniques: analysis and comparison. ( 0,642564423572124 )
Comput Biol Chem - Disruption of murine Tcte3-3 induces tissue specific apoptosis via co-expression of Anxa5 and Pebp1. ( 0,642560376541169 )
Comput Biol Chem - Exploring the complexity of pathway-drug relationships using latent Dirichlet allocation. ( 0,64163511804407 )
Brief. Bioinformatics - Integrative approaches for predicting protein function and prioritizing genes for complex phenotypes using protein interaction networks. ( 0,641492981205375 )
Comput Biol Chem - Sparse regularized discriminant analysis with application to microarrays. ( 0,641391493887634 )
J Integr Bioinform - Knowledge enrichment analysis for human tissue-specific genes uncover new biological insights. ( 0,641064523499677 )
J Am Med Inform Assoc - Knowledge boosting: a graph-based integration approach with multi-omics data and genomic knowledge for cancer clinical outcome prediction. ( 0,640288382128797 )
Wiley Interdiscip Rev Syst Biol Med - miRNA regulation in the context of functional protein networks: principles and applications. ( 0,639988664808166 )
J Integr Bioinform - Noise tolerance of multiple classifier systems in data integration-based gene function prediction. ( 0,639979773798373 )
Wiley Interdiscip Rev Syst Biol Med - Systems biology of adipose tissue metabolism: regulation of growth, signaling and inflammation. ( 0,639746114021533 )
J Integr Bioinform - Assembling cell context-specific gene sets: a case in cardiomyopathy. ( 0,639071602116013 )
Comput. Biol. Med. - Degrees of separation as a statistical tool for evaluating candidate genes. ( 0,638961347413878 )
Comput. Biol. Med. - Nonlinear dimensionality reduction of gene expression data for visualization and clustering analysis of cancer tissue samples. ( 0,63877579247781 )
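
The page does not say how the similarity scores in parentheses are computed. A common choice for comparing cleaned abstracts is cosine similarity over term-count vectors, and the sketch below illustrates that under this assumption. The function name `cosine_similarity` and the two short token lists are hypothetical. The same formula could equally be applied to per-topic weights instead of word counts.

```python
import math
from collections import Counter


def cosine_similarity(tokens_a, tokens_b):
    """Cosine similarity between two bags of (stemmed) tokens."""
    counts_a, counts_b = Counter(tokens_a), Counter(tokens_b)
    # Dot product over the terms the two documents share.
    dot = sum(counts_a[t] * counts_b[t] for t in counts_a.keys() & counts_b.keys())
    norm_a = math.sqrt(sum(c * c for c in counts_a.values()))
    norm_b = math.sqrt(sum(c * c for c in counts_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty document is similar to nothing
    return dot / (norm_a * norm_b)


# Hypothetical stemmed token lists standing in for two cleaned abstracts.
query = "diseas gene predict protein interact network".split()
other = "gene priorit protein interact network similar".split()

similarity = cosine_similarity(query, other)
print(similarity)
```

Scores near 1 indicate nearly identical vocabulary and scores near 0 indicate almost no shared terms, which is consistent with the list above being sorted from most to least similar.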