J Chem Inf Model - Deep architectures and deep learning in chemoinformatics: the prediction of aqueous solubility for drug-like molecules.

Topics

{ method(1219) similar(1157) match(930) }
{ method(1969) cluster(1462) data(1082) }
{ learn(2355) train(1041) set(1003) }
{ can(774) often(719) complex(702) }
{ general(901) number(790) one(736) }
{ compound(1573) activ(1297) structur(1058) }
{ method(2212) result(1239) propos(1039) }
{ method(1557) propos(1049) approach(1037) }
{ data(3008) multipl(1320) sourc(1022) }
{ structur(1116) can(940) graph(676) }
{ network(2748) neural(1063) input(814) }
{ method(984) reconstruct(947) comput(926) }
{ research(1085) discuss(1038) issu(1018) }
{ model(2341) predict(2261) use(1141) }
{ sequenc(1873) structur(1644) protein(1328) }
{ featur(3375) classif(2383) classifi(1994) }
{ surgeri(1148) surgic(1085) robot(1054) }
{ perform(999) metric(946) measur(919) }
{ data(2317) use(1299) case(1017) }
{ gene(2352) biolog(1181) express(1162) }
{ process(1125) use(805) approach(778) }
{ detect(2391) sensit(1101) algorithm(908) }
{ imag(1947) propos(1133) code(1026) }
{ inform(2794) health(2639) internet(1427) }
{ imag(2675) segment(2577) method(1081) }
{ treatment(1704) effect(941) patient(846) }
{ framework(1458) process(801) describ(734) }
{ problem(2511) optim(1539) algorithm(950) }
{ algorithm(1844) comput(1787) effici(935) }
{ sampl(1606) size(1419) use(1276) }
{ use(1733) differ(960) four(931) }
{ drug(1928) target(777) effect(648) }
{ implement(1333) system(1263) develop(1122) }
{ survey(1388) particip(1329) question(1065) }
{ model(3404) distribut(989) bayesian(671) }
{ data(1737) use(1416) pattern(1282) }
{ system(1976) rule(880) can(841) }
{ measur(2081) correl(1212) valu(896) }
{ imag(1057) registr(996) error(939) }
{ bind(1733) structur(1185) ligand(1036) }
{ imag(2830) propos(1344) filter(1198) }
{ patient(2315) diseas(1263) diabet(1191) }
{ take(945) account(800) differ(722) }
{ studi(2440) review(1878) systemat(933) }
{ motion(1329) object(1292) video(1091) }
{ assess(1506) score(1403) qualiti(1306) }
{ error(1145) method(1030) estim(1020) }
{ chang(1828) time(1643) increas(1301) }
{ concept(1167) ontolog(924) domain(897) }
{ clinic(1479) use(1117) guidelin(835) }
{ extract(1171) text(1153) clinic(932) }
{ data(1714) softwar(1251) tool(1186) }
{ design(1359) user(1324) use(1319) }
{ control(1307) perform(991) simul(935) }
{ model(2220) cell(1177) simul(1124) }
{ care(1570) inform(1187) nurs(1089) }
{ search(2224) databas(1162) retriev(909) }
{ featur(1941) imag(1645) propos(1176) }
{ case(1353) use(1143) diagnosi(1136) }
{ howev(809) still(633) remain(590) }
{ data(3963) clinic(1234) research(1004) }
{ studi(1410) differ(1259) use(1210) }
{ risk(3053) factor(974) diseas(938) }
{ system(1050) medic(1026) inform(1018) }
{ import(1318) role(1303) understand(862) }
{ visual(1396) interact(850) tool(830) }
{ perform(1367) use(1326) method(1137) }
{ studi(1119) effect(1106) posit(819) }
{ blood(1257) pressur(1144) flow(957) }
{ spatial(1525) area(1432) region(1030) }
{ record(1888) medic(1808) patient(1693) }
{ health(3367) inform(1360) care(1135) }
{ model(3480) simul(1196) paramet(876) }
{ monitor(1329) mobil(1314) devic(1160) }
{ ehr(2073) health(1662) electron(1139) }
{ state(1844) use(1261) util(961) }
{ research(1218) medic(880) student(794) }
{ patient(2837) hospit(1953) medic(668) }
{ model(2656) set(1616) predict(1553) }
{ age(1611) year(1155) adult(843) }
{ medic(1828) order(1363) alert(1069) }
{ signal(2180) analysi(812) frequenc(800) }
{ cost(1906) reduc(1198) effect(832) }
{ group(2977) signific(1463) compar(1072) }
{ first(2504) two(1366) second(1323) }
{ intervent(3218) particip(2042) group(1664) }
{ activ(1138) subject(705) human(624) }
{ time(1939) patient(1703) rate(768) }
{ patient(1821) servic(1111) care(1106) }
{ use(2086) technolog(871) perceiv(783) }
{ can(981) present(881) function(850) }
{ analysi(2126) use(1163) compon(1037) }
{ health(1844) social(1437) communiti(874) }
{ high(1669) rate(1365) level(1280) }
{ cancer(2502) breast(956) screen(824) }
{ use(976) code(926) identifi(902) }
{ result(1111) use(1088) new(759) }
{ estim(2440) model(1874) function(577) }
{ decis(3086) make(1611) patient(1517) }
{ activ(1452) weight(1219) physic(1104) }

Abstract

Shallow machine learning methods have been applied to chemoinformatics problems with some success. As more data becomes available and more complex problems are tackled, deep machine learning methods may also become useful. Here, we present a brief overview of deep learning methods and show in particular how recursive neural network approaches can be applied to the problem of predicting molecular properties. However, molecules are typically described by undirected cyclic graphs, while recursive approaches typically use directed acyclic graphs. Thus, we develop methods to address this discrepancy, essentially by considering an ensemble of recursive neural networks associated with all possible vertex-centered acyclic orientations of the molecular graph. One advantage of this approach is that it relies only minimally on the identification of suitable molecular descriptors because suitable representations are learned automatically from the data. Several variants of this approach are applied to the problem of predicting aqueous solubility and tested on four benchmark data sets. Experimental results show that the performance of the deep learning methods matches or exceeds the performance of other state-of-the-art methods according to several evaluation metrics and expose the fundamental limitations arising from training sets that are too small or too noisy. A Web-based predictor, AquaSol, is available online through the ChemDB portal (cdb.ics.uci.edu) together with additional material.
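The idea of "vertex-centered acyclic orientations" can be illustrated with a small sketch. The code below is not the authors' implementation; it is a minimal illustration under stated assumptions: the molecule is given as a plain adjacency dictionary, each edge is oriented toward a chosen root atom by breadth-first distance, and edges joining atoms at equal distance from the root are simply dropped (a simplification made here, not taken from the abstract). The names orient_toward_root, all_orientations, is_acyclic, and TOY_MOLECULE are hypothetical.

```python
from collections import deque


def orient_toward_root(adjacency, root):
    """Return a set of directed edges (u, v), each pointing toward `root`.

    An edge is oriented from the endpoint farther from the root to the
    nearer one, using breadth-first distances. Edges between equidistant
    atoms are dropped: an assumption of this sketch only.
    """
    dist = {root: 0}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for nbr in adjacency[node]:
            if nbr not in dist:
                dist[nbr] = dist[node] + 1
                queue.append(nbr)
    edges = set()
    for u in adjacency:
        for v in adjacency[u]:
            if u in dist and v in dist and dist[u] > dist[v]:
                edges.add((u, v))
    return edges


def all_orientations(adjacency):
    """One orientation per atom: the ensemble has as many members as atoms."""
    return {root: orient_toward_root(adjacency, root) for root in adjacency}


def is_acyclic(nodes, edges):
    """Check acyclicity with Kahn's topological-sort algorithm."""
    indegree = {n: 0 for n in nodes}
    successors = {n: [] for n in nodes}
    for u, v in edges:
        successors[u].append(v)
        indegree[v] += 1
    queue = deque(n for n in nodes if indegree[n] == 0)
    visited = 0
    while queue:
        n = queue.popleft()
        visited += 1
        for m in successors[n]:
            indegree[m] -= 1
            if indegree[m] == 0:
                queue.append(m)
    return visited == len(nodes)


# Hypothetical toy molecule: a five-atom ring (0-1-2-3-4) with atom 5 bonded to atom 0.
TOY_MOLECULE = {0: [1, 4, 5], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3, 0], 5: [0]}

orientations = all_orientations(TOY_MOLECULE)
for root, edges in sorted(orientations.items()):
    print(root, sorted(edges), is_acyclic(TOY_MOLECULE, edges))
```

In a setting like the one the abstract describes, each of these directed acyclic graphs would be processed by a recursive neural network and the per-root outputs combined into one molecular representation; the combination step is outside the scope of this sketch.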

Cleaned Abstract

shallow machin learn method appli chemoinformat problem success data becom avail complex problem tackl deep machin learn method may also becom use present brief overview deep learn method show particular recurs neural network approach can appli problem predict molecular properti howev molecul typic describ undirect cyclic graph recurs approach typic use direct acycl graph thus develop method address discrep essenti consid ensembl recurs neural network associ possibl vertexcent acycl orient molecular graph one advantag approach reli minim identif suitabl molecular descriptor suitabl represent learn automat data sever variant approach appli problem predict aqueous solubl test four benchmark data set experiment result show perform deep learn method match exceed perform stateoftheart method accord sever evalu metric expos fundament limit aris train set small noisi webbas predictor aquasol avail onlin chemdb portal cdbicsuciedu togeth addit materi

Similar Abstracts

IEEE Trans Image Process - Flexible Image Similarity Computation Using Hyper-Spatial Matching. ( 0,637068500478151 )
J Chem Inf Model - In silico target predictions: defining a benchmarking data set and comparison of performance of the multiclass Naïve Bayes and Parzen-Rosenblatt window. ( 0,634476698182562 )
IEEE Trans Neural Netw Learn Syst - Discriminative embedded clustering: a framework for grouping high-dimensional data. ( 0,629839184564858 )
Comput Methods Programs Biomed - Bagging, bumping, multiview, and active learning for record linkage with empirical results on patient identity data. ( 0,629084034209944 )
J Biomed Inform - Clustering clinical models from local electronic health records based on semantic similarity. ( 0,611641956313405 )
Int J Neural Syst - A genetic graph-based approach for partitional clustering. ( 0,603880619637274 )
J Chem Inf Model - Comparison of combinatorial clustering methods on pharmacological data sets represented by machine learning-selected real molecular descriptors. ( 0,603026181738205 )
Comput. Biol. Med. - A methodology to identify consensus classes from clustering algorithms applied to immunohistochemical data from breast cancer patients. ( 0,597709748238674 )
Artif Intell Med - Vicinal support vector classifier using supervised kernel-based clustering. ( 0,592730354683453 )
J Chem Inf Model - Reading PDB: perception of molecules from 3D atomic coordinates. ( 0,58659178663337 )
IEEE Trans Image Process - Subspaces indexing model on Grassmann manifold for image search. ( 0,584209887818194 )
IEEE Trans Image Process - A uniform grid structure to speed up example-based photometric stereo. ( 0,564473668137277 )
IEEE Trans Image Process - A comparative review of component tree computation algorithms. ( 0,562883294051384 )
AMIA Annu Symp Proc - Dissimilarities in the Logical Modeling of Apparently Similar Concepts in SNOMED CT. ( 0,559878147977026 )
J Chem Inf Model - An unbiased method to build benchmarking sets for ligand-based virtual screening and its application to GPCRs. ( 0,554524004008684 )
Comput. Biol. Med. - Predicting cardiac autonomic neuropathy category for diabetic data with missing values. ( 0,553391784471899 )
J Biomed Inform - Average correlation clustering algorithm (ACCA) for grouping of co-regulated genes with similar pattern of variation in their expression values. ( 0,552525222051869 )
J Chem Inf Model - Heterogeneous classifier fusion for ligand-based virtual screening: or, how decision making by committee can be a good thing. ( 0,550695353618088 )
Artif Intell Med - Detecting rare events using extreme value statistics applied to epileptic convulsions in children. ( 0,550490272968122 )
Neural Comput - Deep, big, simple neural nets for handwritten digit recognition. ( 0,550344032782574 )
J Chem Inf Model - Exploiting structural information in patent specifications for key compound prediction. ( 0,548532908100616 )
J Chem Inf Model - Build-up algorithm for atomic correspondence between chemical structures. ( 0,544296295989691 )
IEEE Trans Pattern Anal Mach Intell - Learning Nonlinear Functions Using Regularized Greedy Forest. ( 0,544110292676585 )
Artif Intell Med - A classifier ensemble approach for the missing feature problem. ( 0,543599150009948 )
Comput. Biol. Med. - Modeling and prediction of peptide drift times in ion mobility spectrometry using sequence-based and structure-based approaches. ( 0,541552631929383 )
IEEE Trans Pattern Anal Mach Intell - Iterative Discovery of Multiple Alternative Clustering Views. ( 0,541389036720908 )
J. Comput. Biol. - A geometric clustering algorithm with applications to structural data. ( 0,537710856751738 )
J Chem Inf Model - Atom environment kernels on molecules. ( 0,533360832629584 )
Comput. Biol. Med. - Identification of epilepsy stages from ECoG using genetic programming classifiers. ( 0,533066527855674 )
IEEE Trans Neural Netw Learn Syst - Fick's Law Assisted Propagation for Semisupervised Learning. ( 0,529377016185835 )
IEEE Trans Pattern Anal Mach Intell - Exemplar-Based Colour Constancy and Multiple Illumination. ( 0,529035912230096 )
IEEE Trans Vis Comput Graph - KelpFusion: a Hybrid Set Visualization Technique. ( 0,528904560367295 )
IEEE Trans Pattern Anal Mach Intell - Feature Selection with Conjunctions of Decision Stumps and Learning from Microarray Data. ( 0,528700794743172 )
J Chem Inf Model - String kernels and high-quality data set for improved prediction of kinked helices in α-helical membrane proteins. ( 0,525840253738803 )
IEEE Trans Image Process - Fast bilateral filter with arbitrary range and domain kernels. ( 0,525183436612105 )
AMIA Annu Symp Proc - Using String Metrics to Identify Patient Journeys through Care Pathways. ( 0,523379114652573 )
Comput Methods Programs Biomed - Design of fuzzy classifier for diabetes disease using Modified Artificial Bee Colony algorithm. ( 0,52318255198707 )
Artif Intell Med - Feasibility of case-based beam generation for robotic radiosurgery. ( 0,523176216270842 )
IEEE Trans Pattern Anal Mach Intell - A Prototype Learning Framework Using EMD: Application to Complex Scenes Analysis. ( 0,521795573746552 )
IEEE Trans Image Process - General road detection from a single image. ( 0,521347075850329 )
J Chem Inf Model - How different are two chemical structures? ( 0,521068378799201 )
Neural Comput - Spontaneous clustering via minimum γ-divergence. ( 0,520965696628546 )
Artif Intell Med - Multi-test decision tree and its application to microarray data classification. ( 0,520030559835581 )
Comput. Biol. Med. - Fractal features for localization of temporal lobe epileptic foci using SPECT imaging. ( 0,519962125447421 )
Artif Intell Med - Cost-sensitive case-based reasoning using a genetic algorithm: application to medical diagnosis. ( 0,518560337848519 )
Med Biol Eng Comput - A mathematical method for constraint-based cluster analysis towards optimized constrictive diameter smoothing of saphenous vein grafts. ( 0,51724355734666 )
IEEE Trans Image Process - Data-dependent hashing based on p-stable distribution. ( 0,517238645509988 )
J. Comput. Biol. - Separating significant matches from spurious matches in DNA sequences. ( 0,516915013049389 )
Int J Comput Assist Radiol Surg - Statistical shape model of a liver for autopsy imaging. ( 0,516517702967162 )
J Chem Inf Model - Benchmark data sets for structure-based computational target prediction. ( 0,515496615904352 )
Comput Methods Programs Biomed - Ventricular activity morphological characterization: ectopic beats removal in long term atrial fibrillation recordings. ( 0,515035144656943 )
Comput Biol Chem - Classification of splice-junction sequences via weighted position specific scoring approach. ( 0,514165448034095 )
IEEE Trans Image Process - Multiple-kernel, multiple-instance similarity features for efficient visual object detection. ( 0,51384642668931 )
Comput Biol Chem - Understanding the general packing rearrangements required for successful template based modeling of protein structure from a CASP experiment. ( 0,512183270702614 )
IEEE Trans Image Process - Multiple kernel sparse representations for supervised and unsupervised learning. ( 0,511393412687504 )
Artif Intell Med - Weighted spherical 1-mean with phase shift and its application in electrocardiogram discord detection. ( 0,510493186175583 )
Artif Intell Med - Missing data imputation using statistical and machine learning methods in a real breast cancer problem. ( 0,508723935759582 )
J Biomed Inform - Learning Bayesian networks from survival data using weighting censored instances. ( 0,508609879724742 )
IEEE Trans Image Process - Retina verification system based on biometric graph matching. ( 0,507818708682674 )
J Biomed Inform - Classifying temporal relations in clinical data: a hybrid, knowledge-rich approach. ( 0,506658286633589 )
IEEE Trans Pattern Anal Mach Intell - Semi-Supervised Kernel Mean Shift Clustering. ( 0,504744691819286 )
IEEE Trans Vis Comput Graph - Cylinder Detection in Large-Scale Point Cloud of Pipeline Plant. ( 0,503521144271477 )
J Biomed Inform - Transfer learning of classification rules for biomarker discovery and verification from molecular profiling studies. ( 0,503467952682095 )
Comput Math Methods Med - Multiple suboptimal solutions for prediction rules in gene expression data. ( 0,502661230885333 )
J Chem Inf Model - Virtual drug screen schema based on multiview similarity integration and ranking aggregation. ( 0,502065189177295 )
IEEE Trans Neural Netw Learn Syst - Improved Fault Classification in Series Compensated Transmission Line: Comparative Evaluation of Chebyshev Neural Network Training Algorithms. ( 0,50119838004603 )
Neural Comput - Feature selection for ordinal text classification. ( 0,500047397699516 )
Comput Math Methods Med - Reliable RANSAC using a novel preprocessing model. ( 0,499466191651896 )
AMIA Annu Symp Proc - Using hierarchical mixture of experts model for fusion of outbreak detection methods. ( 0,498978596580265 )
Artif Intell Med - Multi-marker tagging single nucleotide polymorphism selection using estimation of distribution algorithms. ( 0,498048002159543 )
J. Comput. Biol. - Detecting non-uniform clusters in large-scale interaction graphs. ( 0,497557449534927 )
IEEE Trans Vis Comput Graph - Point-Based Visualization for Large Hierarchies. ( 0,497521854621893 )
J Med Syst - Neural network approaches to grade adult depression. ( 0,496769375890928 )
Comput. Biol. Med. - Sparse Manifold Clustering and Embedding to discriminate gene expression profiles of glioblastoma and meningioma tumors. ( 0,496020246857434 )
IEEE Trans Image Process - Correlation-coefficient-based fast template matching through partial elimination. ( 0,49551709602474 )
Artif Intell Med - Automatic classification of epilepsy types using ontology-based and genetics-based machine learning. ( 0,495265155135761 )
Comput Methods Programs Biomed - Generating correlated discrete ordinal data using R and SAS IML. ( 0,495256461106295 )
J Med Syst - Application of attribute weighting method based on clustering centers to discrimination of linearly non-separable medical datasets. ( 0,494395544664647 )
IEEE Trans Image Process - Efficiently learning a detection cascade with sparse eigenvectors. ( 0,493983606489896 )
IEEE Trans Image Process - Robust weighted graph transformation matching for rigid and nonrigid image registration. ( 0,492298184084597 )
J Chem Inf Model - COSMOsim3D: 3D-similarity and alignment based on COSMO polarization charge densities. ( 0,490550614828322 )
IEEE Trans Neural Netw Learn Syst - A Unified Framework for Data Visualization and Coclustering. ( 0,490450576056833 )
IEEE Trans Image Process - Maximum a posteriori video super-resolution using a new multichannel image prior. ( 0,490171036558495 )
Comput Math Methods Med - Iterative methods for obtaining energy-minimizing parametric snakes with applications to medical imaging. ( 0,489148157529719 )
BMC Med Inform Decis Mak - Efficient algorithms for fast integration on large data sets from multiple sources. ( 0,489116316365259 )
J Biomed Inform - Complementary ensemble clustering of biomedical data. ( 0,488570285259378 )
Neural Comput - Suitability of V1 energy models for object classification. ( 0,488383684594581 )
IEEE Trans Image Process - Interval-valued fuzzy sets applied to stereo matching of color images. ( 0,486625475576172 )
Artif Intell Med - An implicit approach to deal with periodically repeated medical data. ( 0,486107074545501 )
J Am Med Inform Assoc - Applying active learning to supervised word sense disambiguation in MEDLINE. ( 0,485807981856969 )
J Chem Inf Model - Investigation of the use of spectral clustering for the analysis of molecular data. ( 0,485779506060702 )
J Biomed Inform - Identifying well-formed biomedical phrases in MEDLINE® text. ( 0,485340121296589 )
IEEE Trans Pattern Anal Mach Intell - A Link-Based Approach to the Cluster Ensemble Problem. ( 0,484892719116637 )
Comput Math Methods Med - Correlation kernels for support vector machines classification with applications in cancer data. ( 0,484873492863255 )
Int J Health Geogr - A binary-based approach for detecting irregularly shaped clusters. ( 0,484551824575864 )
J Integr Bioinform - An evolutionary and visual framework for clustering of DNA microarray data. ( 0,484350234695295 )
J Chem Inf Model - Activity-aware clustering of high throughput screening data and elucidation of orthogonal structure-activity relationships. ( 0,483721676678823 )
Comput Methods Programs Biomed - Multiscaled combination of MR and SPECT images in neuroimaging: A simplex method based variable-weight fusion. ( 0,483242851776885 )
IEEE Trans Image Process - Linear discriminant analysis based on L1-norm maximization. ( 0,483241306999263 )
Comput Math Methods Med - A robust rerank approach for feature selection and its application to pooling-based GWA studies. ( 0,483195938079773 )