Artif Intell Med - Improving the Mann-Whitney statistical test for feature selection: an approach in breast cancer diagnosis on mammography.

Topics

{ featur(3375) classif(2383) classifi(1994) }
{ method(2212) result(1239) propos(1039) }
{ perform(999) metric(946) measur(919) }
{ result(1111) use(1088) new(759) }
{ learn(2355) train(1041) set(1003) }
{ cancer(2502) breast(956) screen(824) }
{ system(1976) rule(880) can(841) }
{ cost(1906) reduc(1198) effect(832) }
{ estim(2440) model(1874) function(577) }
{ assess(1506) score(1403) qualiti(1306) }
{ problem(2511) optim(1539) algorithm(950) }
{ design(1359) user(1324) use(1319) }
{ studi(1410) differ(1259) use(1210) }
{ model(2341) predict(2261) use(1141) }
{ visual(1396) interact(850) tool(830) }
{ method(1969) cluster(1462) data(1082) }
{ network(2748) neural(1063) input(814) }
{ system(1050) medic(1026) inform(1018) }
{ spatial(1525) area(1432) region(1030) }
{ record(1888) medic(1808) patient(1693) }
{ first(2504) two(1366) second(1323) }
{ use(2086) technolog(871) perceiv(783) }
{ drug(1928) target(777) effect(648) }
{ decis(3086) make(1611) patient(1517) }
{ data(1737) use(1416) pattern(1282) }
{ measur(2081) correl(1212) valu(896) }
{ treatment(1704) effect(941) patient(846) }
{ care(1570) inform(1187) nurs(1089) }
{ howev(809) still(633) remain(590) }
{ studi(1119) effect(1106) posit(819) }
{ patient(2837) hospit(1953) medic(668) }
{ age(1611) year(1155) adult(843) }
{ patient(1821) servic(1111) care(1106) }
{ analysi(2126) use(1163) compon(1037) }
{ high(1669) rate(1365) level(1280) }
{ implement(1333) system(1263) develop(1122) }
{ model(3404) distribut(989) bayesian(671) }
{ can(774) often(719) complex(702) }
{ imag(1947) propos(1133) code(1026) }
{ inform(2794) health(2639) internet(1427) }
{ imag(1057) registr(996) error(939) }
{ bind(1733) structur(1185) ligand(1036) }
{ sequenc(1873) structur(1644) protein(1328) }
{ method(1219) similar(1157) match(930) }
{ imag(2830) propos(1344) filter(1198) }
{ imag(2675) segment(2577) method(1081) }
{ patient(2315) diseas(1263) diabet(1191) }
{ take(945) account(800) differ(722) }
{ studi(2440) review(1878) systemat(933) }
{ motion(1329) object(1292) video(1091) }
{ surgeri(1148) surgic(1085) robot(1054) }
{ framework(1458) process(801) describ(734) }
{ error(1145) method(1030) estim(1020) }
{ chang(1828) time(1643) increas(1301) }
{ concept(1167) ontolog(924) domain(897) }
{ clinic(1479) use(1117) guidelin(835) }
{ algorithm(1844) comput(1787) effici(935) }
{ extract(1171) text(1153) clinic(932) }
{ method(1557) propos(1049) approach(1037) }
{ data(1714) softwar(1251) tool(1186) }
{ control(1307) perform(991) simul(935) }
{ model(2220) cell(1177) simul(1124) }
{ general(901) number(790) one(736) }
{ method(984) reconstruct(947) comput(926) }
{ search(2224) databas(1162) retriev(909) }
{ featur(1941) imag(1645) propos(1176) }
{ case(1353) use(1143) diagnosi(1136) }
{ data(3963) clinic(1234) research(1004) }
{ risk(3053) factor(974) diseas(938) }
{ research(1085) discuss(1038) issu(1018) }
{ import(1318) role(1303) understand(862) }
{ compound(1573) activ(1297) structur(1058) }
{ perform(1367) use(1326) method(1137) }
{ blood(1257) pressur(1144) flow(957) }
{ health(3367) inform(1360) care(1135) }
{ model(3480) simul(1196) paramet(876) }
{ monitor(1329) mobil(1314) devic(1160) }
{ ehr(2073) health(1662) electron(1139) }
{ state(1844) use(1261) util(961) }
{ research(1218) medic(880) student(794) }
{ model(2656) set(1616) predict(1553) }
{ data(2317) use(1299) case(1017) }
{ medic(1828) order(1363) alert(1069) }
{ signal(2180) analysi(812) frequenc(800) }
{ group(2977) signific(1463) compar(1072) }
{ sampl(1606) size(1419) use(1276) }
{ gene(2352) biolog(1181) express(1162) }
{ data(3008) multipl(1320) sourc(1022) }
{ intervent(3218) particip(2042) group(1664) }
{ activ(1138) subject(705) human(624) }
{ time(1939) patient(1703) rate(768) }
{ can(981) present(881) function(850) }
{ health(1844) social(1437) communiti(874) }
{ structur(1116) can(940) graph(676) }
{ use(976) code(926) identifi(902) }
{ use(1733) differ(960) four(931) }
{ survey(1388) particip(1329) question(1065) }
{ process(1125) use(805) approach(778) }
{ activ(1452) weight(1219) physic(1104) }
{ detect(2391) sensit(1101) algorithm(908) }

Abstract

OBJECTIVE: This work presents the theoretical description and experimental evaluation of a new feature selection method (named uFilter). The uFilter method improves the Mann-Whitney U-test for reducing dimensionality and ranking features in binary classification problems. It also presents a practical application of uFilter to breast cancer computer-aided diagnosis (CADx).

MATERIALS AND METHODS: A total of 720 datasets (ranked subsets of features) were formed by applying chi-square (CHI2) discretization, information gain (IG), one-rule (1Rule), Relief, uFilter and its theoretical basis method (named U-test). Each resulting dataset was used to train feed-forward backpropagation neural network, support vector machine, linear discriminant analysis and naive Bayes machine learning algorithms, producing classification scores for further statistical comparisons.

RESULTS: A head-to-head comparison against the U-test method, based on the mean area under the receiver operating characteristic curve, showed that the uFilter method significantly outperformed the U-test method for almost all classification schemes (p<0.05); it was superior in 50%, tied in 37.5% and lost in 12.5% of the 24 comparative scenarios. Also, the performance of the uFilter method, when compared with the CHI2 discretization, IG, 1Rule and Relief methods, was superior or at least statistically similar on the explored datasets while requiring fewer features.

CONCLUSIONS: The experimental results indicated that the uFilter method statistically outperformed the U-test method and demonstrated performance similar, but not superior, to that of traditional feature selection methods (CHI2 discretization, IG, 1Rule and Relief). The uFilter method showed competitive and appealing cost-effectiveness results in selecting relevant features as a support tool for breast cancer CADx methods, especially in unbalanced dataset contexts. Finally, redundancy analysis as a complementary step to the uFilter method provided an effective way of finding optimal subsets of features without decreasing classification performance.
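
To make the baseline concrete, the following is a minimal sketch of ranking features with the plain Mann-Whitney U statistic in a binary classification setting. It illustrates only the U-test baseline that the abstract names, not the uFilter method itself, whose details are not given here. The function names (`midranks`, `u_statistic`, `rank_features`), the 0/1 label encoding, and the choice to score each feature by how far U / (n_pos * n_neg) lies from 0.5 are assumptions made for illustration.

```python
from typing import List, Sequence, Tuple


def midranks(values: Sequence[float]) -> List[float]:
    """Return 1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def u_statistic(pos: Sequence[float], neg: Sequence[float]) -> float:
    """Mann-Whitney U of the positive class: R_pos - n_pos * (n_pos + 1) / 2."""
    ranks = midranks(list(pos) + list(neg))
    rank_sum_pos = sum(ranks[: len(pos)])
    return rank_sum_pos - len(pos) * (len(pos) + 1) / 2


def rank_features(X: Sequence[Sequence[float]],
                  y: Sequence[int]) -> List[Tuple[int, float]]:
    """Sort feature indices by |U / (n_pos * n_neg) - 0.5|, largest first.

    Hypothetical scoring rule: U / (n_pos * n_neg) equals the area under the
    ROC curve of the single feature, so 0.5 means no class separation.
    """
    pos_rows = [row for row, label in zip(X, y) if label == 1]
    neg_rows = [row for row, label in zip(X, y) if label == 0]
    scores = []
    for j in range(len(X[0])):
        u = u_statistic([r[j] for r in pos_rows], [r[j] for r in neg_rows])
        auc = u / (len(pos_rows) * len(neg_rows))
        scores.append((j, abs(auc - 0.5)))
    return sorted(scores, key=lambda t: t[1], reverse=True)
```

Given a small matrix `X` of feature rows and labels `y`, `rank_features(X, y)` returns `(feature_index, score)` pairs. Keeping the top-scoring features is one simple way to reduce dimensionality before training a classifier.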

Cleaned Abstract

jectiv work address theoret descript experiment evalu new featur select method name ufilt ufilt improv mannwhitney utest reduc dimension rank featur binari classif problem also present practic ufilt applic breast cancer computeraid diagnosi cadxmateri method total dataset rank subset featur form applic chisquar chi discret informationgain ig onerul rule relief ufilt theoret basi method name utest produc dataset use train feedforward backpropag neural network support vector machin linear discrimin analysi naiv bay machin learn algorithm produc classif score statist comparisonsresult headtohead comparison base mean area receiv oper characterist curv score utest method show ufilt method signific outperform utest method almost classif scheme p superior tie lost compar scenario also perform ufilt method compar chi discret ig rule relief method superior least statist similar explor dataset requir less number featuresconclus experiment result indic ufilt method statist outperform utest method demonstr similar superior perform tradit featur select method chi discret ig rule relief ufilt method reveal competit appeal costeffect result select relev featur support tool breast cancer cadx method especi unbalanc dataset context final redund analysi complementari step ufilt method provid us effect way find optim subset featur without decreas classif perform

Similar Abstracts

Comput. Biol. Med. - A DIAMOND method of inducing classification rules for biological data. ( 0,781240636943407 )
IEEE Trans Neural Netw Learn Syst - FREL: A Stable Feature Selection Algorithm. ( 0,770033709320766 )
Comput Biol Chem - newDNA-Prot: Prediction of DNA-binding proteins by employing support vector machine and a comprehensive sequence representation. ( 0,750615589493583 )
Comput Methods Programs Biomed - Supervised hybrid feature selection based on PSO and rough sets for medical diagnosis. ( 0,741032799387565 )
J Med Syst - Support vector machine based diagnostic system for breast cancer using swarm intelligence. ( 0,740218518603747 )
Comput. Biol. Med. - Ensemble classification of colon biopsy images based on information rich hybrid features. ( 0,736680627205427 )
J Med Syst - A three-stage expert system based on support vector machines for thyroid disease diagnosis. ( 0,735675476961036 )
J Biomed Inform - Boosting performance of gene mention tagging system by hybrid methods. ( 0,735005520416356 )
Comput Methods Programs Biomed - Prediction of human breast and colon cancers from imbalanced data using nearest neighbor and support vector machines. ( 0,733654606225236 )
J Am Med Inform Assoc - A sequence labeling approach to link medications and their attributes in clinical notes and clinical trial announcements for information extraction. ( 0,731560018440818 )
Comput Biol Chem - An improved poly(A) motifs recognition method based on decision level fusion. ( 0,724933451909363 )
Comput Methods Programs Biomed - A new hybrid intelligent system for accurate detection of Parkinson's disease. ( 0,707404958159383 )
Comput. Biol. Med. - Disulfide connectivity prediction based on structural information without a prior knowledge of the bonding state of cysteines. ( 0,706992216149419 )
Comput. Biol. Med. - A classification system based on a new wrapper feature selection algorithm for the diagnosis of primary and secondary polycythemia. ( 0,702248202771767 )
Comput. Biol. Med. - A novel class dependent feature selection method for cancer biomarker discovery. ( 0,701548264434839 )
J Am Med Inform Assoc - Applying active learning to supervised word sense disambiguation in MEDLINE. ( 0,696703553751771 )
Comput Methods Programs Biomed - Wrapper feature selection for small sample size data driven by complete error estimates. ( 0,696468915786761 )
Comput. Biol. Med. - Classification of EMG signals using PSO optimized SVM for diagnosis of neuromuscular disorders. ( 0,694826846172451 )
Comput Methods Programs Biomed - Performance comparison of machine learning methods for prognosis of hormone receptor status in breast cancer tissue samples. ( 0,689609984548564 )
J Med Syst - An intelligent system for lung cancer diagnosis using a new genetic algorithm based feature selection method. ( 0,688789164495759 )
J Biomed Inform - A genetic algorithm-support vector machine method with parameter optimization for selecting the tag SNPs. ( 0,688114884547767 )
J Biomed Inform - A medical diagnostic tool based on radial basis function classifiers and evolutionary simulated annealing. ( 0,685014367032192 )
J Med Syst - A comparative study on classification of sleep stage based on EEG signals using feature selection and classification algorithms. ( 0,683036706208088 )
Comput Math Methods Med - Knee joint vibration signal analysis with matching pursuit decomposition and dynamic weighted classifier fusion. ( 0,68218100824076 )
Artif Intell Med - Texture feature ranking with relevance learning to classify interstitial lung disease patterns. ( 0,6774151290487 )
Comput. Biol. Med. - An ensemble system for automatic sleep stage classification using single channel EEG signal. ( 0,67658898868626 )
IEEE Trans Image Process - A novel classification method of halftone image via statistics matrices. ( 0,670051842943368 )
Comput. Biol. Med. - Extracting predictive SNPs in Crohn's disease using a vacillating genetic algorithm and a neural classifier in case-control association studies. ( 0,669947661433869 )
Int J Comput Assist Radiol Surg - Image feature evaluation in two new mammography CAD prototypes. ( 0,669371310226144 )
J Med Syst - Usage of case-based reasoning, neural network and adaptive neuro-fuzzy inference system classification techniques in breast cancer dataset classification diagnosis. ( 0,669343231825758 )
J Med Syst - Application of higher order spectra to identify epileptic EEG. ( 0,667261288269674 )
Comput Methods Programs Biomed - Computer-supported diagnosis for endotension cases in endovascular aortic aneurysm repair evolution. ( 0,663866240073746 )
IEEE J Biomed Health Inform - Extracting and Selecting Distinctive EEG Features for Efficient Epileptic Seizure Prediction. ( 0,663651244014809 )
Artif Intell Med - Computer-aided diagnosis of pulmonary nodules using a two-step approach for feature selection and classifier ensemble construction. ( 0,661681604228973 )
Comput Biol Chem - Derivation of an artificial gene to improve classification accuracy upon gene selection. ( 0,661335205016083 )
Comput Methods Programs Biomed - Hepatitis disease diagnosis using a novel hybrid method based on support vector machine and simulated annealing (SVM-SA). ( 0,660004535981047 )
Comput Math Methods Med - Discrimination between Alzheimer's disease and mild cognitive impairment using SOM and PSO-SVM. ( 0,657469901392649 )
J Biomed Inform - A biological continuum based approach for efficient clinical classification. ( 0,656543906009816 )
Neural Comput - An Infomax algorithm can perform both familiarity discrimination and feature extraction in a single network. ( 0,656142680750221 )
J Integr Bioinform - Modelling proteolytic enzymes with Support Vector Machines. ( 0,655855974146405 )
Comput. Biol. Med. - Contourlet-based mammography mass classification using the SVM family. ( 0,655501079649726 )
Comput Math Methods Med - Determination of fetal state from cardiotocogram using LS-SVM with particle swarm optimization and binary decision tree. ( 0,655161594064866 )
Artif Intell Med - Classification of small lesions on dynamic breast MRI: Integrating dimension reduction and out-of-sample extension into CADx methodology. ( 0,654834656152017 )
Comput Biol Chem - A novel divide-and-merge classification for high dimensional datasets. ( 0,654756661000492 )
J Med Syst - SVM feature selection based rotation forest ensemble classifiers to improve computer-aided diagnosis of Parkinson disease. ( 0,653086756610329 )
J Am Med Inform Assoc - A flexible framework for deriving assertions from electronic medical records. ( 0,652404241786872 )
Artif Intell Med - A modified artificial immune system based pattern recognition approach--an application to clinical diagnostics. ( 0,651334332690253 )
Comput. Biol. Med. - Pairwise FCM based feature weighting for improved classification of vertebral column disorders. ( 0,649527037631503 )
Comput. Biol. Med. - Prediction of pre-miRNA with multiple stem-loops using pruning algorithm. ( 0,648348411301447 )
J Med Syst - Diagnosing breast masses in digital mammography using feature selection and ensemble methods. ( 0,644534979164448 )
Comput Methods Programs Biomed - A hybrid system based on information gain and principal component analysis for the classification of transcranial Doppler signals. ( 0,644231096571726 )
Comput. Biol. Med. - Using machine learning techniques and genomic/proteomic information from known databases for defining relevant features for PPI classification. ( 0,643499806281724 )
J Med Syst - Symptomatic vs. asymptomatic plaque classification in carotid ultrasound. ( 0,643263158653671 )
J Med Syst - A robust multi-class feature selection strategy based on Rotation Forest Ensemble algorithm for diagnosis of Erythemato-Squamous diseases. ( 0,643143592751345 )
Comput Methods Programs Biomed - Automatic cervical cell segmentation and classification in Pap smears. ( 0,642828832729667 )
J Chem Inf Model - Classifier ensemble based on feature selection and diversity measures for predicting the affinity of A(2B) adenosine receptor antagonists. ( 0,639144806798549 )
IEEE Trans Image Process - A novel technique for subpixel image classification based on support vector machine. ( 0,638249734320238 )
IEEE J Biomed Health Inform - Multiple kernel learning in the primal for multimodal Alzheimer's disease classification. ( 0,634973858196474 )
Med Biol Eng Comput - Classification of multichannel EEG patterns using parallel hidden Markov models. ( 0,632199375823537 )
Comput Biol Chem - CE-PLoc: an ensemble classifier for predicting protein subcellular locations by fusing different modes of pseudo amino acid composition. ( 0,632055664791659 )
Comput. Biol. Med. - Heartbeat classification using disease-specific feature selection. ( 0,630677143522233 )
IEEE J Biomed Health Inform - Recognizing common CT imaging signs of lung diseases through a new feature selection method based on Fisher criterion and genetic optimization. ( 0,630372844264209 )
Comput Biol Chem - Information-theoretic approaches to SVM feature selection for metagenome read classification. ( 0,630329541066524 )
J Med Syst - Enhanced cancer recognition system based on random forests feature elimination algorithm. ( 0,628777209981459 )
Comput Methods Programs Biomed - Classification of normal and epileptic seizure EEG signals using wavelet transform, phase-space reconstruction, and Euclidean distance. ( 0,628690342081169 )
Comput Methods Programs Biomed - An improved method of early diagnosis of smoking-induced respiratory changes using machine learning algorithms. ( 0,626850596788158 )
Comput Math Methods Med - Construction of classifier based on MPCA and QSA and its application on classification of pancreatic diseases. ( 0,626836250795396 )
BMC Med Inform Decis Mak - Efficient techniques for genotype-phenotype correlational analysis. ( 0,626385960480043 )
J Med Syst - A new method based for diagnosis of breast cancer cells from microscopic images: DWEE--JHT. ( 0,625225978148021 )
Comput. Biol. Med. - FRAN and RBF-PSO as two components of a hyper framework to recognize protein folds. ( 0,625169660230244 )
Med Biol Eng Comput - SEMG-based hand motion recognition using cumulative residual entropy and extreme learning machine. ( 0,623811743412302 )
J Med Syst - Detection of carotid artery disease by using Learning Vector Quantization Neural Network. ( 0,623386746416301 )
Int J Comput Assist Radiol Surg - Multimodality GPU-based computer-assisted diagnosis of breast cancer using ultrasound and digital mammography images. ( 0,621928486810628 )
J Biomed Inform - Automatic figure classification in bioscience literature. ( 0,621196958824595 )
Artif Intell Med - White box radial basis function classifiers with component selection for clinical prediction models. ( 0,618608390411233 )
Comput Math Methods Med - SVM versus MAP on accelerometer data to distinguish among locomotor activities executed at different speeds. ( 0,618423482802735 )
J Med Syst - Statistical analysis of textural features for improved classification of oral histopathological images. ( 0,616091915637501 )
Comput. Biol. Med. - Gene expression microarray classification using PCA-BEL. ( 0,615755076508178 )
Int J Neural Syst - Single-trial motor imagery classification using asymmetry ratio, phase relation, wavelet-based fractal, and their selected combination. ( 0,613525897700595 )
Comput Biol Chem - Compact cancer biomarkers discovery using a swarm intelligence feature selection algorithm. ( 0,612349071708785 )
Comput Biol Chem - Prediction of protein modification sites of gamma-carboxylation using position specific scoring matrices based evolutionary information. ( 0,611737508644811 )
IEEE J Biomed Health Inform - Classification of color images of dermatological ulcers. ( 0,611568121933745 )
Artif Intell Med - Selective voting in convex-hull ensembles improves classification accuracy. ( 0,610070704888892 )
Comput Math Methods Med - Comparison of different EHG feature selection methods for the detection of preterm labor. ( 0,608798308688615 )
Med Biol Eng Comput - CFS-SMO based classification of breast density using multiple texture models. ( 0,60835479653431 )
Comput. Biol. Med. - A statistical based feature extraction method for breast cancer diagnosis in digital mammogram using multiresolution representation. ( 0,608228696549078 )
IEEE Trans Image Process - Joint framework for motion validity and estimation using block overlap. ( 0,607685651850617 )
J Med Syst - Classification of normal and diseased liver shapes based on Spherical Harmonics coefficients. ( 0,606714611920841 )
IEEE J Biomed Health Inform - Automatic classification of intracardiac tumor and thrombi in echocardiography based on sparse representation. ( 0,6060803236229 )
Int J Neural Syst - Extraction of neural control commands using myoelectric pattern recognition: a novel application in adults with cerebral palsy. ( 0,605782936420573 )
Int J Comput Assist Radiol Surg - Building an ensemble system for diagnosing masses in mammograms. ( 0,6056349459219 )
Sci Data - Scrutinizing the datasets obtained from nanoscale features of spider silk fibres. ( 0,605450537875262 )
Brief. Bioinformatics - Class-imbalanced classifiers for high-dimensional data. ( 0,603765086632785 )
Comput. Biol. Med. - Fast and efficient lung disease classification using hierarchical one-against-all support vector machine and cost-sensitive feature selection. ( 0,603688650662302 )
Comput. Biol. Med. - A new dataset evaluation method based on category overlap. ( 0,603312305962824 )
Comput Methods Programs Biomed - A random forest classifier for lymph diseases. ( 0,601564334123328 )
J Am Med Inform Assoc - A comparative analysis of methods for predicting clinical outcomes using high-dimensional genomic datasets. ( 0,600387444194348 )
Comput. Biol. Med. - Retinal vessel extraction using Lattice Neural Networks with Dendritic Processing. ( 0,600302154368237 )
Comput. Biol. Med. - Neurocognitive disorder detection based on feature vectors extracted from VBM analysis of structural MRI. ( 0,599168438239528 )
Artif Intell Med - Electrocardiogram analysis using a combination of statistical, geometric, and nonlinear heart rate variability features. ( 0,598738272229861 )