J Am Med Inform Assoc - Learning regular expressions for clinical text classification.

Topics

{ featur(3375) classif(2383) classifi(1994) }
{ extract(1171) text(1153) clinic(932) }
{ perform(999) metric(946) measur(919) }
{ sequenc(1873) structur(1644) protein(1328) }
{ result(1111) use(1088) new(759) }
{ problem(2511) optim(1539) algorithm(950) }
{ learn(2355) train(1041) set(1003) }
{ process(1125) use(805) approach(778) }
{ data(2317) use(1299) case(1017) }
{ age(1611) year(1155) adult(843) }
{ assess(1506) score(1403) qualiti(1306) }
{ concept(1167) ontolog(924) domain(897) }
{ design(1359) user(1324) use(1319) }
{ data(3963) clinic(1234) research(1004) }
{ studi(1119) effect(1106) posit(819) }
{ cost(1906) reduc(1198) effect(832) }
{ bind(1733) structur(1185) ligand(1036) }
{ method(1219) similar(1157) match(930) }
{ chang(1828) time(1643) increas(1301) }
{ control(1307) perform(991) simul(935) }
{ method(984) reconstruct(947) comput(926) }
{ compound(1573) activ(1297) structur(1058) }
{ group(2977) signific(1463) compar(1072) }
{ sampl(1606) size(1419) use(1276) }
{ use(1733) differ(960) four(931) }
{ implement(1333) system(1263) develop(1122) }
{ detect(2391) sensit(1101) algorithm(908) }
{ model(3404) distribut(989) bayesian(671) }
{ can(774) often(719) complex(702) }
{ imag(1947) propos(1133) code(1026) }
{ data(1737) use(1416) pattern(1282) }
{ inform(2794) health(2639) internet(1427) }
{ system(1976) rule(880) can(841) }
{ measur(2081) correl(1212) valu(896) }
{ imag(1057) registr(996) error(939) }
{ imag(2830) propos(1344) filter(1198) }
{ network(2748) neural(1063) input(814) }
{ imag(2675) segment(2577) method(1081) }
{ patient(2315) diseas(1263) diabet(1191) }
{ take(945) account(800) differ(722) }
{ studi(2440) review(1878) systemat(933) }
{ motion(1329) object(1292) video(1091) }
{ treatment(1704) effect(941) patient(846) }
{ surgeri(1148) surgic(1085) robot(1054) }
{ framework(1458) process(801) describ(734) }
{ error(1145) method(1030) estim(1020) }
{ clinic(1479) use(1117) guidelin(835) }
{ algorithm(1844) comput(1787) effici(935) }
{ method(1557) propos(1049) approach(1037) }
{ data(1714) softwar(1251) tool(1186) }
{ model(2220) cell(1177) simul(1124) }
{ care(1570) inform(1187) nurs(1089) }
{ general(901) number(790) one(736) }
{ search(2224) databas(1162) retriev(909) }
{ featur(1941) imag(1645) propos(1176) }
{ case(1353) use(1143) diagnosi(1136) }
{ howev(809) still(633) remain(590) }
{ studi(1410) differ(1259) use(1210) }
{ risk(3053) factor(974) diseas(938) }
{ research(1085) discuss(1038) issu(1018) }
{ system(1050) medic(1026) inform(1018) }
{ import(1318) role(1303) understand(862) }
{ model(2341) predict(2261) use(1141) }
{ visual(1396) interact(850) tool(830) }
{ perform(1367) use(1326) method(1137) }
{ blood(1257) pressur(1144) flow(957) }
{ spatial(1525) area(1432) region(1030) }
{ record(1888) medic(1808) patient(1693) }
{ health(3367) inform(1360) care(1135) }
{ model(3480) simul(1196) paramet(876) }
{ monitor(1329) mobil(1314) devic(1160) }
{ ehr(2073) health(1662) electron(1139) }
{ state(1844) use(1261) util(961) }
{ research(1218) medic(880) student(794) }
{ patient(2837) hospit(1953) medic(668) }
{ model(2656) set(1616) predict(1553) }
{ medic(1828) order(1363) alert(1069) }
{ signal(2180) analysi(812) frequenc(800) }
{ gene(2352) biolog(1181) express(1162) }
{ data(3008) multipl(1320) sourc(1022) }
{ first(2504) two(1366) second(1323) }
{ intervent(3218) particip(2042) group(1664) }
{ activ(1138) subject(705) human(624) }
{ time(1939) patient(1703) rate(768) }
{ patient(1821) servic(1111) care(1106) }
{ use(2086) technolog(871) perceiv(783) }
{ can(981) present(881) function(850) }
{ analysi(2126) use(1163) compon(1037) }
{ health(1844) social(1437) communiti(874) }
{ structur(1116) can(940) graph(676) }
{ high(1669) rate(1365) level(1280) }
{ cancer(2502) breast(956) screen(824) }
{ use(976) code(926) identifi(902) }
{ drug(1928) target(777) effect(648) }
{ survey(1388) particip(1329) question(1065) }
{ estim(2440) model(1874) function(577) }
{ decis(3086) make(1611) patient(1517) }
{ activ(1452) weight(1219) physic(1104) }
{ method(1969) cluster(1462) data(1082) }
{ method(2212) result(1239) propos(1039) }

Abstract

OBJECTIVES: Natural language processing (NLP) applications typically use regular expressions that have been developed manually by human experts. Our goal is to automate both the creation and utilization of regular expressions in text classification.

METHODS: We designed a novel regular expression discovery (RED) algorithm and implemented two text classifiers based on RED. The RED+ALIGN classifier combines RED with an alignment algorithm, and RED+SVM combines RED with a support vector machine (SVM) classifier. Two clinical datasets were used for testing and evaluation: the SMOKE dataset, containing 1091 text snippets describing smoking status; and the PAIN dataset, containing 702 snippets describing pain status. We performed 10-fold cross-validation to calculate accuracy, precision, recall, and F-measure metrics. In the evaluation, an SVM classifier was trained as the control.

RESULTS: The two RED classifiers achieved 80.9-83.0% overall accuracy on the two datasets, which is 1.3-3% higher than SVM's accuracy (p<0.001). Similarly, small but consistent improvements were observed in precision, recall, and F-measure when the RED classifiers were compared with SVM alone. More notably, RED+ALIGN correctly classified many instances that were misclassified by the SVM classifier (8.1-10.3% of the total instances and 43.8-53.0% of SVM's misclassifications).

CONCLUSIONS: Machine-generated regular expressions can be effectively used in clinical text classification. The regular expression-based classifier can be combined with other classifiers, like SVM, to improve classification performance.
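The evaluation protocol named in the abstract (k-fold cross-validation reporting accuracy, precision, recall, and F-measure) can be illustrated with a minimal Python sketch. This is not the RED algorithm, which the abstract does not specify: `learn_pattern` is a hypothetical toy stand-in that builds one regular expression from words seen only in positive training snippets. The snippets in `TOY_DATA` are invented for illustration and do not come from the SMOKE dataset. Pooling predictions across folds before computing metrics is also an assumption, since the abstract does not say how fold results were combined.

```python
import re

# Hypothetical toy snippets, invented for illustration (not from the SMOKE dataset).
TOY_DATA = [
    ("patient smokes one pack per day", "smoker"),
    ("denies any tobacco use", "non-smoker"),
    ("current smoker, 20 pack years", "smoker"),
    ("never smoked", "non-smoker"),
    ("smokes cigars on weekends", "smoker"),
    ("no history of tobacco use", "non-smoker"),
    ("current smoker, wants to quit", "smoker"),
    ("denies smoking and alcohol", "non-smoker"),
    ("smokes half a pack daily", "smoker"),
    ("never used tobacco products", "non-smoker"),
]


def kfold_indices(n, k):
    """Split range(n) into k contiguous folds of near-equal size."""
    folds, start = [], 0
    for i in range(k):
        size = n // k + (1 if i < n % k else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def learn_pattern(snippets, labels, positive):
    """Toy stand-in for regex learning (NOT the paper's RED algorithm):
    an alternation of words that occur only in positive training snippets."""
    pos_words, neg_words = set(), set()
    for text, label in zip(snippets, labels):
        words = set(re.findall(r"[a-z]+", text.lower()))
        (pos_words if label == positive else neg_words).update(words)
    keywords = sorted(pos_words - neg_words)
    if not keywords:
        return None
    return re.compile(r"\b(?:" + "|".join(map(re.escape, keywords)) + r")\b")


def classify(pattern, text, positive, negative):
    """Label a snippet positive if the learned pattern matches it."""
    if pattern is not None and pattern.search(text.lower()):
        return positive
    return negative


def binary_metrics(gold, pred, positive):
    """Accuracy, precision, recall, and F-measure for one positive class."""
    pairs = list(zip(gold, pred))
    tp = sum(g == positive and p == positive for g, p in pairs)
    fp = sum(g != positive and p == positive for g, p in pairs)
    fn = sum(g == positive and p != positive for g, p in pairs)
    accuracy = sum(g == p for g, p in pairs) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f_measure = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f_measure": f_measure}


def cross_validate(data, k, positive, negative):
    """Train on k-1 folds, predict the held-out fold, pool all predictions."""
    snippets = [text for text, _ in data]
    labels = [label for _, label in data]
    gold, pred = [], []
    for fold in kfold_indices(len(data), k):
        held_out = set(fold)
        train_x = [s for i, s in enumerate(snippets) if i not in held_out]
        train_y = [y for i, y in enumerate(labels) if i not in held_out]
        pattern = learn_pattern(train_x, train_y, positive)
        for i in fold:
            gold.append(labels[i])
            pred.append(classify(pattern, snippets[i], positive, negative))
    return binary_metrics(gold, pred, positive)


if __name__ == "__main__":
    print(cross_validate(TOY_DATA, k=5, positive="smoker", negative="non-smoker"))
```

With a dataset the size of SMOKE or PAIN, `k=10` would match the 10-fold setup described in the abstract; the toy data here only has enough snippets for a smaller `k`.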

Cleaned Abstract

object natur languag process nlp applic typic use regular express develop manual human expert goal autom creation util regular express text classif method design novel regular express discoveri red algorithm implement two text classifi base red redalign classifi combin red align algorithm redsvm combin red support vector machin svm classifi two clinic dataset use test evalu smoke dataset contain text snippet describ smoke status pain dataset contain snippet describ pain status perform fold crossvalid calcul accuraci precis recal fmeasur metric evalu svm classifi train control result two red classifi achiev overal accuraci two dataset higher svms accuraci p similar small consist improv observ precis recal fmeasur red classifi compar svm alon signific redalign correct classifi mani instanc misclassifi svm classifi total instanc svms misclassif conclus machinegener regular express can effect use clinic text classif regular expressionbas classifi can combin classifi like svm improv classif perform

Similar Abstracts

Comput Biol Chem - Information-theoretic approaches to SVM feature selection for metagenome read classification. ( 0,8397510058164 )
J Biomed Inform - Automatic figure classification in bioscience literature. ( 0,836553025590064 )
AMIA Annu Symp Proc - Word Sense Disambiguation of clinical abbreviations with hyperdimensional computing. ( 0,824477722061327 )
J Med Syst - Enhanced cancer recognition system based on random forests feature elimination algorithm. ( 0,807765564687746 )
J Biomed Inform - A biological continuum based approach for efficient clinical classification. ( 0,784031320992674 )
Comput. Biol. Med. - An ensemble system for automatic sleep stage classification using single channel EEG signal. ( 0,781242835325487 )
Artif Intell Med - Texture feature ranking with relevance learning to classify interstitial lung disease patterns. ( 0,781078633285147 )
Comput Math Methods Med - Comparison of different EHG feature selection methods for the detection of preterm labor. ( 0,776192345649466 )
Artif Intell Med - Document classification for mining host pathogen protein-protein interactions. ( 0,775426820283273 )
J Med Syst - SVM feature selection based rotation forest ensemble classifiers to improve computer-aided diagnosis of Parkinson disease. ( 0,767655402724556 )
Comput Biol Chem - A novel divide-and-merge classification for high dimensional datasets. ( 0,767505663793801 )
J Am Med Inform Assoc - Text mining for the Vaccine Adverse Event Reporting System: medical text classification using informative feature selection. ( 0,766908352141786 )
J Med Syst - A robust multi-class feature selection strategy based on Rotation Forest Ensemble algorithm for diagnosis of Erythemato-Squamous diseases. ( 0,765013715904686 )
J Am Med Inform Assoc - Influenza detection from emergency department reports using natural language processing and Bayesian network classifiers. ( 0,764377516892489 )
Comput Math Methods Med - SVM versus MAP on accelerometer data to distinguish among locomotor activities executed at different speeds. ( 0,761281416129229 )
J Med Syst - Classification of speech dysfluencies using LPC based parameterization techniques. ( 0,760599392197514 )
J Med Syst - An intelligent system for lung cancer diagnosis using a new genetic algorithm based feature selection method. ( 0,755819858785049 )
J Am Med Inform Assoc - A system for coreference resolution for the clinical narrative. ( 0,753652748556038 )
Comput. Biol. Med. - Pairwise FCM based feature weighting for improved classification of vertebral column disorders. ( 0,751346373761496 )
J Chem Inf Model - Large-scale learning of structure-activity relationships using a linear support vector machine and problem-specific metrics. ( 0,750631638007543 )
J Med Syst - A new approach for concealed information identification based on ERP assessment. ( 0,75031640460326 )
Comput. Biol. Med. - Fast and efficient lung disease classification using hierarchical one-against-all support vector machine and cost-sensitive feature selection. ( 0,750184912597935 )
AMIA Annu Symp Proc - Automatic identification of critical follow-up recommendation sentences in radiology reports. ( 0,74973497662908 )
Comput. Biol. Med. - Contourlet-based mammography mass classification using the SVM family. ( 0,746911224591573 )
IEEE Trans Image Process - A unified feature and instance selection framework using optimum experimental design. ( 0,740951018700105 )
Comput Biol Chem - Derivation of an artificial gene to improve classification accuracy upon gene selection. ( 0,740476946722806 )
Comput. Biol. Med. - A new dataset evaluation method based on category overlap. ( 0,739422518511017 )
Comput Biol Chem - Compact cancer biomarkers discovery using a swarm intelligence feature selection algorithm. ( 0,73699412409409 )
Comput Methods Programs Biomed - Automatic cervical cell segmentation and classification in Pap smears. ( 0,736658956635493 )
J Med Syst - A comparative study on classification of sleep stage based on EEG signals using feature selection and classification algorithms. ( 0,735735467592358 )
J Am Med Inform Assoc - N-gram support vector machines for scalable procedure and diagnosis classification, with applications to clinical free text data from the intensive care unit. ( 0,734580938192509 )
Comput Math Methods Med - Discrimination between Alzheimer's disease and mild cognitive impairment using SOM and PSO-SVM. ( 0,734395872956679 )
Comput. Biol. Med. - Disulfide connectivity prediction based on structural information without a prior knowledge of the bonding state of cysteines. ( 0,732561912271286 )
J Biomed Inform - Boosting performance of gene mention tagging system by hybrid methods. ( 0,732276253068762 )
J Am Med Inform Assoc - A comprehensive study of named entity recognition in Chinese clinical text. ( 0,731995191709148 )
Comput. Biol. Med. - A novel class dependent feature selection method for cancer biomarker discovery. ( 0,731025001276641 )
J Biomed Inform - A fast gene selection method for multi-cancer classification using multiple support vector data description. ( 0,728279082941098 )
Comput. Biol. Med. - SVM-based feature selection to optimize sensitivity-specificity balance applied to weaning. ( 0,72799222375031 )
Int J Comput Assist Radiol Surg - Building an ensemble system for diagnosing masses in mammograms. ( 0,727654613836278 )
Int J Neural Syst - Single-trial motor imagery classification using asymmetry ratio, phase relation, wavelet-based fractal, and their selected combination. ( 0,726033187032742 )
Comput. Biol. Med. - On the relevance of automatically selected single-voxel MRS and multimodal MRI and MRSI features for brain tumour differentiation. ( 0,725432035750488 )
Comput. Biol. Med. - A classification system based on a new wrapper feature selection algorithm for the diagnosis of primary and secondary polycythemia. ( 0,725216290204131 )
Comput. Biol. Med. - Heartbeat classification using disease-specific feature selection. ( 0,725190655400184 )
Comput Methods Programs Biomed - A new hybrid intelligent system for accurate detection of Parkinson's disease. ( 0,723934839680308 )
J Am Med Inform Assoc - Practical implementation of an existing smoking detection pipeline and reduced support vector machine training corpus requirements. ( 0,723337445018256 )
Artif Intell Med - Computer-aided diagnosis of pulmonary nodules using a two-step approach for feature selection and classifier ensemble construction. ( 0,722382196261664 )
J Am Med Inform Assoc - Using statistical text classification to identify health information technology incidents. ( 0,720163632169625 )
Comput. Biol. Med. - Classification of EMG signals using PSO optimized SVM for diagnosis of neuromuscular disorders. ( 0,718048487926959 )
AMIA Annu Symp Proc - Identifying discourse connectives in biomedical text. ( 0,718039419699003 )
J Am Med Inform Assoc - A comparative analysis of methods for predicting clinical outcomes using high-dimensional genomic datasets. ( 0,717204026769362 )
IEEE J Biomed Health Inform - Recognizing common CT imaging signs of lung diseases through a new feature selection method based on Fisher criterion and genetic optimization. ( 0,715705731437112 )
Int J Comput Assist Radiol Surg - Multimodality GPU-based computer-assisted diagnosis of breast cancer using ultrasound and digital mammography images. ( 0,714875501019952 )
J Med Syst - Luminance sticker based facial expression recognition using discrete wavelet transform for physically disabled persons. ( 0,713063210454727 )
Comput Biol Chem - CE-PLoc: an ensemble classifier for predicting protein subcellular locations by fusing different modes of pseudo amino acid composition. ( 0,711068364454445 )
Comput Methods Programs Biomed - A random forest classifier for lymph diseases. ( 0,709525750961344 )
J Digit Imaging - Computer-aided diagnosis of malignant mammograms using Zernike moments and SVM. ( 0,708655895091666 )
Artif Intell Med - Selective voting in convex-hull ensembles improves classification accuracy. ( 0,70676434644533 )
Artif Intell Med - An intelligent classifier for prognosis of cardiac resynchronization therapy based on speckle-tracking echocardiograms. ( 0,705914674751434 )
Int J Neural Syst - Extraction of neural control commands using myoelectric pattern recognition: a novel application in adults with cerebral palsy. ( 0,705613535504707 )
IEEE Trans Image Process - A novel technique for subpixel image classification based on support vector machine. ( 0,701457978758507 )
Comput Math Methods Med - An ensemble-of-classifiers based approach for early diagnosis of Alzheimer's disease: classification using structural features of brain images. ( 0,700811523902019 )
Artif Intell Med - Classification of small lesions on dynamic breast MRI: Integrating dimension reduction and out-of-sample extension into CADx methodology. ( 0,700357324034734 )
Comput. Biol. Med. - Ensemble selection for feature-based classification of diabetic maculopathy images. ( 0,699124702329023 )
Brief. Bioinformatics - Class-imbalanced classifiers for high-dimensional data. ( 0,697724710301611 )
J Am Med Inform Assoc - Pneumonia identification using statistical feature selection. ( 0,697645547961074 )
J Med Syst - A three-stage expert system based on support vector machines for thyroid disease diagnosis. ( 0,697092768523647 )
Comput Methods Programs Biomed - Computer-supported diagnosis for endotension cases in endovascular aortic aneurysm repair evolution. ( 0,696761580754399 )
J Am Med Inform Assoc - Feature engineering combined with machine learning and rule-based methods for structured information extraction from narrative clinical discharge summaries. ( 0,694689409460538 )
J Am Med Inform Assoc - Machine-learned solutions for three stages of clinical information extraction: the state of the art at i2b2 2010. ( 0,691474758380358 )
J Biomed Inform - Dynamic categorization of clinical research eligibility criteria by hierarchical clustering. ( 0,691169343378125 )
Comput Methods Programs Biomed - Computer-aided diagnosis system: a Bayesian hybrid classification method. ( 0,690070889248081 )
J Med Syst - A new expert system for diagnosis of lung cancer: GDA-LS_SVM. ( 0,68878494249114 )
Comput. Biol. Med. - Gene expression microarray classification using PCA-BEL. ( 0,688154093835483 )
Med Biol Eng Comput - Evaluation of feature extraction methods for EEG-based brain-computer interfaces in terms of robustness to slight changes in electrode locations. ( 0,686252474618291 )
J Med Syst - Detection of carotid artery disease by using Learning Vector Quantization Neural Network. ( 0,686127459323228 )
Comput. Biol. Med. - A hybrid feature selection method for DNA microarray data. ( 0,684963064257411 )
J Biomed Inform - Building an automated SOAP classifier for emergency department reports. ( 0,684529455323324 )
J Med Syst - Automated diagnosis of Alzheimer disease using the scale-invariant feature transforms in magnetic resonance images. ( 0,683645766944588 )
Int J Neural Syst - Combination of heterogeneous EEG feature extraction methods and stacked sequential learning for sleep stage classification. ( 0,683169674305206 )
Comput Methods Programs Biomed - Understanding symptomatology of atherosclerotic plaque by image-based tissue characterization. ( 0,682736279239188 )
Artif Intell Med - Improving the accuracy of suicide attempter classification. ( 0,681629650565827 )
J Biomed Inform - Ontology-guided feature engineering for clinical text classification. ( 0,679724176156703 )
Med Biol Eng Comput - Wavelet-based sparse functional linear model with applications to EEGs seizure detection and epilepsy diagnosis. ( 0,677201791306236 )
J Chem Inf Model - Classifying molecules using a sparse probabilistic kernel binary classifier. ( 0,676216962441803 )
Artif Intell Med - Selection of effective features for ECG beat recognition based on nonlinear correlations. ( 0,675561406162102 )
IEEE Trans Image Process - Human detection in images via piecewise linear support vector machines. ( 0,674815005187971 )
Comput. Biol. Med. - Computer-aided diagnosis system for the Acute Respiratory Distress Syndrome from chest radiographs. ( 0,670340985956734 )
BMC Med Inform Decis Mak - Recognition of medication information from discharge summaries using ensembles of classifiers. ( 0,668948652724646 )
J Biomed Inform - Enhancing clinical concept extraction with distributional semantics. ( 0,668589940835443 )
Comput Math Methods Med - Mixed-norm regularization for brain decoding. ( 0,668466832498153 )
J Am Med Inform Assoc - A sequence labeling approach to link medications and their attributes in clinical notes and clinical trial announcements for information extraction. ( 0,668066165767593 )
Comput. Biol. Med. - Ant colony optimization-based feature selection method for surface electromyography signals classification. ( 0,667657789031633 )
J Med Syst - Statistical analysis of textural features for improved classification of oral histopathological images. ( 0,664795247633003 )
J Med Syst - Similarity-dissimilarity plot for visualization of high dimensional data in biomedical pattern classification. ( 0,663828441998012 )
Comput. Biol. Med. - Using machine learning techniques and genomic/proteomic information from known databases for defining relevant features for PPI classification. ( 0,662513298867882 )
J Integr Bioinform - On the parameter optimization of Support Vector Machines for binary classification. ( 0,66198831243958 )
Comput Biol Chem - Multi objective SNP selection using pareto optimality. ( 0,661731571989296 )
IEEE J Biomed Health Inform - Automatic detection of atrial fibrillation in cardiac vibration signals. ( 0,661233806621105 )
Comput Methods Programs Biomed - An improved method of early diagnosis of smoking-induced respiratory changes using machine learning algorithms. ( 0,660371252745515 )
Neural Comput - An Infomax algorithm can perform both familiarity discrimination and feature extraction in a single network. ( 0,660371252745515 )