J Am Med Inform Assoc - Practical implementation of an existing smoking detection pipeline and reduced support vector machine training corpus requirements.

Topics

{ featur(3375) classif(2383) classifi(1994) }
{ extract(1171) text(1153) clinic(932) }
{ cost(1906) reduc(1198) effect(832) }
{ learn(2355) train(1041) set(1003) }
{ group(2977) signific(1463) compar(1072) }
{ detect(2391) sensit(1101) algorithm(908) }
{ perform(1367) use(1326) method(1137) }
{ measur(2081) correl(1212) valu(896) }
{ featur(1941) imag(1645) propos(1176) }
{ studi(1410) differ(1259) use(1210) }
{ research(1085) discuss(1038) issu(1018) }
{ ehr(2073) health(1662) electron(1139) }
{ sampl(1606) size(1419) use(1276) }
{ patient(1821) servic(1111) care(1106) }
{ imag(1057) registr(996) error(939) }
{ error(1145) method(1030) estim(1020) }
{ chang(1828) time(1643) increas(1301) }
{ data(1714) softwar(1251) tool(1186) }
{ howev(809) still(633) remain(590) }
{ data(3963) clinic(1234) research(1004) }
{ perform(999) metric(946) measur(919) }
{ studi(1119) effect(1106) posit(819) }
{ data(2317) use(1299) case(1017) }
{ medic(1828) order(1363) alert(1069) }
{ structur(1116) can(940) graph(676) }
{ use(976) code(926) identifi(902) }
{ implement(1333) system(1263) develop(1122) }
{ process(1125) use(805) approach(778) }
{ model(3404) distribut(989) bayesian(671) }
{ can(774) often(719) complex(702) }
{ imag(1947) propos(1133) code(1026) }
{ data(1737) use(1416) pattern(1282) }
{ inform(2794) health(2639) internet(1427) }
{ system(1976) rule(880) can(841) }
{ bind(1733) structur(1185) ligand(1036) }
{ sequenc(1873) structur(1644) protein(1328) }
{ method(1219) similar(1157) match(930) }
{ imag(2830) propos(1344) filter(1198) }
{ network(2748) neural(1063) input(814) }
{ imag(2675) segment(2577) method(1081) }
{ patient(2315) diseas(1263) diabet(1191) }
{ take(945) account(800) differ(722) }
{ studi(2440) review(1878) systemat(933) }
{ motion(1329) object(1292) video(1091) }
{ assess(1506) score(1403) qualiti(1306) }
{ treatment(1704) effect(941) patient(846) }
{ surgeri(1148) surgic(1085) robot(1054) }
{ framework(1458) process(801) describ(734) }
{ problem(2511) optim(1539) algorithm(950) }
{ concept(1167) ontolog(924) domain(897) }
{ clinic(1479) use(1117) guidelin(835) }
{ algorithm(1844) comput(1787) effici(935) }
{ method(1557) propos(1049) approach(1037) }
{ design(1359) user(1324) use(1319) }
{ control(1307) perform(991) simul(935) }
{ model(2220) cell(1177) simul(1124) }
{ care(1570) inform(1187) nurs(1089) }
{ general(901) number(790) one(736) }
{ method(984) reconstruct(947) comput(926) }
{ search(2224) databas(1162) retriev(909) }
{ case(1353) use(1143) diagnosi(1136) }
{ risk(3053) factor(974) diseas(938) }
{ system(1050) medic(1026) inform(1018) }
{ import(1318) role(1303) understand(862) }
{ model(2341) predict(2261) use(1141) }
{ visual(1396) interact(850) tool(830) }
{ compound(1573) activ(1297) structur(1058) }
{ blood(1257) pressur(1144) flow(957) }
{ spatial(1525) area(1432) region(1030) }
{ record(1888) medic(1808) patient(1693) }
{ health(3367) inform(1360) care(1135) }
{ model(3480) simul(1196) paramet(876) }
{ monitor(1329) mobil(1314) devic(1160) }
{ state(1844) use(1261) util(961) }
{ research(1218) medic(880) student(794) }
{ patient(2837) hospit(1953) medic(668) }
{ model(2656) set(1616) predict(1553) }
{ age(1611) year(1155) adult(843) }
{ signal(2180) analysi(812) frequenc(800) }
{ gene(2352) biolog(1181) express(1162) }
{ data(3008) multipl(1320) sourc(1022) }
{ first(2504) two(1366) second(1323) }
{ intervent(3218) particip(2042) group(1664) }
{ activ(1138) subject(705) human(624) }
{ time(1939) patient(1703) rate(768) }
{ use(2086) technolog(871) perceiv(783) }
{ can(981) present(881) function(850) }
{ analysi(2126) use(1163) compon(1037) }
{ health(1844) social(1437) communiti(874) }
{ high(1669) rate(1365) level(1280) }
{ cancer(2502) breast(956) screen(824) }
{ use(1733) differ(960) four(931) }
{ drug(1928) target(777) effect(648) }
{ result(1111) use(1088) new(759) }
{ survey(1388) particip(1329) question(1065) }
{ estim(2440) model(1874) function(577) }
{ decis(3086) make(1611) patient(1517) }
{ activ(1452) weight(1219) physic(1104) }
{ method(1969) cluster(1462) data(1082) }
{ method(2212) result(1239) propos(1039) }

Abstract

This study aimed to reduce reliance on large training datasets in support vector machine (SVM)-based clinical text analysis by categorizing keyword features. An enhanced Mayo smoking status detection pipeline was deployed. We used a corpus of 709 annotated patient narratives. The pipeline was optimized for local data entry practice and lexicon. The SVM classifier was retrained using a grouped keyword approach for better efficiency. Accuracy, precision, and F-measure of the unaltered and optimized pipelines were evaluated using k-fold cross-validation. Initial accuracy of the clinical Text Analysis and Knowledge Extraction System (cTAKES) package was 0.69. Localization and keyword grouping improved system accuracy to 0.90 and 0.92, respectively. F-measures for the current and past smoker classes improved from 0.43 to 0.81 and from 0.71 to 0.91, respectively. Non-smoker and unknown-class F-measures were 0.96 and 0.98, respectively. Keyword grouping had no negative effect on performance and decreased training time. Grouping keywords is a practical method to reduce training corpus size.
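
The grouped keyword idea can be illustrated with a minimal sketch. The abstract does not list the keyword groups or the feature encoding that were actually used, so the `KEYWORD_GROUPS` table, the tokenizer, and the function names below are hypothetical and serve only to show how mapping many surface keywords onto a few group features shrinks the feature space an SVM has to learn from. A per-class F-measure helper is included because that is the metric the abstract reports.

```python
from collections import Counter

# Hypothetical keyword-to-group table (assumption: the real groups are not
# given in the abstract). Several surface forms share one group feature.
KEYWORD_GROUPS = {
    "smoke": "TOBACCO_USE",
    "smokes": "TOBACCO_USE",
    "smoker": "TOBACCO_USE",
    "cigarette": "TOBACCO_USE",
    "cigarettes": "TOBACCO_USE",
    "tobacco": "TOBACCO_USE",
    "quit": "CESSATION",
    "stopped": "CESSATION",
    "former": "CESSATION",
    "never": "NEGATION",
    "denies": "NEGATION",
}


def tokenize(text):
    """Lowercase, split on whitespace, and trim common punctuation."""
    return [tok.strip(".,;:()") for tok in text.lower().split()]


def keyword_features(text):
    """Ungrouped baseline: one count feature per individual keyword."""
    return Counter(t for t in tokenize(text) if t in KEYWORD_GROUPS)


def grouped_features(text):
    """Grouped variant: one count feature per keyword group."""
    return Counter(KEYWORD_GROUPS[t] for t in tokenize(text) if t in KEYWORD_GROUPS)


def f_measure(gold, predicted, label):
    """Balanced F-measure (F1) for a single class label."""
    pairs = list(zip(gold, predicted))
    tp = sum(1 for g, p in pairs if g == label and p == label)
    fp = sum(1 for g, p in pairs if g != label and p == label)
    fn = sum(1 for g, p in pairs if g == label and p != label)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    note = "Patient denies tobacco use; never smoked cigarettes."
    print(dict(keyword_features(note)))  # four distinct keyword features
    print(dict(grouped_features(note)))  # two group features
```

In this toy note, four distinct keyword features collapse into two group features. With fewer features to weight, a classifier plausibly needs fewer annotated examples to see each one often enough, which is the motivation the abstract gives for grouping.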

Cleaned Abstract

studi aim reduc relianc larg train dataset support vector machin svmbase clinic text analysi categor keyword featur enhanc mayo smoke status detect pipelin deploy use corpus annot patient narrat pipelin optim local data entri practic lexicon svm classifi retrain use group keyword approach better effici accuraci precis fmeasur unalt optim pipelin evalu use kfold crossvalid initi accuraci clinic text analysi knowledg extract system ctake packag local keyword group improv system accuraci respect fmeasur current past smoker class improv respect nonsmok unknownclass fmeasur respect keyword group negat effect perform decreas train time group keyword practic method reduc train corpus size

Similar Abstracts

J Am Med Inform Assoc - Text mining for the Vaccine Adverse Event Reporting System: medical text classification using informative feature selection. ( 0,74991248223288 )
J Am Med Inform Assoc - Learning regular expressions for clinical text classification. ( 0,723337445018256 )
AMIA Annu Symp Proc - Word Sense Disambiguation of clinical abbreviations with hyperdimensional computing. ( 0,719752814332141 )
J Am Med Inform Assoc - Machine-learned solutions for three stages of clinical information extraction: the state of the art at i2b2 2010. ( 0,715990772181424 )
BMC Med Inform Decis Mak - Recognition of medication information from discharge summaries using ensembles of classifiers. ( 0,714925484766491 )
J Am Med Inform Assoc - A comprehensive study of named entity recognition in Chinese clinical text. ( 0,708900637451485 )
AMIA Annu Symp Proc - Identifying discourse connectives in biomedical text. ( 0,698688156585824 )
J Med Syst - A new approach for concealed information identification based on ERP assessment. ( 0,686221831782177 )
J Am Med Inform Assoc - Vaccine adverse event text mining system for extracting features from vaccine safety reports. ( 0,669965013050886 )
J Digit Imaging - Computer-aided diagnosis of malignant mammograms using Zernike moments and SVM. ( 0,665101059746525 )
Artif Intell Med - Document classification for mining host pathogen protein-protein interactions. ( 0,660869831336548 )
J Biomed Inform - Dynamic categorization of clinical research eligibility criteria by hierarchical clustering. ( 0,656323095417206 )
AMIA Annu Symp Proc - TagLine: Information Extraction for Semi-Structured Text in Medical Progress Notes. ( 0,652874375274073 )
J Med Syst - Enhanced cancer recognition system based on random forests feature elimination algorithm. ( 0,649507226659012 )
J Am Med Inform Assoc - Missing values in deduplication of electronic patient data. ( 0,643632579170326 )
J Biomed Inform - A biological continuum based approach for efficient clinical classification. ( 0,642752418877887 )
Comput Biol Chem - Information-theoretic approaches to SVM feature selection for metagenome read classification. ( 0,639273723286225 )
J Integr Bioinform - On the parameter optimization of Support Vector Machines for binary classification. ( 0,638625106801002 )
J Biomed Inform - An enhanced CRFs-based system for information extraction from radiology reports. ( 0,637998631284681 )
J Am Med Inform Assoc - Using statistical text classification to identify health information technology incidents. ( 0,637890315657027 )
J Am Med Inform Assoc - A sequence labeling approach to link medications and their attributes in clinical notes and clinical trial announcements for information extraction. ( 0,636252048017671 )
J Biomed Inform - Enhancing clinical concept extraction with distributional semantics. ( 0,635864891582084 )
AMIA Annu Symp Proc - Automatic identification of critical follow-up recommendation sentences in radiology reports. ( 0,633227775939643 )
IEEE J Biomed Health Inform - Automatic detection of atrial fibrillation in cardiac vibration signals. ( 0,632689905038583 )
J Am Med Inform Assoc - Pneumonia identification using statistical feature selection. ( 0,632524378741558 )
J Am Med Inform Assoc - A system for coreference resolution for the clinical narrative. ( 0,631456352882725 )
J Med Syst - Automated screening of arrhythmia using wavelet based machine learning techniques. ( 0,628905278275652 )
AMIA Annu Symp Proc - Naïve Electronic Health Record phenotype identification for Rheumatoid arthritis. ( 0,628104246484763 )
J Am Med Inform Assoc - Automatic discourse connective detection in biomedical text. ( 0,627081822327504 )
J Med Syst - Luminance sticker based facial expression recognition using discrete wavelet transform for physically disabled persons. ( 0,623155759821858 )
BMC Med Inform Decis Mak - Recognizing clinical entities in hospital discharge summaries using Structural Support Vector Machines with word representation features. ( 0,623065603253161 )
J Am Med Inform Assoc - Applying active learning to supervised word sense disambiguation in MEDLINE. ( 0,620173672946042 )
J Biomed Inform - Automatic figure classification in bioscience literature. ( 0,619282285765373 )
J Med Syst - Classification of speech dysfluencies using LPC based parameterization techniques. ( 0,61379081245329 )
J Med Syst - A computer aided diagnosis system for thyroid disease using extreme learning machine. ( 0,61035093491607 )
J Am Med Inform Assoc - Comprehensive temporal information detection from clinical text: medical events, time, and TLINK identification. ( 0,610166587112582 )
J Am Med Inform Assoc - Feature engineering combined with machine learning and rule-based methods for structured information extraction from narrative clinical discharge summaries. ( 0,609136972766864 )
AMIA Annu Symp Proc - Detecting abbreviations in discharge summaries using machine learning methods. ( 0,609099453746734 )
J Med Syst - Detection and localization of myocardial infarction using K-nearest neighbor classifier. ( 0,607444481528084 )
Comput Methods Programs Biomed - Computer-aided diagnosis system: a Bayesian hybrid classification method. ( 0,606337538095405 )
IEEE Trans Image Process - A novel technique for subpixel image classification based on support vector machine. ( 0,606145206091633 )
Comput Math Methods Med - An ensemble-of-classifiers based approach for early diagnosis of Alzheimer's disease: classification using structural features of brain images. ( 0,600191827984276 )
Int J Comput Assist Radiol Surg - Building an ensemble system for diagnosing masses in mammograms. ( 0,599319115580836 )
AMIA Annu Symp Proc - Automatically classifying the role of citations in biomedical articles. ( 0,598435645595877 )
Artif Intell Med - Conceptual-driven classification for coding advise in health insurance reimbursement. ( 0,598319214075381 )
J Biomed Inform - A natural language processing pipeline for pairing measurements uniquely across free-text CT reports. ( 0,597537331581058 )
J Am Med Inform Assoc - Functional evaluation of out-of-the-box text-mining tools for data-mining tasks. ( 0,597446880931121 )
J Am Med Inform Assoc - Assessing the role of a medication-indication resource in the treatment relation extraction from clinical text. ( 0,597386590273892 )
Comput. Biol. Med. - Ensemble selection for feature-based classification of diabetic maculopathy images. ( 0,59695891783334 )
Comput Biol Chem - A novel divide-and-merge classification for high dimensional datasets. ( 0,594833351039288 )
Comput. Biol. Med. - Heartbeat classification using disease-specific feature selection. ( 0,594698648798754 )
J Am Med Inform Assoc - Automated concept-level information extraction to reduce the need for custom software and rules development. ( 0,593316090028792 )
IEEE J Biomed Health Inform - Computer-aided diagnosis in hysteroscopic imaging. ( 0,593224177650614 )
J Am Med Inform Assoc - Influenza detection from emergency department reports using natural language processing and Bayesian network classifiers. ( 0,592702939192771 )
Comput. Biol. Med. - An ensemble system for automatic sleep stage classification using single channel EEG signal. ( 0,592560212794852 )
IEEE J Biomed Health Inform - Support Vector Feature Selection for Early Detection of Anastomosis Leakage from Bag-of-Words in Electronic Health Records. ( 0,591609892065878 )
Int J Med Inform - De-identification of clinical narratives through writing complexity measures. ( 0,590947762012061 )
Comput Methods Programs Biomed - Computer-supported diagnosis for endotension cases in endovascular aortic aneurysm repair evolution. ( 0,587732822711152 )
J Am Med Inform Assoc - MITRE system for clinical assertion status classification. ( 0,585366326864994 )
Comput. Biol. Med. - Contourlet-based mammography mass classification using the SVM family. ( 0,584234976114696 )
Artif Intell Med - Texture feature ranking with relevance learning to classify interstitial lung disease patterns. ( 0,581990766535265 )
J Am Med Inform Assoc - Named entity recognition of follow-up and time information in 20,000 radiology reports. ( 0,581511795188858 )
J Biomed Inform - The DDI corpus: an annotated corpus with pharmacological substances and drug-drug interactions. ( 0,581072462423538 )
J Am Med Inform Assoc - Assessing the similarity of surface linguistic features related to epilepsy across pediatric hospitals. ( 0,58063088733396 )
Comput Methods Programs Biomed - ECG beat classification using a cost sensitive classifier. ( 0,580281082394808 )
J Am Med Inform Assoc - Applying active learning to high-throughput phenotyping algorithms for electronic health records data. ( 0,57988934005784 )
AMIA Annu Symp Proc - Predicting discharge mortality after acute ischemic stroke using balanced data. ( 0,578198559104218 )
J Biomed Inform - Ontology-guided feature engineering for clinical text classification. ( 0,578117412440892 )
J Med Syst - A software framework for building biomedical machine learning classifiers through grid computing resources. ( 0,577079842631366 )
J Am Med Inform Assoc - Automatic identification of methotrexate-induced liver toxicity in patients with rheumatoid arthritis from the electronic medical record. ( 0,576732061875012 )
Comput Methods Programs Biomed - An improved method of early diagnosis of smoking-induced respiratory changes using machine learning algorithms. ( 0,575254329926815 )
AMIA Annu Symp Proc - Improving predictions in imbalanced data using Pairwise Expanded Logistic Regression. ( 0,574935728395455 )
AMIA Annu Symp Proc - Generalizability and comparison of automatic clinical text de-identification methods and resources. ( 0,574914971670234 )
Med Biol Eng Comput - Evaluation of feature extraction methods for EEG-based brain-computer interfaces in terms of robustness to slight changes in electrode locations. ( 0,574454005874012 )
J Am Med Inform Assoc - A la Recherche du Temps Perdu: extracting temporal relations from medical text in the 2012 i2b2 NLP challenge. ( 0,574006431673202 )
Artif Intell Med - Automatic detection of epileptic seizures on the intra-cranial electroencephalogram of rats using reservoir computing. ( 0,573884365646535 )
J Am Med Inform Assoc - Diagnosis code assignment: models and evaluation metrics. ( 0,573306831804415 )
Comput. Biol. Med. - A new dataset evaluation method based on category overlap. ( 0,573299251868654 )
J Biomed Inform - Lessons learnt from the DDIExtraction-2013 Shared Task. ( 0,57318894948003 )
Comput Methods Programs Biomed - Complex extreme learning machine applications in terahertz pulsed signals feature sets. ( 0,572081536596624 )
Artif Intell Med - Classification of small lesions on dynamic breast MRI: Integrating dimension reduction and out-of-sample extension into CADx methodology. ( 0,571733249255804 )
Artif Intell Med - An intelligent classifier for prognosis of cardiac resynchronization therapy based on speckle-tracking echocardiograms. ( 0,571581470191684 )
J Am Med Inform Assoc - A flexible framework for recognizing events, temporal expressions, and temporal relations in clinical text. ( 0,570293776962193 )
Artif Intell Med - Figure classification in biomedical literature to elucidate disease mechanisms, based on pathways. ( 0,569691157878474 )
Comput. Biol. Med. - Classification of diffusion tensor images for the early detection of Alzheimer's disease. ( 0,569129709191728 )
Artif Intell Med - Screening nonrandomized studies for medical systematic reviews: a comparative study of classifiers. ( 0,568687307947309 )
J Biomed Inform - A medical diagnostic tool based on radial basis function classifiers and evolutionary simulated annealing. ( 0,568230759551407 )
Comput. Biol. Med. - Classification of Error-Related Negativity (ERN) and Positivity (Pe) potentials using kNN and Support Vector Machines. ( 0,56817627716771 )
IEEE Trans Image Process - A unified feature and instance selection framework using optimum experimental design. ( 0,568139777121947 )
Brief. Bioinformatics - Class-imbalanced classifiers for high-dimensional data. ( 0,567626002596717 )
J Med Syst - Symptomatic vs. asymptomatic plaque classification in carotid ultrasound. ( 0,56705365925504 )
AMIA Annu Symp Proc - Automated disambiguation of acronyms and abbreviations in clinical texts: window and training size considerations. ( 0,566239786946494 )
Comput. Biol. Med. - Investigating the performance improvement of HRV Indices in CHF using feature selection methods based on backward elimination and statistical significance. ( 0,564965787519909 )
J Med Syst - A robust multi-class feature selection strategy based on Rotation Forest Ensemble algorithm for diagnosis of Erythemato-Squamous diseases. ( 0,564211267124103 )
Comput. Biol. Med. - Fast and efficient lung disease classification using hierarchical one-against-all support vector machine and cost-sensitive feature selection. ( 0,564037981453275 )
Comput Math Methods Med - Feature selection in classification of eye movements using electrooculography for activity recognition. ( 0,563947318173431 )
AMIA Annu Symp Proc - Combining corpus-derived sense profiles with estimated frequency information to disambiguate clinical abbreviations. ( 0,561998825394513 )
J Biomed Inform - Boosting performance of gene mention tagging system by hybrid methods. ( 0,561894662380074 )
Comput Biol Chem - CE-PLoc: an ensemble classifier for predicting protein subcellular locations by fusing different modes of pseudo amino acid composition. ( 0,561051089583785 )
J Am Med Inform Assoc - Combining rules and machine learning for extraction of temporal expressions and events from clinical narratives. ( 0,560782246818447 )