J Biomed Inform - Controlling false match rates in record linkage using extreme value theory.

Topics

{ detect(2391) sensit(1101) algorithm(908) }
{ method(1219) similar(1157) match(930) }
{ data(3963) clinic(1234) research(1004) }
{ model(3404) distribut(989) bayesian(671) }
{ research(1218) medic(880) student(794) }
{ can(981) present(881) function(850) }
{ error(1145) method(1030) estim(1020) }
{ import(1318) role(1303) understand(862) }
{ result(1111) use(1088) new(759) }
{ method(2212) result(1239) propos(1039) }
{ patient(2315) diseas(1263) diabet(1191) }
{ motion(1329) object(1292) video(1091) }
{ assess(1506) score(1403) qualiti(1306) }
{ extract(1171) text(1153) clinic(932) }
{ ehr(2073) health(1662) electron(1139) }
{ use(1733) differ(960) four(931) }
{ can(774) often(719) complex(702) }
{ concept(1167) ontolog(924) domain(897) }
{ featur(1941) imag(1645) propos(1176) }
{ visual(1396) interact(850) tool(830) }
{ record(1888) medic(1808) patient(1693) }
{ estim(2440) model(1874) function(577) }
{ data(1737) use(1416) pattern(1282) }
{ inform(2794) health(2639) internet(1427) }
{ measur(2081) correl(1212) valu(896) }
{ featur(3375) classif(2383) classifi(1994) }
{ network(2748) neural(1063) input(814) }
{ treatment(1704) effect(941) patient(846) }
{ surgeri(1148) surgic(1085) robot(1054) }
{ clinic(1479) use(1117) guidelin(835) }
{ method(1557) propos(1049) approach(1037) }
{ design(1359) user(1324) use(1319) }
{ research(1085) discuss(1038) issu(1018) }
{ studi(1119) effect(1106) posit(819) }
{ spatial(1525) area(1432) region(1030) }
{ data(2317) use(1299) case(1017) }
{ signal(2180) analysi(812) frequenc(800) }
{ cost(1906) reduc(1198) effect(832) }
{ group(2977) signific(1463) compar(1072) }
{ gene(2352) biolog(1181) express(1162) }
{ first(2504) two(1366) second(1323) }
{ analysi(2126) use(1163) compon(1037) }
{ structur(1116) can(940) graph(676) }
{ cancer(2502) breast(956) screen(824) }
{ decis(3086) make(1611) patient(1517) }
{ imag(1947) propos(1133) code(1026) }
{ system(1976) rule(880) can(841) }
{ imag(1057) registr(996) error(939) }
{ bind(1733) structur(1185) ligand(1036) }
{ sequenc(1873) structur(1644) protein(1328) }
{ imag(2830) propos(1344) filter(1198) }
{ imag(2675) segment(2577) method(1081) }
{ take(945) account(800) differ(722) }
{ studi(2440) review(1878) systemat(933) }
{ framework(1458) process(801) describ(734) }
{ problem(2511) optim(1539) algorithm(950) }
{ chang(1828) time(1643) increas(1301) }
{ learn(2355) train(1041) set(1003) }
{ algorithm(1844) comput(1787) effici(935) }
{ data(1714) softwar(1251) tool(1186) }
{ control(1307) perform(991) simul(935) }
{ model(2220) cell(1177) simul(1124) }
{ care(1570) inform(1187) nurs(1089) }
{ general(901) number(790) one(736) }
{ method(984) reconstruct(947) comput(926) }
{ search(2224) databas(1162) retriev(909) }
{ case(1353) use(1143) diagnosi(1136) }
{ howev(809) still(633) remain(590) }
{ studi(1410) differ(1259) use(1210) }
{ risk(3053) factor(974) diseas(938) }
{ perform(999) metric(946) measur(919) }
{ system(1050) medic(1026) inform(1018) }
{ model(2341) predict(2261) use(1141) }
{ compound(1573) activ(1297) structur(1058) }
{ perform(1367) use(1326) method(1137) }
{ blood(1257) pressur(1144) flow(957) }
{ health(3367) inform(1360) care(1135) }
{ model(3480) simul(1196) paramet(876) }
{ monitor(1329) mobil(1314) devic(1160) }
{ state(1844) use(1261) util(961) }
{ patient(2837) hospit(1953) medic(668) }
{ model(2656) set(1616) predict(1553) }
{ age(1611) year(1155) adult(843) }
{ medic(1828) order(1363) alert(1069) }
{ sampl(1606) size(1419) use(1276) }
{ data(3008) multipl(1320) sourc(1022) }
{ intervent(3218) particip(2042) group(1664) }
{ activ(1138) subject(705) human(624) }
{ time(1939) patient(1703) rate(768) }
{ patient(1821) servic(1111) care(1106) }
{ use(2086) technolog(871) perceiv(783) }
{ health(1844) social(1437) communiti(874) }
{ high(1669) rate(1365) level(1280) }
{ use(976) code(926) identifi(902) }
{ drug(1928) target(777) effect(648) }
{ implement(1333) system(1263) develop(1122) }
{ survey(1388) particip(1329) question(1065) }
{ process(1125) use(805) approach(778) }
{ activ(1452) weight(1219) physic(1104) }
{ method(1969) cluster(1462) data(1082) }

Abstract

Cleansing data of synonyms and homonyms is an important task in fields where high data quality is crucial, for example in disease registries and medical research networks. Record linkage provides methods for minimizing synonym and homonym errors, thereby improving data quality. We focus on homonym errors (hereafter 'false matches'), in which records belonging to different entities are wrongly classified as equal. Synonym errors ('false non-matches') occur when a single entity maps to multiple records in the linkage result. They are not considered in this study because, in our application domain, they are less critical than false matches. False match rates are frequently computed manually through clerical review, that is, without modelling the distribution of the false match rates a priori. An exception is the work of Belin and Rubin (1995) [4], who propose estimating the false match rate by means of a normal mixture model that needs training data for a calibration process. In this paper we present a new approach for estimating the false match rate within the framework of Fellegi and Sunter using methods of Extreme Value Theory (EVT). This approach needs no training data for determining the threshold for matches and therefore leads to a significant cost reduction. After giving two different definitions of the false match rate, we present the EVT tools used in this paper: the generalized Pareto distribution and the mean excess plot. Our experiments with real data show that the model works well, with only slightly lower accuracy than a procedure that has information about the match status and maximizes the accuracy.
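
The two EVT tools named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say how the tools are applied to the matching weights, so the function names, the simulated exponential sample standing in for weights, and the method-of-moments fit (a simple stand-in for whatever estimator the paper uses) are all assumptions made for illustration. The empirical mean excess e(u) is the average of x - u over observations x > u. For data following a generalized Pareto distribution with shape below 1, e(u) is linear in u, which is what a mean excess plot is inspected for.

```python
import random
import statistics


def mean_excess(values, threshold):
    """Empirical mean excess e(u): average of (x - u) over all x > u."""
    excesses = [x - threshold for x in values if x > threshold]
    if not excesses:
        raise ValueError("no observations exceed the threshold")
    return statistics.fmean(excesses)


def mean_excess_plot_points(values, thresholds):
    """Return (u, e(u)) pairs, skipping thresholds with no exceedances."""
    top = max(values)
    return [(u, mean_excess(values, u)) for u in thresholds if u < top]


def gpd_moment_fit(values, threshold):
    """Method-of-moments fit of a generalized Pareto distribution to the
    excesses over `threshold`. Returns (shape, scale).

    Assumption: this estimator is chosen for brevity; it is only
    meaningful when the true shape is below 1/2 (finite variance).
    """
    excesses = [x - threshold for x in values if x > threshold]
    m = statistics.fmean(excesses)
    v = statistics.variance(excesses)  # sample variance, needs >= 2 points
    ratio = m * m / v
    shape = 0.5 * (1.0 - ratio)
    scale = 0.5 * m * (ratio + 1.0)
    return shape, scale


if __name__ == "__main__":
    # Hypothetical stand-in for matching weights: an exponential sample,
    # whose mean excess is roughly constant in the threshold.
    rng = random.Random(0)
    weights = [rng.expovariate(1.0) for _ in range(5000)]
    for u, e in mean_excess_plot_points(weights, [0.0, 0.5, 1.0, 1.5, 2.0]):
        print(f"u={u:.1f}  e(u)={e:.3f}")
    print("GPD (shape, scale) above u=1.0:", gpd_moment_fit(weights, 1.0))
```

In a peaks-over-threshold analysis, a common reading is to pick a threshold above which the plotted points look roughly linear and to fit the generalized Pareto distribution to the excesses over it.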

Cleaned Abstract

cleans data synonym homonym relev task field high qualiti data crucial exampl diseas registri medic research network record linkag provid method minim synonym homonym error therebi improv data qualiti focus attent case homonym error follow denot fals match record belong differ entiti wrong classifi equal synonym error fals nonmatch occur singl entiti map multipl record linkag result consid studi applic domain crucial fals match fals match rate frequent comput manual cleric review without model distribut fals match rate priori except work belin rubin propos estim fals match rate mean normal mixtur model need train data calibr process paper present new approach estim fals match rate within framework fellegi sunter method extrem valu theori evt approach need train data determin threshold match therefor lead signific costreduct give two differ definit fals match rate present tool evt use paper general pareto distribut mean excess plot experi real data show model work well slight lower accuraci compar procedur inform match status maxim accuraci

Similar Abstracts

Comput. Biol. Med. - A correlation analysis-based detection and delineation of ECG characteristic events using template waveforms extracted by ensemble averaging of clustered heart cycles. ( 0,667713927604512 )
Med Biol Eng Comput - Quasi real-time gait event detection using shank-attached gyroscopes. ( 0,637745095331322 )
Res Synth Methods - Methods for the joint meta-analysis of multiple tests. ( 0,628745402062133 )
Comput Math Methods Med - Smart spotting of pulmonary TB cavities using CT images. ( 0,618007248868247 )
Int J Neural Syst - Kernel collaborative representation-based automatic seizure detection in intracranial EEG. ( 0,603515332859815 )
Int J Med Inform - Validating an ontology-based algorithm to identify patients with type 2 diabetes mellitus in electronic health records. ( 0,601482535098184 )
Comput. Biol. Med. - A bilateral analysis scheme for false positive reduction in mammogram mass detection. ( 0,599863750127829 )
J. Comput. Biol. - Shape-based feature matching improves protein identification via LC-MS and tandem MS. ( 0,59911213283029 )
Int J Comput Assist Radiol Surg - Hybrid method for the detection of pulmonary nodules using positron emission tomography/computed tomography: a preliminary study. ( 0,59849466224925 )
Comput. Biol. Med. - An IVUS image-based approach for improvement of coronary plaque characterization. ( 0,597642032534763 )
Int J Neural Syst - Automated seizure detection using EKG. ( 0,59426790674361 )
IEEE J Biomed Health Inform - Automatic identification and classification of muscle spasms in long-term EMG recordings. ( 0,586257600730871 )
Med Biol Eng Comput - Analysis of retinal fundus images for grading of diabetic retinopathy severity. ( 0,585046557698514 )
J Am Med Inform Assoc - Use of computerized algorithm to identify individuals in need of testing for celiac disease. ( 0,583981410266831 )
IEEE Trans Pattern Anal Mach Intell - Automatic and Accurate Shadow Detection using Near-Infrared Information. ( 0,58123447561504 )
IEEE Trans Pattern Anal Mach Intell - Robust Text Detection in Natural Scene Images. ( 0,58064162168348 )
Med Biol Eng Comput - GPU-based real-time detection and analysis of biological targets using solid-state nanopores. ( 0,579349755132698 )
AMIA Annu Symp Proc - Optimized dual threshold entity resolution for electronic health record databases--training set size and active learning. ( 0,576362555548284 )
Int J Med Robot - Robotic system with sweeping palpation and needle biopsy for prostate cancer diagnosis. ( 0,57246095495211 )
Med Biol Eng Comput - Detection of movement-related cortical potentials based on subject-independent training. ( 0,57139348432183 )
J Am Med Inform Assoc - A benchmark comparison of deterministic and probabilistic methods for defining manual review datasets in duplicate records reconciliation. ( 0,56586432863773 )
J Am Med Inform Assoc - Phenotyping for patient safety: algorithm development for electronic health record based automated adverse event and medical error detection in neonatal intensive care. ( 0,565070905481737 )
Comput Math Methods Med - New estimators and guidelines for better use of fetal heart rate estimators with Doppler ultrasound devices. ( 0,562693369356226 )
Comput Methods Programs Biomed - Blood vessel segmentation methodologies in retinal images--a survey. ( 0,561515373638934 )
J Biomed Inform - Automation of a high risk medication regime algorithm in a home health care population. ( 0,559959794659761 )
J Am Med Inform Assoc - Syndromic surveillance for health information system failures: a feasibility study. ( 0,559762094727049 )
Int J Comput Assist Radiol Surg - Combination of computer-aided detection algorithms for automatic lung nodule identification. ( 0,5518242799809 )
AMIA Annu Symp Proc - Extracting and integrating data from entire electronic health records for detecting colorectal cancer cases. ( 0,551732991628888 )
J Am Med Inform Assoc - Adjusting outbreak detection algorithms for surveillance during epidemic and non-epidemic periods. ( 0,551460833970064 )
Comput. Biol. Med. - Automated detection of the osseous acetabular rim using three-dimensional models of the pelvis. ( 0,54779616220945 )
AMIA Annu Symp Proc - Anomaly detection in clinical processes. ( 0,546404178929332 )
AMIA Annu Symp Proc - Computer surveillance of patients at high risk for and with venous thromboembolism. ( 0,542299535607802 )
J Biomed Inform - Identifying well-formed biomedical phrases in MEDLINE® text. ( 0,541575050097421 )
J Med Syst - Field programmable gate array based fuzzy neural signal processing system for differential diagnosis of QRS complex tachycardia and tachyarrhythmia in noisy ECG signals. ( 0,539762531172667 )
Comput Methods Programs Biomed - Ca analysis: an Excel based program for the analysis of intracellular calcium transients including multiple, simultaneous regression analysis. ( 0,53881879682685 )
J Clin Monit Comput - Pulse oximetry saturation patterns detect repetitive reductions in airflow. ( 0,536633746862952 )
IEEE Trans Image Process - Lightweight detection of additive watermarking in the DWT-domain. ( 0,534759938529604 )
IEEE Trans Pattern Anal Mach Intell - Domain Anomaly Detection in Machine Perception: A System Architecture and Taxonomy. ( 0,533819768733413 )
J Telemed Telecare - Recognition of root canal orifices at a distance - a preliminary study of teledentistry. ( 0,533779444810617 )
J. Med. Internet Res. - FluBreaks: early epidemic detection from Google flu trends. ( 0,532862174652669 )
Comput Math Methods Med - Automatic detection and quantification of WBCs and RBCs using iterative structured circle detection algorithm. ( 0,532639149088874 )
BMC Med Inform Decis Mak - Detecting modification of biomedical events using a deep parsing approach. ( 0,527743584020587 )
J Clin Monit Comput - Detection of endobronchial intubation by monitoring the CO2 level above the endotracheal cuff. ( 0,526147992169793 )
J Am Med Inform Assoc - A simple heuristic for blindfolded record linkage. ( 0,525836547568439 )
Appl Clin Inform - Towards prevention of acute syndromes: electronic identification of at-risk patients during hospital admission. ( 0,525779472613792 )
IEEE J Biomed Health Inform - Robust and sensitive video motion detection for sleep analysis. ( 0,5247710037205 )
Comput. Biol. Med. - A user-operated test of suprathreshold acuity in noise for adult hearing screening: The SUN (Speech Understanding in Noise) test. ( 0,523780291149328 )
J Telemed Telecare - Diabetic retinopathy screening using tele-ophthalmology in a primary care setting. ( 0,522080948312796 )
Comput Methods Programs Biomed - Automated detection of exudates and macula for grading of diabetic macular edema. ( 0,521822591363519 )
IEEE J Biomed Health Inform - Support Vector Feature Selection for Early Detection of Anastomosis Leakage from Bag-of-Words in Electronic Health Records. ( 0,520331780687258 )
J. Comput. Biol. - Feature detection with controlled error rates in LC/MS images. ( 0,520178853046837 )
IEEE J Biomed Health Inform - Automatic annotation of seismocardiogram with high-frequency precordial accelerations. ( 0,519604265079096 )
BMC Med Inform Decis Mak - Outbreak detection algorithms for seasonal disease data: a case study using Ross River virus disease. ( 0,518503545105381 )
Comput Methods Programs Biomed - Unsupervised skin lesions border detection via two-dimensional image analysis. ( 0,518498899217892 )
Comput. Biol. Med. - Automatic identification of fetal breathing movements in fetal RR interval time series. ( 0,516923348809196 )
J Biomed Inform - A controlled greedy supervised approach for co-reference resolution on clinical text. ( 0,516581045837803 )
AMIA Annu Symp Proc - Evaluating semantic relatedness and similarity measures with Standardized MedDRA Queries. ( 0,513956212475247 )
J Am Med Inform Assoc - Development and evaluation of a crowdsourcing methodology for knowledge base construction: identifying relationships between clinical problems and medications. ( 0,511705369096753 )
J Biomed Inform - An analysis of FMA using structural self-bisimilarity. ( 0,511321028514117 )
IEEE J Biomed Health Inform - Electrocardiogram classification using reservoir computing with logistic regression. ( 0,510182274588635 )
Comput Math Methods Med - Hypovigilance detection for UCAV operators based on a hidden Markov model. ( 0,509669238956756 )
Med Biol Eng Comput - Automatic breath-to-breath analysis of nocturnal polysomnographic recordings. ( 0,508970419693705 )
Int J Neural Syst - Multi-instance dictionary learning for detecting abnormal events in surveillance videos. ( 0,508718216288416 )
Comput Methods Programs Biomed - An automated decision-support system for non-proliferative diabetic retinopathy disease based on MAs and HAs detection. ( 0,507719589738098 )
Comput. Biol. Med. - Real-time electrocardiogram P-QRS-T detection-delineation algorithm based on quality-supported analysis of characteristic templates. ( 0,507432661436155 )
Med Biol Eng Comput - Abnormal localization of immature precursors (ALIP) detection for early prediction of acute myelocytic leukemia (AML) relapse. ( 0,506651042294941 )
Comput. Biol. Med. - Prediction of acute hypotensive episodes by means of neural network multi-models. ( 0,505632611314438 )
Appl Clin Inform - The impact of domain knowledge on structured data collection and templated note design. ( 0,502916860258634 )
Int J Comput Assist Radiol Surg - Diffusion tensor tractography of normal facial and vestibulocochlear nerves. ( 0,502177569051604 )
Comput Methods Programs Biomed - Identification of an integrated mathematical model of standard oral glucose tolerance test for characterization of insulin potentiation in health. ( 0,501895695782028 )
Neural Comput - Multiple tests based on a gaussian approximation of the unitary events method with delayed coincidence count. ( 0,50099731723639 )
Int J Comput Assist Radiol Surg - Computer-aided focal liver lesion detection. ( 0,499725301833029 )
Neural Comput - A multiscale correlation of wavelet coefficients approach to spike detection. ( 0,499329417506382 )
IEEE Trans Pattern Anal Mach Intell - Video Event Detection: From Subvolume Localization To Spatio-Temporal Path Search. ( 0,499322207720601 )
Comput. Biol. Med. - Tumor segmentation from computed tomography image data using a probabilistic pixel selection approach. ( 0,497100783592091 )
AMIA Annu Symp Proc - Development and validation of an electronic phenotyping algorithm for chronic kidney disease. ( 0,496916877548164 )
Artif Intell Med - Leucocyte classification for leukaemia detection using image processing techniques. ( 0,496875673058826 )
IEEE Trans Image Process - Retina verification system based on biometric graph matching. ( 0,496394646213711 )
BMC Med Inform Decis Mak - Adverse drug events with hyperkalaemia during inpatient stays: evaluation of an automated method for retrospective detection in hospital databases. ( 0,496062864331157 )
Int J Comput Assist Radiol Surg - Ventilatory impairment detection based on distribution of respiratory-induced changes in pixel values in dynamic chest radiography: a feasibility study. ( 0,495222838105338 )
IEEE Trans Image Process - Chromaticity space for illuminant invariant recognition. ( 0,495195638344919 )
Methods Inf Med - Monitoring nocturnal heart rate with bed sensor. ( 0,495180209634002 )
Med Biol Eng Comput - A robust method for online heart sound localization in respiratory sound based on temporal fuzzy c-means. ( 0,494458406505291 )
IEEE Trans Vis Comput Graph - Content-Aware Photo Collage Using Circle Packing. ( 0,49373512673332 )
Artif Intell Med - Using a multi-agent system approach for microaneurysm detection in fundus images. ( 0,493321892655354 )
AMIA Annu Symp Proc - A Weighty Problem: Identification, Characteristics and Risk Factors for Errors in EMR Data. ( 0,492410504721822 )
Int J Comput Assist Radiol Surg - Multi-contrast unbiased MRI atlas of a Parkinson's disease population. ( 0,49091604514468 )
IEEE Trans Image Process - A multiscale wavelet-based test for isotropy of random fields on a regular lattice. ( 0,490636259040113 )
IEEE Trans Pattern Anal Mach Intell - Scaling Multidimensional Inference for Structured Gaussian Processes. ( 0,489998963097015 )
IEEE Trans Pattern Anal Mach Intell - On Kleinberg's Stochastic Discrimination Procedure. ( 0,489786701001856 )
Comput Methods Programs Biomed - An automatic algorithm for the detection of Trypanosoma cruzi parasites in blood sample images. ( 0,488747367277472 )
Appl Clin Inform - Comparison of manual versus automated data collection method for an evidence-based nursing practice study. ( 0,488455682559937 )
Comput Methods Programs Biomed - Detection of heartbeat and respiration from optical interferometric signal by using wavelet transform. ( 0,487769958375654 )
Methods Inf Med - Detection algorithm for single motor unit firing in surface EMG of the trapezius muscle. ( 0,486514352214613 )
J Chem Inf Model - Models for identification of erroneous atom-to-atom mapping of reactions performed by automated algorithms. ( 0,486182309056416 )
BMC Med Inform Decis Mak - Evaluation of syndromic algorithms for detecting patients with potentially transmissible infectious diseases based on computerised emergency-department data. ( 0,485293123498899 )
IEEE Trans Image Process - Fast video shot boundary detection based on SVD and pattern matching. ( 0,484071673097699 )
J Med Syst - Automatic detection of the existence of subarachnoid hemorrhage from clinical CT images. ( 0,483938411083751 )
IEEE Trans Image Process - An algorithm for power line detection and warning based on a millimeter-wave radar video. ( 0,48298426896601 )
Int J Comput Assist Radiol Surg - Detection and quantification of intracerebral and intraventricular hemorrhage from computed tomography images with adaptive thresholding and case-based reasoning. ( 0,481857274702906 )