BMC Med Inform Decis Mak - A method for managing re-identification risk from small geographic areas in Canada.




BACKGROUND: A common disclosure control practice for health datasets is to identify small geographic areas and either suppress records from these small areas or aggregate them into larger ones. A recent study provided a method for deciding when an area is too small based on the uniqueness criterion. The uniqueness criterion stipulates that an area is no longer too small when the proportion of unique individuals on the relevant variables (the quasi-identifiers) approaches zero. However, a uniqueness value of zero is a stringent threshold, suitable only when the risks from data disclosure are quite high. Other uniqueness thresholds that have been proposed for health data are 5% and 20%.

METHODS: We estimated uniqueness for urban Forward Sortation Areas (FSAs) using the 2001 long-form Canadian census data, representing 20% of the population. We then constructed two logistic regression models to predict when uniqueness is greater than the 5% and 20% thresholds, and validated their predictive accuracy using 10-fold cross-validation. Predictor variables included the population size of the FSA and the maximum number of possible values on the quasi-identifiers (the number of equivalence classes).

RESULTS: All model parameters were significant and the models had very high prediction accuracy, with specificity above 0.9 and sensitivity of 0.87 and 0.74 for the 5% and 20% threshold models respectively. The application of the models was illustrated with an analysis of the Ontario newborn registry and an emergency department dataset. At the higher thresholds, considerably fewer records than at the 0% threshold would be considered to be in small areas and therefore undergo disclosure control actions. We have also included concrete guidance for data custodians in deciding which of the three uniqueness thresholds to use (0%, 5%, 20%), depending on the mitigating controls that the data recipients have in place, the potential invasion of privacy if the data are disclosed, and the motives and capacity of the data recipient to re-identify the data.

CONCLUSION: The models we developed can be used to manage the re-identification risk from small geographic areas. Being able to choose among three possible thresholds, a data custodian can adjust the definition of "small geographic area" to the nature of the data and recipient.
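
To make the uniqueness criterion concrete, the following is a minimal Python sketch, not the authors' implementation. It computes uniqueness directly within a dataset as the share of records that are the only member of their equivalence class in an area, whereas the study estimated population uniqueness from census data and then predicted it with logistic regression. The function names, the field names (`fsa`, `age`, `sex`), the area codes `A1A` and `B2B`, and the sample records are all hypothetical and chosen only for illustration.

```python
from collections import Counter
from math import prod


def uniqueness_by_area(records, area_key, quasi_identifiers):
    """Return {area: share of records unique on the quasi-identifiers}."""
    # Size of each equivalence class: one (area, quasi-identifier values) combination.
    class_sizes = Counter(
        (rec[area_key], tuple(rec[q] for q in quasi_identifiers))
        for rec in records
    )
    totals, uniques = Counter(), Counter()
    for (area, _values), size in class_sizes.items():
        totals[area] += size
        if size == 1:  # a class with exactly one record is a unique individual
            uniques[area] += 1
    return {area: uniques[area] / totals[area] for area in totals}


def max_equivalence_classes(domain_sizes):
    """Maximum number of equivalence classes: product of the number of
    possible values of each quasi-identifier."""
    return prod(domain_sizes)


def small_areas(uniqueness, threshold):
    """Areas whose uniqueness exceeds the chosen threshold (0.0, 0.05 or 0.20)."""
    return sorted(area for area, u in uniqueness.items() if u > threshold)


# Hypothetical records: area A1A has 1 unique record out of 10,
# area B2B has 2 unique records out of 4.
records = (
    [{"fsa": "A1A", "age": 30, "sex": "F"}] * 5
    + [{"fsa": "A1A", "age": 41, "sex": "M"}] * 4
    + [{"fsa": "A1A", "age": 67, "sex": "F"}]
    + [{"fsa": "B2B", "age": 30, "sex": "F"}]
    + [{"fsa": "B2B", "age": 52, "sex": "M"}] * 2
    + [{"fsa": "B2B", "age": 67, "sex": "F"}]
)

u = uniqueness_by_area(records, "fsa", ["age", "sex"])
print(u)                      # uniqueness per area
print(small_areas(u, 0.05))   # areas treated as small at the 5% threshold
print(small_areas(u, 0.20))   # areas treated as small at the 20% threshold
```

In this toy data, raising the threshold from 5% to 20% removes one area from the "small" list, which mirrors the abstract's point that higher thresholds subject fewer records to disclosure control. The `max_equivalence_classes` helper illustrates the second predictor named in the abstract; for example, a hypothetical 100 age values and 2 sex values give 200 possible classes.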

Clean Abstract

background common disclosur control practic health dataset identifi small geograph area either suppress record small area aggreg larger one recent studi provid method decid area small base uniqu criterion uniqu criterion stipul area longer small proport uniqu individu relev variabl quasiidentifi approach zero howev use uniqu valu zero quit stringent threshold suitabl risk data disclosur quit high uniqu threshold propos health data method estim uniqu urban forward sortat area fsas use long form canadian census data repres popul construct two logist regress model predict uniqu greater threshold valid predict accuraci use fold crossvalid predictor variabl includ popul size fsa maximum number possibl valu quasiidentifi number equival class result model paramet signific model high predict accuraci specif sensit threshold model respect applic model illustr analysi ontario newborn registri emerg depart dataset higher threshold consider fewer record compar threshold consid small area therefor undergo disclosur control action also includ concret guidanc data custodian decid one three uniqu threshold use depend mitig control data recipi place potenti invas privaci data disclos motiv capac data recipi reidentifi data conclus model develop can use manag reidentif risk small geograph area abl choos among three possibl threshold data custodian can adjust definit small geograph area natur data recipi

Similar Abstracts

J Am Med Inform Assoc - A novel method of adverse event detection can accurately identify venous thromboembolisms (VTEs) from narrative electronic health record data. ( 0,673965328223867 )
Spat Spatiotemporal Epidemiol - Modeling habitat suitability for occurrence of highly pathogenic avian influenza virus H5N1 in domestic poultry in Asia: a spatial multicriteria decision analysis approach. ( 0,666672502308726 )
Int J Health Geogr - Ecological niche model of Phlebotomus alexandri and P. papatasi (Diptera: Psychodidae) in the Middle East. ( 0,652333783183152 )
Int J Health Geogr - Potential corridors and barriers for plague spread in Central Asia. ( 0,642962394196964 )
J Biomed Inform - Sparse modeling of spatial environmental variables associated with asthma. ( 0,629220527837593 )
J Am Med Inform Assoc - An improved model for predicting postoperative nausea and vomiting in ambulatory surgery patients using physician-modifiable risk factors. ( 0,622130575379778 )
Int J Health Geogr - A spatially filtered multilevel model to account for spatial dependency: application to self-rated health status in South Korea. ( 0,620250585535221 )
IEEE J Biomed Health Inform - The effect of sample age and prediction resolution on myocardial infarction risk prediction. ( 0,619824457251121 )
BMC Med Inform Decis Mak - A three-step approach for the derivation and validation of high-performing predictive models using an operational dataset: congestive heart failure readmission case study. ( 0,619460884389449 )
Int J Health Geogr - Performance map of a cluster detection test using extended power. ( 0,617522376510354 )
Int J Health Geogr - Prediction of high-risk areas for visceral leishmaniasis using socioeconomic indicators and remote sensing data. ( 0,61499981591845 )
Lifetime Data Anal - Estimating improvement in prediction with matched case-control designs. ( 0,614617968569028 )
Comput Methods Programs Biomed - Single stage and multistage classification models for the prediction of liver fibrosis degree in patients with chronic hepatitis C infection. ( 0,614438835573172 )
Int J Health Geogr - Environmental predictors of West Nile fever risk in Europe. ( 0,60802914402319 )
Int J Health Geogr - Mapping heatwave health risk at the community level for public health action. ( 0,606269762737747 )
Int J Health Geogr - Small-scale health-related indicator acquisition using secondary data spatial interpolation. ( 0,604589821136024 )
BMC Med Inform Decis Mak - Mining geriatric assessment data for in-patient fall prediction models and high-risk subgroups. ( 0,603205506406705 )
J Biomed Inform - Decision-making model for early diagnosis of congestive heart failure using rough set and decision tree approaches. ( 0,600457278152464 )
BMC Med Inform Decis Mak - Use of outcomes to evaluate surveillance systems for bioterrorist attacks. ( 0,599913041336087 )
Med Decis Making - Application of an artificial neural network to predict postinduction hypotension during general anesthesia. ( 0,594412044299995 )
Med Decis Making - Adaptation of clinical prediction models for application in local settings. ( 0,593875413146782 )
Methods Inf Med - A probabilistic model to investigate the properties of prognostic tools for falls. ( 0,590841958851542 )
J. Comput. Biol. - Prediction of siRNA potency using sparse logistic regression. ( 0,5879014593026 )
J Med Syst - Effective automated prediction of vertebral column pathologies based on logistic model tree with SMOTE preprocessing. ( 0,586075610276576 )
BMC Med Inform Decis Mak - Artificial neural network models for prediction of cardiovascular autonomic dysfunction in general Chinese population. ( 0,5830813701737 )
Comput Biol Chem - Using ensemble methods to deal with imbalanced data in predicting protein-protein interactions. ( 0,57932855567459 )
Int J Health Geogr - Urban slum structure: integrating socioeconomic and land cover data to model slum evolution in Salvador, Brazil. ( 0,576892690144736 )
J Biomed Inform - An empirical approach to model selection through validation for censored survival data. ( 0,575297362000548 )
Int J Health Geogr - Application of satellite precipitation data to analyse and model arbovirus activity in the tropics. ( 0,574572848675897 )
J Am Med Inform Assoc - Calibrating predictive model estimates to support personalized medicine. ( 0,57360341914877 )
Comput Math Methods Med - Prediction of BP reactivity to talking using hybrid soft computing approaches. ( 0,573517090385215 )
Lifetime Data Anal - Understanding increments in model performance metrics. ( 0,571733979953105 )
J Med Syst - Classifying hospitals as mortality outliers: logistic versus hierarchical logistic models. ( 0,571682203861175 )
Appl Clin Inform - Comparing predictions made by a prediction model, clinical score, and physicians: pediatric asthma exacerbations in the emergency department. ( 0,570435354410752 )
Med Decis Making - Performance profiling in primary care: does the choice of statistical model matter? ( 0,568390848934217 )
J Biomed Inform - The effects of data sources, cohort selection, and outcome definition on a predictive model of risk of thirty-day hospital readmissions. ( 0,566924564455554 )
Int J Med Inform - Application of data mining to the identification of critical factors in patient falls using a web-based reporting system. ( 0,566480349462884 )
BMC Med Inform Decis Mak - Harmonisation of variables names prior to conducting statistical analyses with multiple datasets: an automated approach. ( 0,565210549889909 )
Artif Intell Med - Predicting patient survival after liver transplantation using evolutionary multi-objective artificial neural networks. ( 0,563734493789866 )
Int J Health Geogr - Modeling tools for dengue risk mapping - a systematic review. ( 0,56235294882468 )
Methods Inf Med - Limited sampling strategies to estimate the area under the concentration-time curve. Biases and a proposed more accurate method. ( 0,561653144586995 )
BMC Med Inform Decis Mak - Non-linear dynamical signal characterization for prediction of defibrillation success through machine learning. ( 0,560204992644508 )
J Clin Monit Comput - Effect of concurrent oxygen therapy on accuracy of forecasting imminent postoperative desaturation. ( 0,55891479399384 )
Int J Health Geogr - The effects of deprivation and relative deprivation on self-reported morbidity in England: an area-level ecological study. ( 0,558716982793408 )
IEEE Trans Image Process - DEB: definite error bounded tangent estimator for digital curves. ( 0,558457653527576 )
Spat Spatiotemporal Epidemiol - Risk factor modelling of the spatio-temporal patterns of highly pathogenic avian influenza (HPAIV) H5N1: a review. ( 0,558148269574125 )
J Biomed Inform - Data mining methods for classification of Medium-Chain Acyl-CoA dehydrogenase deficiency (MCADD) using non-derivatized tandem MS neonatal screening data. ( 0,557797496338793 )
Int J Health Geogr - Assessing the effects of variables and background selection on the capture of the tick climate niche. ( 0,556039397019058 )
Med Decis Making - A comparison of methods for converting DCE values onto the full health-dead QALY scale. ( 0,555872699730203 )
J Clin Monit Comput - Complex signals bioinformatics: evaluation of heart rate characteristics monitoring as a novel risk marker for neonatal sepsis. ( 0,555398723383016 )
Comput Math Methods Med - Variable selection in ROC regression. ( 0,555193828018357 )
Methods Inf Med - Classification of postural profiles among mouth-breathing children by learning vector quantization. ( 0,553418421339712 )
BMC Med Inform Decis Mak - Evaluation of prediction models for the staging of prostate cancer. ( 0,553131562634657 )
Spat Spatiotemporal Epidemiol - Assessment of land use factors associated with dengue cases in Malaysia using Boosted Regression Trees. ( 0,552970928974998 )
J Clin Monit Comput - Use of genetic programming, logistic regression, and artificial neural nets to predict readmission after coronary artery bypass surgery. ( 0,552840594879368 )
Med Decis Making - Development of inpatient risk stratification models of acute kidney injury for use in electronic health records. ( 0,552827106189944 )
AMIA Annu Symp Proc - Clinical risk prediction by exploring high-order feature correlations. ( 0,551098534747065 )
Int J Health Geogr - Natural-focal diseases: mapping experience in Russia. ( 0,55043431426344 )
Comput. Biol. Med. - A ternary model of decompression sickness in rats. ( 0,549826106449239 )
J Chem Inf Model - Two new parameters based on distances in a receiver operating characteristic chart for the selection of classification models. ( 0,549771316101017 )
Comput Methods Programs Biomed - Exploring an optimal vector autoregressive model for multi-channel pulmonary sound data. ( 0,548883832015887 )
BMC Med Inform Decis Mak - Genotypic tropism testing by massively parallel sequencing: qualitative and quantitative analysis. ( 0,54795640008841 )
J Biomed Inform - Prediction of influenza vaccination outcome by neural networks and logistic regression. ( 0,547397016580417 )
Spat Spatiotemporal Epidemiol - Adjusted significance cutoffs for hypothesis tests applied with generalized additive models with bivariate smoothers. ( 0,544721399468047 )
J Biomed Inform - Partial least squares and logistic regression random-effects estimates for gene selection in supervised classification of gene expression data. ( 0,5434994085149 )
Med Biol Eng Comput - System identification of the mechanomyogram from single motor units during voluntary isometric contraction. ( 0,542608532560854 )
Comput Math Methods Med - Iterative reweighted noninteger norm regularizing SVM for gene expression data classification. ( 0,542433647403408 )
AMIA Annu Symp Proc - Predicting Surgical Risk: How Much Data is Enough? ( 0,540777658738299 )
Geospat Health - Fine-scale mapping of vector habitats using very high resolution satellite imagery: a liver fluke case-study. ( 0,539708915119349 )
Spat Spatiotemporal Epidemiol - Spatial and statistical methodologies to determine the distribution of dengue in Brazilian municipalities and relate incidence with the Health Vulnerability Index. ( 0,539614513823907 )
J Med Syst - A new approach: role of data mining in prediction of survival of burn patients. ( 0,538662973268743 )
Int J Health Geogr - Characterizing the interface between wild ducks and poultry to evaluate the potential of transmission of avian pathogens. ( 0,536805269030001 )
J Biomed Inform - PARAMO: a PARAllel predictive MOdeling platform for healthcare analytic research using electronic health records. ( 0,536288953392387 )
Appl Clin Inform - Exploring the value of clinical data standards to predict hospitalization of home care patients. ( 0,535765588320519 )
Spat Spatiotemporal Epidemiol - Foot and mouth disease revisited: re-analysis using Bayesian spatial susceptible-infectious-removed models. ( 0,535087443669508 )
BMC Med Inform Decis Mak - A data-driven epidemiological prediction method for dengue outbreaks using local and remote sensing data. ( 0,534376446077806 )
Comput Methods Programs Biomed - Online analysis of in vitro resistance to antimalarial drugs through nonlinear regression. ( 0,534295488454239 )
Spat Spatiotemporal Epidemiol - Spatio-temporal modeling of the African swine fever epidemic in the Russian Federation, 2007-2012. ( 0,533660130474308 )
Spat Spatiotemporal Epidemiol - Spatio-temporal epidemiology of highly pathogenic avian influenza (subtype H5N1) in poultry in eastern India. ( 0,533388103234019 )
Int J Health Geogr - Effects of georeferencing effort on mapping monkeypox case distributions and transmission risk. ( 0,532801527767372 )
Methods Inf Med - An experimental evaluation of boosting methods for classification. ( 0,53240774633568 )
Comput. Biol. Med. - Pre-operative prediction of surgical morbidity in children: comparison of five statistical models. ( 0,531440532009623 )
Comput Biol Chem - Determining common insertion sites based on retroviral insertion distribution across tumors. ( 0,531193856824419 )
Artif Intell Med - Operation room tool handling and miscommunication scenarios: an object-process methodology conceptual model. ( 0,531067974741649 )
AMIA Annu Symp Proc - Development and implementation of a real-time 30-day readmission predictive model. ( 0,529091794083424 )
Int J Health Geogr - Spatial analysis of learning and developmental disorders in upper Cape Cod, Massachusetts using generalized additive models. ( 0,528183070409986 )
BMC Med Inform Decis Mak - Computerized prediction of intensive care unit discharge after cardiac surgery: development and validation of a Gaussian processes model. ( 0,52775072182627 )
Int J Health Geogr - Developing the atlas of cancer in Queensland: methodological issues. ( 0,526298311903955 )
AMIA Annu Symp Proc - Developing predictive models using electronic medical records: challenges and pitfalls. ( 0,526070053191206 )
Comput. Biol. Med. - Statistical model based 3D shape prediction of postoperative trunks for non-invasive scoliosis surgery planning. ( 0,524963092048116 )
Comput Math Methods Med - General error analysis in the relationship between free thyroxine and thyrotropin and its clinical relevance. ( 0,523626194217433 )
J Am Med Inform Assoc - Automated identification of extreme-risk events in clinical incident reports. ( 0,522588908918515 )
Int J Health Geogr - Does context matter for the relationship between deprivation and all-cause mortality? The West vs. the rest of Scotland. ( 0,521931572382896 )
Int J Health Geogr - The effect of spatial aggregation on performance when mapping a risk of disease. ( 0,521762032222449 )
Comput Methods Programs Biomed - Development of a daily mortality probability prediction model from Intensive Care Unit patients using a discrete-time event history analysis. ( 0,521641529808464 )
IEEE Trans Image Process - Network-based H.264/AVC whole frame loss visibility model and frame dropping methods. ( 0,520886340211917 )
Brief. Bioinformatics - Caveats and pitfalls of ROC analysis in clinical microarray research (and how to avoid them). ( 0,519516853330707 )
J Am Med Inform Assoc - From vital signs to clinical outcomes for patients with sepsis: a machine learning basis for a clinical decision support system. ( 0,519238311846632 )
J Chem Inf Model - dREL: a relational expression language for dictionary methods. ( 0,519063815519868 )
J Clin Monit Comput - Monitoring the nociception level: a multi-parameter approach. ( 0,518205005872093 )