Abstract
A Toolbox for Geographic Masking
Track: Census and Statistical Data
Authors: Paul Zandbergen, Su Zhang
When the locations of individual-level health records are released as paper or digital maps, the individuals concerned could be re-identified through reverse geocoding. Such spatial datasets therefore cannot be released unless the locations have first been modified, for example through aggregation or geographic masking. Masking techniques apply transformations or perturbations to the locations to prevent the re-identification of individuals. As part of a larger project on spatial data confidentiality, a toolbox was developed for geographic masking of individual-level datasets. The masking techniques include both existing and newly developed algorithms: 1) random direction, fixed radius; 2) random perturbation within a circle; 3) Gaussian displacement; 4) donut masking; 5) bimodal Gaussian displacement; 6) location swapping; and 7) location swapping with donut masking. The ArcGIS-based toolbox was implemented using ModelBuilder and Python scripting. The performance of the various masking tools was validated using measures of spatial k-anonymity, which provide an empirical estimate of the probability of discovery.
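To make the first four displacement techniques concrete, the following is a minimal Python sketch, not the toolbox's actual code. It assumes planar (projected) coordinates in metres and uses only the standard library rather than ArcGIS. The function names, the parameters (radius, min_radius, max_radius, sigma) and the choice to sample uniformly by area in the circle and donut cases are illustrative assumptions.

```python
import math
import random


def _offset(x, y, distance, angle):
    """Move (x, y) by `distance` in direction `angle` (radians)."""
    return x + distance * math.cos(angle), y + distance * math.sin(angle)


def mask_fixed_radius(x, y, radius, rng=random):
    """Random direction, fixed radius: the point lands on a circle."""
    angle = rng.uniform(0.0, 2.0 * math.pi)
    return _offset(x, y, radius, angle)


def mask_within_circle(x, y, max_radius, rng=random):
    """Random perturbation within a circle (assumed uniform by area)."""
    angle = rng.uniform(0.0, 2.0 * math.pi)
    # The square root keeps the point density uniform over the disc.
    distance = max_radius * math.sqrt(rng.random())
    return _offset(x, y, distance, angle)


def mask_donut(x, y, min_radius, max_radius, rng=random):
    """Donut masking: displace by at least min_radius, at most max_radius."""
    if not 0.0 <= min_radius < max_radius:
        raise ValueError("require 0 <= min_radius < max_radius")
    angle = rng.uniform(0.0, 2.0 * math.pi)
    # Assumed here: uniform sampling by area within the ring.
    u = rng.random()
    distance = math.sqrt(min_radius ** 2 + u * (max_radius ** 2 - min_radius ** 2))
    return _offset(x, y, distance, angle)


def mask_gaussian(x, y, sigma, rng=random):
    """Gaussian displacement: independent normal offsets in x and y."""
    return x + rng.gauss(0.0, sigma), y + rng.gauss(0.0, sigma)
```

Passing a seeded generator, for example mask_donut(x, y, 50.0, 250.0, rng=random.Random(42)), makes a run reproducible. In the donut case the minimum radius guarantees that no masked point stays at or very near its original location, which neither the circle nor the Gaussian variant can ensure.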