A Computational Framework for Preserving Privacy and Maintaining Utility of Geographically Aggregated Data: A Stochastic Spatial Optimization Approach

Pages 1035-1056 | Received 09 Sep 2022, Accepted 24 Jan 2023, Published online: 20 Mar 2023
