Original Articles

Simultaneous edit-imputation and disclosure limitation for business establishment data

Pages 63-82 | Received 18 May 2016, Accepted 03 Nov 2016, Published online: 15 Dec 2016
