Original Articles

%CEM: a SAS macro to perform coarsened exact matching

Pages 227-238 | Received 22 Dec 2015, Accepted 15 Jun 2016, Published online: 01 Jul 2016
 

ABSTRACT

In this paper we introduce %CEM, a macro package that allows researchers to perform coarsened exact matching (CEM) automatically in the SAS environment. CEM is a non-parametric matching method widely used to reduce the confounding influence of pre-treatment control variables and thereby improve causal inference in quasi-experimental studies. %CEM provides a fully automated process that lets SAS users perform CEM efficiently in fields where large data sets are common and where SAS is the most popular statistical tool. In addition, the macro can be used to test several coarsening combinations of numeric variables. This option also provides a visual representation of the matching frontier, enabling researchers to select the setting that best balances the L1 imbalance against the percentage of matched units. The paper concludes with an empirical application that compares the computational performance and results of the available software alternatives (SAS, R and STATA) on multiple administrative data sets from a large regional database.
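To make the abstract's two criteria concrete, the following is a minimal Python sketch of the general CEM idea, not of the %CEM macro itself. It coarsens numeric covariates into bins, keeps only strata that contain both treated and control units, and computes an unweighted L1 imbalance on the coarsened cells. The function names, the `treated` flag, the cutpoints and the toy data are all hypothetical, and the sketch ignores the CEM weights that a full implementation would apply after matching.

```python
from collections import defaultdict


def coarsen(value, cutpoints):
    """Return the index of the bin that `value` falls into."""
    return sum(value >= c for c in cutpoints)


def signature(unit, cutpoints):
    """Tuple of bin indices, one per coarsened covariate (names sorted)."""
    return tuple(coarsen(unit[name], cuts)
                 for name, cuts in sorted(cutpoints.items()))


def cem_match(units, cutpoints):
    """Keep only units in strata holding both treated and control units."""
    strata = defaultdict(list)
    for u in units:
        strata[signature(u, cutpoints)].append(u)
    matched = []
    for members in strata.values():
        if {m["treated"] for m in members} == {0, 1}:
            matched.extend(members)
    return matched


def l1_imbalance(units, cutpoints):
    """Unweighted L1: half the summed absolute difference between the
    treated and control relative frequencies over the coarsened cells.
    Assumes both groups are non-empty."""
    counts = defaultdict(lambda: [0, 0])  # cell -> [n_control, n_treated]
    for u in units:
        counts[signature(u, cutpoints)][u["treated"]] += 1
    n_control = sum(c[0] for c in counts.values())
    n_treated = sum(c[1] for c in counts.values())
    return 0.5 * sum(abs(c[1] / n_treated - c[0] / n_control)
                     for c in counts.values())


# Hypothetical toy data: two covariates, four units.
units = [
    {"treated": 1, "age": 23, "income": 30},
    {"treated": 0, "age": 27, "income": 35},
    {"treated": 1, "age": 45, "income": 80},
    {"treated": 0, "age": 62, "income": 20},
]
cutpoints = {"age": [30, 50], "income": [50]}

matched = cem_match(units, cutpoints)
share_matched = len(matched) / len(units)
```

In this toy case only the first two units share a stratum, so half of the units are matched and the imbalance on the matched sample drops to zero. Finer cutpoints would typically lower the imbalance further but match fewer units, which is the trade-off the matching frontier displays.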

Acknowledgments

We would like to thank Stefano Iacus, Sergio Pontello, Lei Xuan and Dan Eshleman for the helpful comments and suggestions provided.

Disclosure statement

No potential conflict of interest was reported by the authors.

Notes

1. An alternative multidimensional balance measure called GI has recently been introduced in the literature [18], together with its %GI SAS macro code.

2. Speed performance was tested on a notebook with the following technical characteristics: OS Windows 7 (X64), an Intel(R) Core(TM) i5-2430M CPU Quad-Core processor running at 2.40 GHz with 4.00 GB of RAM. The following software releases were used: SAS 9.3, R 3.2.1 and STATA 13.

3. The bin width and, consequently, the number of bins are calculated by each software package using Scott's rule and according to its specific rounding approximation.

4. Execution time is highly dependent on the number of numerical variables. Indeed, with the automatic coarsening option, each additional variable multiplies the number of combinations of the standard coarsening options, so the total grows exponentially. For this reason, a priori knowledge of the research domain may lead researchers to the optimal binning of their numerical variables, with major time savings when performing the matching procedure.

5. The Fortune 500 is the ranking of the 500 largest US companies published by Fortune magazine.
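Notes 3 and 4 can be illustrated with a short Python sketch. It assumes the common textbook form of Scott's rule, with bin width 3.49 · s · n^(-1/3) where s is the sample standard deviation, and rounds the bin count up. As Note 3 points out, each package applies its own rounding, so the counts produced by SAS, R or STATA may differ. The function names and the choice of three coarsening options per variable are hypothetical.

```python
import math
import statistics


def scott_bin_width(values):
    """Scott's rule bin width: 3.49 * s * n^(-1/3) (textbook constant)."""
    n = len(values)
    return 3.49 * statistics.stdev(values) * n ** (-1 / 3)


def scott_bin_count(values):
    """Number of bins covering the data range, rounded up (one of
    several possible rounding conventions)."""
    return math.ceil((max(values) - min(values)) / scott_bin_width(values))


def coarsening_combinations(options_per_variable, n_variables):
    """Settings to test when every variable tries every option."""
    return options_per_variable ** n_variables


sample = list(range(1, 101))           # hypothetical numeric variable
bins = scott_bin_count(sample)

# Hypothetical: 3 coarsening options per variable, 1 to 6 variables.
growth = [coarsening_combinations(3, k) for k in range(1, 7)]
```

Under these assumptions, adding one variable multiplies the number of settings to evaluate by the number of options per variable, which is why fixing the binning of some variables in advance shortens the run considerably.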
