Measurement, Statistics, and Research Design

A Comparison of Bias Reduction Methods: Clustering versus Propensity Score Subclassification and Weighting

(Senior Assistant Professor in Statistics for Economics), (Associate Professor of Business Statistics and Data Mining) & (Associate Lecturer at Department of Educational and Human Sciences)
Pages 33-54 | Published online: 29 Nov 2017
 

ABSTRACT

Propensity score (PS) adjustments have become popular methods used to improve estimates of treatment effects in quasi-experiments. Although researchers continue to develop PS methods, other procedures can also be effective in reducing selection bias. One of these uses clustering to create balanced groups. However, the success of this new method depends on its efficacy compared to that of the existing methods. Therefore, this comparative study used experimental and nonexperimental data to examine bias reduction, case retention, and covariate balance in the clustering method, PS subclassification, and PS weighting. In general, results suggest that the cluster-based methods reduced at least as much bias as the PS methods. Under certain conditions, the PS methods reduced more bias than the cluster-based methods, and under other conditions the cluster-based methods were more advantageous. Although all methods were equally effective in retaining cases and balancing covariates, other data-specific conditions may favor the use of a cluster-based approach.

Notes

1 Here, the term inertia is used in analogy with applied mathematics, where the “moment of inertia” is the integral of mass times the squared distance to the centroid (Greenacre, 1984, p. 35).

2 We limited the analysis to a 20-cluster partition because a partition with more than 20 clusters might not make sense from a practitioner's point of view.

3 The standard errors are computed using the formula in Zanutto (2006).

4 For an in-depth analysis of unsupervised learning techniques to identify natural group structures underlying the data, see, for example, Everitt (1993), Everitt et al. (2011), Duda et al. (2001), and Hastie et al. (2001).

5 Unstructured data do not have an underlying predefined data model, as they are not organized in a predefined manner.

6 Tonks (2009) provided a discussion of segment design and the choice of clustering variables in consumer markets.
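
The definition of inertia in note 1 can be illustrated with a discrete analogue, in which the integral becomes a sum over cases of mass times squared Euclidean distance to the centroid. The sketch below is only an illustration under that assumption: the function names, the unit masses, and the toy data are hypothetical and are not taken from the article, and it does not reproduce the authors' clustering procedure.

```python
# Illustrative sketch only: a discrete analogue of the "moment of inertia"
# described in note 1. All names and data here are hypothetical.

def centroid(points, masses):
    """Mass-weighted mean of a list of equal-length coordinate tuples."""
    total = sum(masses)
    dim = len(points[0])
    return tuple(
        sum(m * p[k] for p, m in zip(points, masses)) / total
        for k in range(dim)
    )

def inertia(points, masses=None):
    """Sum of mass times squared Euclidean distance to the centroid."""
    if masses is None:
        masses = [1.0] * len(points)  # assumption: every case has unit mass
    c = centroid(points, masses)
    return sum(
        m * sum((x - ck) ** 2 for x, ck in zip(p, c))
        for p, m in zip(points, masses)
    )

def within_cluster_inertia(points, labels):
    """Total inertia after splitting unit-mass points by cluster label."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    return sum(inertia(members) for members in clusters.values())

# Toy data: four cases on a line, assigned to two clusters.
points = [(0.0, 0.0), (2.0, 0.0), (10.0, 0.0), (12.0, 0.0)]
labels = [0, 0, 1, 1]

total = inertia(points)                          # inertia around the overall centroid
within = within_cluster_inertia(points, labels)  # inertia remaining inside clusters
print(total, within)
```

In this toy example, most of the total inertia lies between the two clusters rather than within them, which is the sense in which a partition groups similar cases together.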
