Original Articles

Five Diagnostic Tests for Unobserved Cluster Effects

Pages 1212-1227 | Received 27 Jan 2010, Accepted 23 Apr 2010, Published online: 28 Jun 2010
 

Abstract

This article compares two recently proposed test statistics for unobserved cluster effects (C, SSR w) with three statistics frequently mentioned in panel econometrics (BP, SLM, F). Simulations include data generating processes with a cluster-level explanatory variable, scenarios with unequally sized clusters, processes that have an incorrectly specified cluster structure, and processes that have no cluster structure but rather spatial correlation. All but the F test exhibit small-sample deviation from the asymptotic distribution. The SLM, F, and SSR w tests show equivalent power when cluster sizes are balanced. SLM has the greatest power when cluster sizes are unbalanced.


Acknowledgment

We gratefully acknowledge the insightful comments of an anonymous referee for this journal.

Notes

This is because C may be written as where , , and , and because the expected value of equals 0 under H 0 – or, at least it is when we insert disturbances u gi instead of residuals û gi into r g .

In this sense, the SSR w test is related to Amemiya (Citation1978).

Since with , it becomes clear that SLM and C are related, as they both are a transformation of d: C = 0.5A 1(d − 1)/A 2 with A 2 representing the denominator in Eq. (3). But conditional on W and the cluster structure, SLM is a simple linear function of d, whereas the relation between C and d still depends on the outcomes of û through A 1 and A 2.

Although we view μ g as a random effect, its importance may also be evaluated as if it were a fixed effect.

Note that the number of degrees of freedom in the numerator is not equal to G − 1, as it is in Greene (Citation2008), because under the null hypothesis of absent fixed effects the cluster-level variables Z g appear in the model. Thus, under the null, the model contains G − 1 − K z fewer slopes than under the alternative. In this context, note that Orme and Yamagata (Citation2006) proved asymptotic equivalence between F and the unstandardized LM test of Honda (Citation1985) for G going to infinity, balanced cluster sizes, and no cluster-level explanatory variables.
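The degrees-of-freedom count in this note can be illustrated with a minimal sketch. It assumes the restricted model is pooled OLS with an intercept, K x individual-level regressors and K z cluster-level regressors, and that the unrestricted model replaces the intercept and Z g with G cluster dummies. The function name, the helper, and the simulated data are hypothetical and are not taken from the article.

```python
import numpy as np

def _ssr(y, A):
    """Sum of squared residuals from a least-squares fit of y on A."""
    beta = np.linalg.lstsq(A, y, rcond=None)[0]
    resid = y - A @ beta
    return float(resid @ resid)

def cluster_f_stat(y, X, Z, cluster):
    """F statistic for cluster fixed effects when Z appears under the null.

    y: (N,) outcome; X: (N, K_x) individual-level regressors;
    Z: (N, K_z) cluster-level regressors repeated for each observation;
    cluster: (N,) cluster labels.
    """
    y = np.asarray(y, dtype=float)
    N = y.shape[0]
    labels, idx = np.unique(np.asarray(cluster).ravel(), return_inverse=True)
    G = labels.size
    K_x, K_z = X.shape[1], Z.shape[1]

    # Restricted model (null): intercept, X and Z.
    ssr_r = _ssr(y, np.column_stack([np.ones(N), X, Z]))

    # Unrestricted model: G cluster dummies and X (Z is absorbed by the dummies).
    D = np.zeros((N, G))
    D[np.arange(N), idx] = 1.0
    ssr_u = _ssr(y, np.column_stack([D, X]))

    df_num = G - 1 - K_z   # fewer slopes under the null
    df_den = N - G - K_x   # residual degrees of freedom of the dummy model
    F = ((ssr_r - ssr_u) / df_num) / (ssr_u / df_den)
    return F, df_num, df_den

# Hypothetical data without cluster effects: G = 20 clusters of 10 observations.
rng = np.random.default_rng(0)
G, n_g = 20, 10
cluster = np.repeat(np.arange(G), n_g)
X = rng.normal(size=(G * n_g, 1))
Z = rng.normal(size=(G, 1))[cluster]
y = 1.0 + X[:, 0] + Z[:, 0] + rng.normal(size=G * n_g)
F, df_num, df_den = cluster_f_stat(y, X, Z, cluster)
```

With one cluster-level regressor and 20 clusters, the numerator has 18 degrees of freedom rather than 19.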

In the context of serial correlation, this property is also noted by Durbin and Watson (Citation1950). Furthermore, under the null hypothesis of no clustering, SSR w and F are not affected either. SSR w uses fixed effects in its computation, and F uses residuals from an OLS and a fixed effects regression. In the absence of clustering, the fixed effects capture the combined effect of Z-variables perfectly. In the presence of clustering, the fixed effects also capture the cluster-level disturbance, which causes SSR w and F to lose the invariance property, but Monte Carlo experiments that are not further reported here show a drop in power of at most a few percentage points.

Notes: (a) The χ² goodness-of-fit test value indicates rejection at the 10% significance level of the notion that the test statistic follows its asymptotic distribution.

(b) Reporting the number of rejections of the null hypothesis of no cluster effects among 1,000 simulated runs. This should equal roughly 50 if the actual size of the test equals the nominal size.
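The benchmark of roughly 50 rejections follows from a nominal size of 5% applied to 1,000 runs. The sketch below adds the binomial standard deviation as one rough way to judge how far an observed count may stray from 50. The function name and the example count of 65 are hypothetical and do not come from the article's tables.

```python
import math

def size_check(rejections, runs=1000, nominal=0.05):
    """Compare an observed rejection count with its binomial benchmark."""
    expected = runs * nominal                        # 50 for 1,000 runs at 5%
    sd = math.sqrt(runs * nominal * (1 - nominal))   # binomial standard deviation
    z = (rejections - expected) / sd                 # standardized deviation
    return expected, sd, z

# Hypothetical count of 65 rejections among 1,000 runs.
expected, sd, z = size_check(65)
```

Under these assumptions, a count more than about two standard deviations from 50 would suggest that the actual size differs from the nominal size.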

An appendix with experiments for additional combinations of (G, N g ) and different data configurations is available upon request.

i.e., $\chi^2 = \sum_b (O_b - E_b)^2 / E_b$, with $E_b$ and $O_b$ being the expected and observed frequency in bin $b$. Bins vary in width.
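A minimal sketch of a goodness-of-fit computation of this kind, assuming the usual Pearson form with expected frequencies taken from the asymptotic distribution. The bin probabilities and observed counts are hypothetical and serve only to show the arithmetic with bins of unequal probability.

```python
def chi2_gof(observed, expected):
    """Pearson goodness-of-fit statistic summed over bins."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same number of bins")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Hypothetical example: 1,000 simulated statistics sorted into five bins whose
# probabilities under the asymptotic distribution differ (bins vary in width).
probs = [0.05, 0.20, 0.50, 0.20, 0.05]
expected = [1000 * p for p in probs]
observed = [62, 190, 505, 198, 45]
stat = chi2_gof(observed, expected)   # 2.88 + 0.50 + 0.05 + 0.02 + 0.50 = 3.95
df = len(probs) - 1                   # degrees of freedom when no parameters are estimated
```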

As implied in footnote 1, the limiting distribution of C relies on û being consistent estimators of u and converging to . Apparently, convergence does not happen quickly.

By Ahrens and Pincus’ (Citation1981) measure of unbalancedness , unbalancedness equals approximately 1.00, 0.86, 0.57, 0.31 and 0.12, respectively, for the five rows of each block of Table .

Notes: (a) Measured by the number of rejections of the null hypothesis among 1,000 iterations. Rows for values of ρ where the number of rejections equals 1,000 throughout are omitted.


For example, a sample with 1,000 observations that were given identical values of v from one configuration to the next was divided into G = 10, 20, 50, 100, and 200 clusters with cluster sizes ranging accordingly from 100 down to 5. Among SSR w , SLM, and F, rejection rates at the 5% significance level decreased from 426 to 368, 243, 154, and 109, respectively, among 1,000 runs. Incidentally, under the same sequence of configurations, the actual size of the test under H 0 rises for C, SSR w and BP, but is roughly constant for SLM and F.

For N g = 60, ω equals 0.377, but in unbalanced samples ω varies between clusters. In the most extreme scenario in Table , ω varies from 0.066 to 0.845.

This also raises another question that will be left for future research: how do tests for cluster effects perform conditional on the existence of another (lower-level or higher-level) cluster effect?

In data from the 2000 U.S. Census, Barrios et al. (Citation2010) found evidence of spatial correlation even after controlling for clustering both at the state level and at the smaller “puma” (public use microdata area) level.

For nonzero α, ν gi is heteroskedastic because the number of nonzero row elements in W varies with i. This makes it impossible to target an overall R 2 of this model at 0.5. Instead, as α grows, the variance of ν gi rises and the explanatory power of the model decreases a little. For example, for α = 0.5, the variance of the first term of ν gi equals 0.0125 if state i shares borders with five other states.

As measured by the standard deviation of the standard error across iterations. Expressed in a different way, relative to the average value of the standard error, the variation amounts to 17.7% for the adjusted standard error of the slope of X for the smallest sample to 6.6% in the largest, as opposed to 5.0% for the unadjusted standard error of the slope of X for the smallest sample to 1.0% in the largest. For the slope of Z, the variation amounted to 30.7% for the adjusted standard error for the smallest sample to 14.0% in the largest, as opposed to 17.6% for the unadjusted standard error for the smallest sample to 7.5% in the largest.

Given the similarity between cluster effects in cross-sectional analysis and random effects in panel data, these conclusions are relevant for panel econometrics as well.

While the primary audience for this article is the researcher who uses data with cluster features, the Monte Carlo results have obvious relevance for the closely related field of panel econometrics, where cluster effects are called unobserved random effects and the same test statistics apply. We frame the discussion in terms of clusters rather than panels mainly because the variation in cluster size is often much greater than that of panel length in data with unbalanced panels. In panel contexts, if “g” measures individuals and “i” measures time, the common factor μ g may actually be capturing serial correlation in ν gi . In Monte Carlo experiments, the test statistics proved to have power against the hypothesis of uncorrelated disturbances as well, again illustrating the point that statistically significant test results do not necessarily imply a common factor specification.

Similar to prior studies (e.g., Blanchard and Matyas, Citation1996; Moulton and Randolph, Citation1989), we explored the effect of nonnormal disturbances. We found that an exponential distribution makes little difference (as in previous research), but in some configurations of cluster sizes and numbers of clusters a thick-tailed distribution may reduce power substantially. We also investigated the effect of increasing the number of cluster-level explanatory variables: the gap between actual and nominal size widens slightly, and power diminishes somewhat. For details, see the Appendix that is available upon request.
