ABSTRACT
Empirical research that relies on inputs from published company financial statements ignores the fact that the observed accounting data matrix is rank deficient by design, through the articulation between stocks and flows. This inherent feature of the data-generating process implies structural non-identification when both stocks and flows appear in the design matrix, so a constraint is required to identify the parameters. Much financial research has fallen into this ‘accounting identity trap’ and routinely employs implicit constraints to enable estimation, albeit without acknowledging those constraints, which leads to misleading inferences. This article elucidates the problem of parameter identification under stock-and-flow rank deficiency using existing applications to equity pricing. The focus is on the interpretation of slope coefficients, which must be anchored on economically defensible parameter constraints.
Acknowledgements
I am grateful for the on-going support from the MEAFA research group, http://sydney.edu.au/business/research/meafa. I am thankful for the beneficial discussions with Robert Bartels, Jeffrey Callen, Colin Cameron, David Drukker, Peter Easton, Richard Gerlach, William Greene, Bjorn Jorgensen, James Ohlson, Stephen Penman and Artem Prokhorov.
Disclosure statement
No potential conflict of interest was reported by the author.
Notes
1 The articulated set of financial statements reconciles the Opening Statement of Financial Position with the Closing Statement of Financial Position through the Comprehensive Income Statement and the Cash Flow Statement.
2 The reconciliation may only be temporarily violated under equity restructure transactions.
3 Rank deficiency is sometimes described in terms of perfect collinearity but this would be a misnomer here because it implies a data-driven nuisance, whereas rank deficiency makes it clear that the problem faced with accounting identities is due to structural non-identification.
4 The RLS solution can also be achieved through the application of generalized inverses, such as the Moore-Penrose pseudoinverse (Mazumdar, Li, and Bryce 1980).
5 The rank deficiency of mutually exclusive binary variables in combination with the model intercept is also known as the ‘dummy variable trap’. Virtually all textbooks cover the intercept result but, to my knowledge, only Kmenta (1997, ch. 11) and Greene (2011, ch. 5) discuss potential applications to slope coefficients.
6 It is also worth noting the Sweeney and Ulveling (1972) and Kennedy (1986) alternative, which solves under the constraint $\sum_j w_j \beta_j = 0$, where $w_j$ is the corresponding proportion of the sample allocated to each binary group, and $\beta_j$ is the deviation from the global unweighted average.
7 With thanks to William Greene (New York University) for elucidating this key point.
8 Barth et al. (1999) and Barth et al. (2005) also add to the explanatory part other accounting flow variables that are directly linked to net earnings through an articulated relation (e.g. accruals and operating cash flows, which sum to net earnings), thereby imposing further implicit parameter constraints and compounding the interpretation problem.
9 A multivariate function is homogeneous of degree $k$ if, when each of its arguments is multiplied by the same constant, the value of the function is multiplied by that constant raised to the power $k$. Euler’s theorem on homogeneous functions states that a function is homogeneous of degree $k$ if and only if $\sum_i x_i \, \partial f / \partial x_i = k f(x)$, where $\partial f / \partial x_i$ is the partial derivative with respect to each variable $x_i$ and $f(x)$ is the function of all $x_i$. If the full model of Equation (8) is homogeneous then it applies that $\sum_i x_i \, \partial f / \partial x_i = k f(x)$, where each partial derivative is the respective slope coefficient in Equation (8) (for brevity only $x_i$ is used in this note).
10 The exact empirical application by Amir (1993) includes other potentially value-relevant items in the explanatory part, but to focus the analysis the problem is reduced here to only those variables that are derived from the clean surplus relation.
11 The OLS accumulated cross-products are:. The subtraction of the cross-products of the first equation from the second equation gives the third equation, as in
.
12 The exact spatial position of the line of exactly identified solutions depends on the data. For an illustrative application, also see O’Brien (2012).
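
The points made in Notes 3, 4 and 12 can be illustrated with a minimal numerical sketch. It is not taken from the article: the data are simulated, and the variable names (`opening`, `flow`, `closing`), the coefficient 1.5 and the tolerance `rcond=1e-10` are assumptions chosen only for illustration. The sketch builds a design matrix in which the closing stock equals the opening stock plus the flow, confirms that the matrix is rank deficient, and compares two exactly identified solutions: the minimum-norm solution from the Moore-Penrose pseudoinverse and the solution obtained by constraining the opening-stock coefficient to zero.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200

# Simulated (hypothetical) articulated data: closing = opening + flow.
opening = rng.normal(10.0, 2.0, n)
flow = rng.normal(1.0, 0.5, n)
closing = opening + flow

# Design matrix containing both stocks and the flow, no intercept.
X = np.column_stack([opening, flow, closing])
y = 1.5 * closing + rng.normal(0.0, 0.1, n)

# Three columns, but only two are linearly independent.
rank = np.linalg.matrix_rank(X)

# Direction along which coefficients can move without changing X @ b,
# because opening + flow - closing = 0 for every observation.
v = np.array([1.0, 1.0, -1.0])

# Solution 1: minimum-norm least squares via the pseudoinverse.
# An explicit rcond is passed so the near-zero singular value is discarded.
b_pinv = np.linalg.pinv(X, rcond=1e-10) @ y

# Solution 2: constrain the opening-stock coefficient to zero and
# estimate the remaining two coefficients by ordinary least squares.
b_rest, *_ = np.linalg.lstsq(X[:, 1:], y, rcond=None)
b_zero_opening = np.concatenate([[0.0], b_rest])

fitted_pinv = X @ b_pinv
fitted_alt = X @ b_zero_opening

print("rank of X:", rank)
print("pseudoinverse coefficients:", b_pinv)
print("zero-opening coefficients: ", b_zero_opening)
print("implicit constraint b_pinv @ v:", b_pinv @ v)
print("max fitted difference:", np.max(np.abs(fitted_pinv - fitted_alt)))
```

Under these assumptions the two coefficient vectors differ while the fitted values coincide, and the difference between the vectors lies along `v`, which is the line of exactly identified solutions. The pseudoinverse is not constraint-free either: it implicitly imposes that the coefficient vector is orthogonal to `v`, so the slope coefficients it returns carry no more economic meaning than those from any other undeclared constraint.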