Abstract
There is an ongoing debate about variable selection in data envelopment analysis (DEA), as there are no diagnostic checks for model misspecification. This paper contributes to this debate by investigating the sensitivity of DEA efficiency estimates to the inclusion of irrelevant variables and/or the omission of relevant ones in a large-sample DEA model. Data are simulated from constant, increasing and decreasing returns-to-scale (RS) Cobb–Douglas production processes. For constant and decreasing RS processes with irrelevant inputs, DEA tends to overestimate efficiency for almost all production units. When relevant variables are omitted, a variable RS specification appears to be the safer option. The correct RS specification is vital when the DEA model includes irrelevant variables. Omitting relevant inputs has a more adverse effect on the efficiency estimates of individual production units than including irrelevant ones.
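As an illustration of the kind of experiment summarised above, the following is a minimal sketch and not the paper's actual simulation design. It assumes a two-input, constant-RS Cobb–Douglas technology with half-normal inefficiency, adds one irrelevant input `z`, and computes input-oriented DEA scores with SciPy's `linprog`. The exponents, sample size, distributions and the function name `dea_input_efficiency` are all hypothetical choices made for the example.

```python
import numpy as np
from scipy.optimize import linprog


def dea_input_efficiency(X, Y, vrs=False):
    """Input-oriented DEA scores (envelopment form).

    X: (n, m) inputs, Y: (n, s) outputs. vrs=True adds the convexity
    constraint sum(lambda) = 1 (variable RS); otherwise constant RS.
    """
    n, m = X.shape
    s = Y.shape[1]
    scores = np.empty(n)
    # Decision vector: [theta, lambda_1, ..., lambda_n]; minimise theta.
    c = np.zeros(n + 1)
    c[0] = 1.0
    A_eq = np.r_[0.0, np.ones(n)][None, :] if vrs else None
    b_eq = [1.0] if vrs else None
    for o in range(n):
        # Inputs: sum_j lambda_j * x_ij - theta * x_io <= 0
        A_in = np.hstack([-X[o][:, None], X.T])
        # Outputs: -sum_j lambda_j * y_rj <= -y_ro
        A_out = np.hstack([np.zeros((s, 1)), -Y.T])
        res = linprog(
            c,
            A_ub=np.vstack([A_in, A_out]),
            b_ub=np.concatenate([np.zeros(m), -Y[o]]),
            A_eq=A_eq,
            b_eq=b_eq,
            bounds=[(0, None)] * (n + 1),
            method="highs",
        )
        scores[o] = res.fun
    return scores


# Assumed data-generating process (illustrative values only).
rng = np.random.default_rng(0)
n = 50
x1 = rng.uniform(1, 10, n)
x2 = rng.uniform(1, 10, n)
u = np.abs(rng.normal(0, 0.2, n))        # half-normal inefficiency term
y = x1**0.4 * x2**0.6 * np.exp(-u)       # exponents sum to 1: constant RS
z = rng.uniform(1, 10, n)                # irrelevant input, unrelated to y

Y = y[:, None]
X_true = np.column_stack([x1, x2])
X_irr = np.column_stack([x1, x2, z])

eff_true = dea_input_efficiency(X_true, Y)            # correct inputs, CRS
eff_irr = dea_input_efficiency(X_irr, Y)              # plus irrelevant input
eff_vrs = dea_input_efficiency(X_true, Y, vrs=True)   # correct inputs, VRS

print("mean score, correct inputs:", eff_true.mean())
print("mean score, with irrelevant input:", eff_irr.mean())
```

Adding an input adds a constraint to each unit's linear programme, so a unit's score can only stay the same or rise, which matches the direction of the overestimation described above. The size of the effect depends on the assumed design, and a much larger `n` would be needed to mimic a large-sample setting.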
Acknowledgements
We thank three referees for their constructive comments, which improved the presentation of this paper. We also acknowledge the comments of participants at the 2001 Econometric Society Australasian Meeting, Auckland, New Zealand, on an earlier version of this paper.