Journal of Human Development and Capabilities
A Multi-Disciplinary Journal for People-Centered Development
Volume 23, 2022 - Issue 3

The Moral Foundations of Impact Evaluation

Pages 425-454 | Accepted 01 Dec 2021, Published online: 16 Dec 2021
 

ABSTRACT

Impact evaluation has become increasingly central to evidence-based social policy, particularly in the field of international development. While the act of evaluation requires numerous ethical decisions (e.g. regarding the problems to investigate, the tools of investigation, and the interpretation of results), the normative framework for such decisions is generally implicit, undermining our ability to fully scrutinise the evidence base. I argue that the moral foundation of impact evaluation is best viewed as utilitarian in the sense that it meets the three elementary requirements of utilitarianism: welfarism, sum-ranking, and consequentialism. I further argue that the utilitarian approach is subject to a number of important limitations, including distributional indifference, the neglect of non-utility concerns, and an orientation toward subjective states. In light of these issues, I outline an alternative framework for impact evaluation that has its moral basis in the capabilities approach. I argue that capabilitarian impact evaluation not only addresses many of the issues associated with utilitarian methods, but can also be viewed as a more general approach to impact evaluation.

Acknowledgements

I would like to thank the editor and two anonymous referees for their thoughtful comments. I would further like to thank William Boal, Lendie Follett, Jennifer McCrickerd, Alice Nicole Sindzingre, and Brian Vander Naald for their feedback on early drafts of this paper.

Disclosure statement

No potential conflict of interest was reported by the author(s).

Notes

1 For further discussion on the rise of RCTs, see Ravallion (2018), Webber and Prouse (2018), or Heckman (2020).

2 To cite a specific example, Banerjee (2006) states that “The beauty of randomised evaluations is that the results are what they are: we compare the outcome in the treatment with the outcome in the control group, see whether they are different, and if so by how much”. Such value-neutrality is also evident in Duflo's (2017) account of the economist as a “plumber” who is “more concerned about ‘how’ to do things than about ‘what’ to do” (pg. 3).

3 See Reiss (2013) or Hausman, McPherson, and Satz (2016) for a more general discussion of how normative questions are often implicated in policy research.

4 See Sen and Williams (1982) for further discussion on the distinction between private morality and public choice.

5 Following Sen (2000a), a simple example may help illustrate the distinction between functionings and capabilities. Consider two people, one of whom is fasting while the other is starving. These two people may have identical functionings in that they are similarly deficient in terms of being well-fed. However, they have fundamentally different capabilities in that the person who is fasting has the opportunity to be better fed, whereas the person who is starving lacks that same opportunity. It is when we look at capabilities that we come to understand that the person who is fasting is better off than the person who is starving.

6 According to White (2010), there are two distinct definitions of impact evaluation. The first encompasses any analysis of the final level of the causal chain of a program (i.e. well-being) and does not necessarily entail counterfactual analysis. The second emphasises rigorous counterfactual analysis in terms of estimating the difference between some indicator of interest with and without the intervention. The latter definition does not necessarily focus on the final level of the causal chain and includes analyses of indicators directly influenced by the intervention (e.g. outputs). As current practice emphasises the overlap between these two definitions, I have defined impact evaluation accordingly.

7 It is common practice to maintain SUTVA, as it rules out complicating factors like externalities and general equilibrium effects. See Imbens and Wooldridge (2009), Gertler et al. (2016), or Cunningham (2021) for detailed discussions.

8 Another common causal estimand is the average treatment effect (ATE), which is defined as E[Y(1)] − E[Y(0)]. For the sake of exposition, I will focus on the ATT in the rest of the paper.

9 Similar derivations can be found in Duflo, Glennerster, and Kremer (2007), Khandker, Koolwal, and Samad (2010), and Cunningham (2021). The reader is referred to those works for additional details.

10 Both endogeneity and selection bias can be conceptualised as a form of omitted variable bias.

11 The assumption of selection on observables has a variety of names in the literature, including unconfoundedness and ignorability.

12 See Khandker, Koolwal, and Samad (2010), Gertler et al. (2016), or Cunningham (2021) for further discussion of the DD estimator. Also note that conventional standard errors in the context of DD may understate the standard deviation of treatment effects. Bertrand, Duflo, and Mullainathan (2004) suggest some alternative approaches (e.g. block bootstrap) that yield consistent estimates.

13 I have abstracted here from the presence of additional covariates. Also note that the standard errors from this procedure will generally be incorrect and thus need to be adjusted.

14 For detailed reviews of the impact evaluation literature on particular topics, see Kremer and Holla (2009) on education interventions, Duflo (2012) on women's empowerment, Banerjee, Karlan, and Zinman (2015) on microcredit, and Bastagli et al. (2019) on cash transfers, to name a few.

15 Note that the requirements of utilitarianism are independent. For example, welfarism can be combined with non-consequentialist moralities and consequentialism can be combined with non-welfarist theories of value.

16 For example, unconditional cash-transfer schemes often have explicit objectives for improving health, education, nutrition outcomes, etc. Without conditionality, the achievement of those objectives requires that beneficiaries value the objectives and will pursue them when resource constraints are relaxed.

17 Harsanyi (1977b) argues that, even in situations where population growth is a relevant concern, average utility is a more defensible maximand: in choosing between two policies or programs, a self-interested individual would rationally choose the one that maximises average utility if they did not know their position in the resulting state of affairs.

18 One might claim that even non-consequentialist theories indirectly care about consequences to the extent that such evidence is necessary to infer the occurrence of a non-consequential immoral act. The fact nevertheless remains that impact evaluations emphasise particular forms of evidence about consequences (i.e. the ATT or ATE), which are differentially useful to alternative moral theories.

19 Duflo, Kremer, and Robinson (2011), Attanasio, Costas, and Santiago (2012), and Blattman, Fiala, and Martinez (2014) represent examples of evaluations that include a theoretical component. See Duflo, Glennerster, and Kremer (2007) for a more general discussion of the role of theory in impact evaluation.

20 Williams (2020) distinguishes between generalisability and applicability. Generalisability refers to whether the results from one evaluation are likely to hold in other (unspecified) contexts whereas applicability refers to whether results from one or more other contexts will hold for a specific context. Stated differently, generalisability is concerned with generating results or rules that are robust across contexts and applicability is concerned with informing acts within a particular context. Williams notes that much of the research in this area emphasises generalisability rather than applicability, thus substantiating the claim that much of the literature can be viewed as an attempt to avoid the collapse into act consequentialism.

21 The critiques of utilitarianism are numerous and this is by no means an exhaustive list. For an accessible overview of objections to utilitarianism, see de Lazari-Radek and Singer (2017).

22 See Holtug (2015) for elaboration on these definitions of egalitarianism and prioritarianism.

23 Egalitarianism and prioritarianism encompass a wide variety of views regarding the appropriate “currency” of justice (Holtug 2015). To remain consistent with my definition of impact evaluation, I focus attention on well-being.

24 For example, for program A, we have [(2 − 1) + (2 − 2) + (2 − 3)]/3 = 0. It is worth recalling that in practice one does not know the individual-level impacts and would instead calculate the ATT using one of the methods discussed in Section “Impact Evaluation in Theory and Practice”. Knowledge of the counterfactual outcomes here simply facilitates assessing the distributional consequences of the three program variants.

25 The utilitarian could respond to this critique by trying to examine the distributional consequences of programs by relying on the CATE to conduct sub-group analyses. The same critiques would nevertheless be valid within each sub-group given that CATE is still a form of sum-ranking. See Subramanian, Kim, and Christakis (2018) for a similar argument.

26 For additional discussion, see Scarre (2002), de Lazari-Radek and Singer (2017), or White (2019).

27 Stated differently, program B altered the chosen element, but not the set itself. Conversely, program C expanded the freedom to choose for individual 3, though they elected not to realise any of the newly available alternatives. These alternatives nevertheless remain available to individual 3 in the future.

28 For additional discussion of interpersonal comparisons of utility, see Binmore (2009), Reiss (2013), or Hausman, McPherson, and Satz (2016).

29 Here reversibility means that removing program C brings individual 1 back to a state where they preferred no intervention.

30 See Elster (1982) for additional discussion.

31 For accessible introductions to the CA, see Sen (2000a), Robeyns (2005), or Nussbaum (2011).

32 See Sen (1993) or Sen (2000a) for additional discussion.

33 See Robeyns (2005) for a particularly detailed discussion of the distinction between well-being and agency.

34 Note that the concepts of functionings and capabilities relate most closely to well-being achievements and freedoms, respectively.

35 See Anand and van Hees (2006) and Anand et al. (2009) on direct elicitation, Krishnakumar (2007) and Krishnakumar and Chávez-Juárez (2016) on structural equation models, and Andreassen and Di Tommaso (2018) on random utility models. It is worth noting that Alkire (2002) developed a method for capabilities-based policy assessment that relies on direct elicitation to gauge perceived changes. The method, however, does not attempt to identify a proper counterfactual and is therefore not directly applicable to impact evaluation. As a result, I will not discuss it in detail here.

36 Sen (1992) also uses the term “potentials” to refer to the highest level of some functioning an individual could achieve. He, however, uses the term “shortfall” to refer to what I call “deviations from potentials”.

37 See Henderson and Follett (2020) for more technical discussion.

38 Throughout I will assume that functionings are positively-valued in the sense that higher values are “better”. Following Sen (2000a), personal and environmental characteristics may include features related to personal heterogeneities (e.g. gender), environmental diversities (e.g. climate), variations in social climate (e.g. economic institutions), differences in relational perspectives (e.g. conventions regarding the material requirements for social inclusion), or the distribution of resources within the family. While the framework permits different vectors of characteristics to be specified for potentials and deviations from potentials, I will abstract from such considerations here.

39 See Follett and Henderson (2020) for technical discussion of causal inference in the context of capability sets.

40 That is, the distribution of simulated or predicted values represents the alternative functionings the individual could have achieved, as implied by the model's parameter estimates.

41 See Follett and Henderson (2020) for derivations and technical discussion.

42 That is, points A and C represent factual average outcomes and points B and D represent counterfactual average outcomes.

43 With respect to the case pertaining to points A and B, we see that the expansion of potentials is associated with a reduction in deviations from potentials (i.e. treatment brought treated individuals closer to the frontier). For the case pertaining to points C and D, we see that treatment brought treated individuals further away from the frontier. In either case, the capabilities and choice effects exactly decompose the overall effect.

44 See Barclay (2016) for additional discussion of this issue.

45 See Nussbaum (1997), Osmani (2005), or Vizard, Fukuda-Parr, and Elson (2011) for additional discussion on the relationship between the CA and human rights.
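
As a rough illustration of the averaging described in notes 8 and 24, the sketch below computes individual-level impacts and their mean for three treated individuals. The potential outcomes are hypothetical values chosen only so that the individual impacts are 1, 0, and -1; they are not data from the paper, and the variable names are assumptions made for this example. In practice the counterfactual outcomes are unobserved, so this is an expository device rather than an estimation method.

```python
from statistics import mean

# Hypothetical potential outcomes for three treated individuals (illustrative only).
y_with_program = [2, 2, 2]     # Y(1): outcome with the program
y_without_program = [1, 2, 3]  # Y(0): counterfactual outcome without the program

# Individual-level impacts, Y(1) - Y(0), which are never observed in practice.
individual_impacts = [y1 - y0 for y1, y0 in zip(y_with_program, y_without_program)]

# Average effect computed two equivalent ways:
# the mean of the individual differences, and the difference of the means.
average_impact = mean(individual_impacts)
difference_of_means = mean(y_with_program) - mean(y_without_program)

print("Individual impacts:", individual_impacts)
print("Average impact:", average_impact)
print("Difference of means:", difference_of_means)
```

The average is zero even though one individual gains and another loses, which is the sense in which a sum-ranked summary can conceal the distributional consequences of a program.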

Additional information

Notes on contributors

Heath Henderson

Heath Henderson is an associate professor of economics in the College of Business and Public Administration at Drake University. He attended graduate school at American University in Washington, DC where he received a PhD in economics and an MA in international politics.
