ABSTRACT
According to the ‘expertise defence’, experimental findings suggesting that intuitive judgments about hypothetical cases are influenced by philosophically irrelevant factors do not undermine their evidential use in (moral) philosophy. This defence assumes that philosophical experts are unlikely to be influenced by irrelevant factors. We discuss relevant findings from experimental metaphilosophy that largely tell against this assumption. To advance the debate, we present the most comprehensive experimental study of intuitive expertise in ethics to date, which tests five well-known biases of judgment and decision-making among expert ethicists and laypeople. We found that even expert ethicists are affected by some of these biases, but also that they enjoy a slight advantage over laypeople in some cases. We discuss the implications of these results for the expertise defence, and conclude that they still do not support the defence as it is typically presented in (moral) philosophy.
Notes
1 In recent metaphilosophy, there has been some controversy about whether it is in fact standard philosophical practice to appeal to intuitions about hypothetical cases (cf. Cappelen [2012], Deutsch [2015], Horvath and Koch [2021], and Horvath [manuscript]). For the purposes of this paper, we will ignore this strand of the metaphilosophical debate and simply assume that appealing to intuitions about cases as evidence is indeed standard philosophical practice, as continues to be the mainstream view in contemporary analytic philosophy.
2 For a survey of empirical studies of intuitive expertise more generally, see Kahneman and Klein [2009].
3 Further studies that bear on the expertise defence less directly are Sytsma and Machery [2010], Carter et al. [2016], Beebe and Monaghan [2018], and Carter et al. [2019].
4 These are order effects [Schwitzgebel and Cushman 2012, 2015; Wiegmann et al. 2020], ‘Asian disease’ framing [Schwitzgebel and Cushman 2015], actor-observer bias [Tobia, Buckwalter, and Stich 2013; Tobia, Chapman, and Stich 2013; Löhr 2019], cleanliness priming [Tobia, Chapman, and Stich 2013], and irrelevant additional options [Wiegmann et al. 2020].
5 The main reasons for this high exclusion rate were not having a Ph.D. or M.A. in philosophy (639 participants), and not having moral philosophy/ethics as an area of specialization or competence (563 participants).
6 12% of expert, and 19% of lay, participants failed to answer both attention checks correctly.
7 In line with our preregistration, lay participants were recruited until we reached the number of valid expert participants at the end of May 2019.
8 This order of presentation was chosen in order to minimize potential order effects. For instance, it has been shown that presenting a ‘push-type’ dilemma can strongly influence the evaluation of subsequent scenarios. This is why we presented Decoy, which includes a push-type option, as the final scenario. Moreover, the two structurally similar scenarios in which a threat can be redirected by the agent (Focus and Perspective) were not presented one after the other, but rather with two scenarios in between (Prospect and Accounting).
9 A linear mixed model analysis with participant and framing effect as random effects led to the same results, as expected given the specifics of our design (balanced, no missing values, etc.). This equivalence also holds for the following analyses.
10 Given that the gender distribution differed strongly between expert and lay participants, we also tested the interaction of gender and framing direction, which was not significant, p = .76 (i.e. the framing direction affected both genders equally).
11 The conventions for reported effect sizes are as follows: 0.01 = small, 0.06 = medium, 0.14 = large.
12 This expectation was confirmed by the significant interaction of framing direction and framing effect.
13 The distinctness of the five framing effects is also why p-values in are not adjusted for multiple comparisons.
14 This possibility obtains irrespective of the fact that the overall interaction of framing direction, framing effect, and level of expertise was not significant (p = .064).
15 We would like to thank John Horden, Steffen Koch, Michael Vollmer, and several anonymous reviewers for very helpful comments on previous versions of this paper. We would also like to thank our audiences at EuroCogSci, Ruhr University Bochum, September 2019, at the Experimental Philosophy Conference Bern, University of Bern, September 2019, and at the Tagung für Praktische Philosophie, Panel Gedankenexperimente in der praktischen Philosophie, University of Salzburg, October 2020. Last, but not least, many thanks to Nick Byrd, Daniel Dennett, and Peter Singer for distributing our call for participation on Twitter.