
Franklin, Holmes, and the Epistemology of Computer Simulation

Pages 165-183 | Published online: 05 Dec 2008
 

Abstract

Allan Franklin has identified a number of strategies that scientists use to build confidence in experimental results. This paper shows that Franklin’s strategies have direct analogues in the context of computer simulation and then suggests that one of his strategies—the so‐called ‘Sherlock Holmes’ strategy—deserves a privileged place within the epistemologies of experiment and simulation. In particular, it is argued that while the successful application of even several of Franklin’s other strategies (or their analogues in simulation) may not be sufficient for justified belief in results, the successful application of a slightly elaborated version of the Sherlock Holmes strategy is sufficient.

Acknowledgements

Thanks to Phil Ehrlich, Allan Franklin, Francis Longworth, John Norton, Eric Winsberg, two anonymous referees, and the editor of this journal—James McAllister—for valuable feedback on earlier versions of this paper.

Notes

[1] In this paper, attention is restricted to computer simulations that involve estimation of solutions to differential equations. Other kinds of computer simulation studies are also carried out in science. Perhaps best known are those involving cellular automata, where the discrete state of each node in a network is updated according to rules that reference the discrete states of neighbouring nodes. Some details of the strategies discussed in Section 3—especially those related to code evaluation—would be different if these and other types of simulations were included in the analysis, but the more general epistemological points made in Section 4 would remain the same.
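
To make the kind of update rule just described concrete, the following is a minimal sketch in Python (mine, not the paper's) of a one-dimensional cellular automaton: each node's new discrete state is looked up from a rule that references only the discrete states of its neighbours. The particular rule (Wolfram's 'rule 110'), the grid size, and the periodic boundaries are arbitrary illustrative choices.

# Illustrative sketch (not from the paper): a one-dimensional cellular
# automaton in which each node's discrete state is updated according to a
# rule that references the discrete states of its neighbours.
def step(states, rule=110):
    """Apply an elementary cellular-automaton rule to a list of 0/1 states,
    with periodic boundaries."""
    n = len(states)
    new_states = []
    for i in range(n):
        left, centre, right = states[i - 1], states[i], states[(i + 1) % n]
        index = (left << 2) | (centre << 1) | right   # neighbourhood as a 3-bit number
        new_states.append((rule >> index) & 1)        # look up the new state in the rule
    return new_states

# Example: evolve a single 'on' cell for a few time steps.
states = [0] * 16
states[8] = 1
for _ in range(5):
    print(''.join('#' if s else '.' for s in states))
    states = step(states)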

[2] It is worth reiterating that model evaluation should be understood as an investigation of a model’s adequacy‐for‐purpose, not an investigation of its truth or falsity, whatever that might mean. A model that is constructed with the use of a variety of false assumptions about a target system might nevertheless be an adequate representation of that target system, relative to the goals of the modelling study.

[3] Campbell and Stanley (1963) provide an early discussion. As there is some variation (and sometimes fuzziness) in the use of this terminology, the following is my own attempt to characterize these two kinds of validity. Guala (2003) presents the distinction in a slightly different way.

[4] A realist perspective is not necessarily assumed here. The non‐realist could offer a different statement of the experimental result or could interpret the same result statement differently. For instance, while the realist might say, ‘On seven out of fifty trials, a collision of type C produced a particle with mass M ± ε GeV/c²’, the non‐realist might say instead, ‘On seven out of fifty trials, after the experimental apparatus E was placed in arrangement A, the detector gave a signal of type S.’ Or the non‐realist might offer a result statement that is syntactically similar, if not identical, to the result statement given by the realist but intend it to mean that conditions during the experiment were ‘as if’ particles with mass such‐and‐such were produced.

[5] In the former, it is generalization of the simulation results (i.e. results about the behaviour of a computer) to conclusions about the continuous model equations that is at issue, while in the latter it is generalization of the simulation results to conclusions about a natural or social target system that is at issue (see also Parker forthcoming).

[6] ‘Relevant’ instances are ones for which there is reason to think that the quality of the apparatus’ performance will be indicative of the quality of its performance in the experiment of interest. Franklin does not emphasize this point about relevance, but it is important; I incorporate it explicitly into the simulation‐related analogues that I identify below, both for this strategy and for the next one.

[7] In collaboration with Howson, Franklin has offered a Bayesian analysis/justification for many of the strategies discussed in this section (Franklin and Howson 1988); in the interest of space, those analyses will not be rehearsed or examined here.

[8] It should be noted, however, that if model parameters have been tuned in an ad hoc fashion for the very purpose of ensuring a close match with past rainfall data, then the finding of such a match should result in little or no increase in confidence in the adequacy of the model for predicting future rainfall. This is easy to see if we think of Franklin’s strategies in a Bayesian framework, as he does: we get little or no boost in confidence when close model–data fit has been achieved through ad hoc tuning, since in that situation we expect the fit to be close and so have to assign p(e) ≈ 1 in the Bayesian updating equation, p(h|e) = [p(e|h)*p(h)] / p(e). Here, p(h|e) is the probability that the hypothesis (i.e. a statement about the target system) is true, given that such a close model–data fit was obtained; p(h) is the probability assigned to the hypothesis before the close model–data fit was obtained; p(e|h) is the probability of obtaining such a close model–data fit, given that the hypothesis is true; p(e) is the probability of obtaining such a close model–data fit. For the same reason, none of Franklin’s strategies will provide any significant increase in confidence if we have engaged in ad hoc fiddling in order to guarantee the demonstration that the strategy requires. This should be kept in mind throughout the discussion that follows.
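
A minimal numerical illustration of this point, with made-up numbers (a Python sketch of Bayes's theorem, not anything from the paper): when the close model–data fit was not guaranteed in advance, the update can raise p(h|e) well above p(h); when ad hoc tuning makes the fit all but certain, p(e) ≈ 1 and the posterior barely differs from the prior.

# Bayes's theorem: p(h|e) = p(e|h) * p(h) / p(e).  Numbers are invented for illustration.
def posterior(p_h, p_e_given_h, p_e):
    """Return p(h|e) given the prior p(h), the likelihood p(e|h), and p(e)."""
    return p_e_given_h * p_h / p_e

prior = 0.5

# Informative case: the close fit was not expected regardless of h.
print(posterior(prior, p_e_given_h=0.9, p_e=0.6))    # ~0.75: a real boost over the 0.5 prior

# Ad hoc tuning: the close fit was all but guaranteed, so p(e) is near 1.
print(posterior(prior, p_e_given_h=0.99, p_e=0.98))  # ~0.505: little or no boost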

[9] Weissert (1997, 123) and Winsberg (1999a, 39) both follow Franklin in calling this practice ‘calibration’. However, in the context of computer simulation modelling, calibration is sometimes synonymous with tuning, i.e., with adjusting model parameter values in an ad hoc way to achieve a better fit between model output and empirical data (see e.g. Oberkampf and Trucano 2002). It seems best to stick with the terminology of ‘benchmarking’, as suggested by Oreskes et al. (1994).

[10] Moreover, this strategy applies only to simulations grounded in accepted theoretical principles; not all simulations fall into this category.

[11] Weissert (1997, 123–124) suggests that decreasing the computational time step and rerunning the simulation model would also be an analogous strategy. This would seem to increase confidence even less than employing a different solution technique, since solutions are more likely to err in the same ways when the simulations differ only in their time‐step lengths than when they also incorporate different solution techniques.
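
The following is a minimal sketch of this kind of check, using a toy model (exponential decay, dx/dt = -x) and a forward Euler scheme of my own choosing rather than anything discussed in the paper: the same solver is re-run with a smaller time step and the two results are compared.

# Hedged sketch of the time-step check described above; the model and the
# forward Euler scheme are arbitrary illustrative choices.
def euler_decay(x0, dt, t_end):
    """Integrate dx/dt = -x from t = 0 to t_end with forward Euler steps of size dt."""
    x, t = x0, 0.0
    while t < t_end:
        x += dt * (-x)
        t += dt
    return x

coarse = euler_decay(1.0, dt=0.1, t_end=2.0)
fine = euler_decay(1.0, dt=0.01, t_end=2.0)
print(coarse, fine, abs(coarse - fine))
# Agreement between the two runs raises confidence only modestly, since both
# runs share the same discretization scheme and so could err in the same ways.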

[12] In a footnote, Franklin offers a quote in which Holmes speaks of determining alternatives to be ‘impossible’ and concluding that what remains ‘must be the truth’ (see Franklin 2002, 250–8). It is clear, however, that Franklin does not take his Sherlock Holmes strategy (or any other) to deliver certainty concerning the validity of experimental results. It is the idea of rejecting alternative hypotheses, rather than doing so with certainty, that inspires his ‘Sherlock Holmes’ label for the strategy.

[13] The fact that there is relatively little reason to worry about ‘confounders’ here might be considered an epistemic advantage or strength of computer simulation. Note, however, that this does not mean that the algorithm in fact implemented as the computer simulation model is the algorithm that one intended to implement or an algorithm that estimates accurate‐enough solutions to the continuous model equations; various shortcomings in the design and programming of the algorithm are possible, but these are better categorized (given the characterizations above) as ‘sources of error’ than as ‘alternative explanations’ of the data.

[14] Franklin (1989, 2002) discusses at least nine confidence‐building strategies. That only a subset of those strategies is discussed here does not seem problematic, since Franklin considers none of his strategies to be necessary for rational belief in experimental results (see Section 4). Four of Franklin’s strategies not discussed here relate to the following: observation of expected artefacts; explanation/prediction of the result by a theory of the phenomenon; coherence or naturalness of the results; and statistical considerations. I would argue that these, too, have analogues in the context of computer simulation; but even if they do not, the five strategies just discussed seem sufficient to demonstrate that interesting parallels exist between the confidence‐building strategies available in the contexts of these two practices.

[15] Both of Franklin’s categories are encompassed by ‘sources of error’ in the broader sense of ‘ways that a study can go wrong’. In what follows, when speaking of ‘canonical sources of error’ or ‘canonical types of error’, I intend this broader meaning, i.e. ‘canonical ways that a study of this type can go wrong’.

[16] That is, in a wide range of cases, it is a strategy that is in principle appropriate to employ; it may be that in practice various plausible sources of error or alternative explanations of the results cannot be ruled out.

[17] More specifically, she likely would argue that if the strategies are epistemically significant, it is because they can sometimes be used (whether singly or in combination) in carrying out ‘severe tests’ of hypotheses concerning the absence of specific sources of experimental error (see Mayo 1996).

[18] This kind of probing of the implications of uncertainty in substantive modelling assumptions is being undertaken now by climate modellers who want to see how uncertainty in representing the climate system translates into uncertainty with regard to projections of future climate change (see Stainforth et al. 2005; Parker 2006; IPCC 2007).
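
In schematic form, this kind of probing amounts to re-running a model across a range of plausible values for an uncertain parameter or assumption and examining the resulting spread of outputs. The toy model and the parameter range in the sketch below are purely illustrative assumptions, not the climate models or experiments cited above.

# Illustrative sketch only: propagate uncertainty about a parameter into a
# spread of model outputs by re-running a toy model across plausible values.
def toy_model(sensitivity, forcing=1.0):
    """Toy response: projected change = sensitivity * forcing."""
    return sensitivity * forcing

plausible_values = [0.5, 1.0, 1.5, 2.0, 2.5]   # hypothetical range for the uncertain parameter
projections = [toy_model(s) for s in plausible_values]
print('spread of projections:', min(projections), 'to', max(projections))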

[19] Indeed, the idea that a ‘plausible’ parameter value is one that is believed to be somewhat ‘close’ to the ‘real’ value of some quantity may need to be reconsidered or even abandoned in this context (see Smith 2006 for a related discussion).

[20] An error‐statistical epistemology of computer simulation, taking inspiration from Mayo (1996, 2000), is another option. Parker (2008) sketches the beginnings of such an account. An error‐statistical approach to the epistemology of computer simulation would overlap in important ways with the Sherlock Holmes approach presented above but would highlight the importance of ‘severe testing’ for error within a non‐Bayesian framework. There is not space here to explore whether there is a need to choose between error‐statistical and Sherlock Holmes approaches to the epistemology of computer simulation nor, if so, which should be chosen and why.
