ABSTRACT

In the wake of 9/11 and the war in Iraq, the Office of the Director of National Intelligence adopted Intelligence Community Directive (ICD) 203 – a list of analytic tradecraft standards – and appointed an ombudsman charged with monitoring their implementation. In this paper, we identify three assumptions behind ICD 203: (1) tradecraft standards can be employed consistently; (2) tradecraft standards sufficiently capture the key elements of good reasoning; and (3) good reasoning leads to more accurate judgments. We then report on two controlled experiments that uncover operational constraints in the reliable application of the ICD 203 criteria for the assessment of intelligence products.

Acknowledgements

The authors would like to thank David Mandel and our many colleagues in the Melbourne CREATE research team for their comments, suggestions and advice. We would also like to thank two anonymous reviewers for comments that improved the manuscript. Finally, we thank the Co-Arg team at George Mason University and Good Judgment Inc. for permitting us to use their materials for this experiment.

Disclosure statement

No potential conflict of interest was reported by the authors.

Notes

1. Wohlstetter, Pearl Harbor: Warning and Decision, viii.

3. Cardillo, “A Cultural Evolution,” 44.

4. See note 63 below.

5. One of the main goals of this study is to offer an empirical assessment of the rubric AIS uses to evaluate the quality of reasoning in intelligence products. A future research project could build on the results presented here and assess how different refinements of the AIS rubric perform in terms of reliability and correlation with expert evaluations. Such a project should be informed by other research on evaluation rubrics, such as Zelik et al., ‘Judging Sufficiency’ and Zelik et al., ‘Understanding Rigor in Information Analysis’. Zelik and his collaborators interviewed professional information analysts and identified ‘consistent patterns’ of ‘critical vulnerabilities in information analysis’ (Zelik et al., ‘Measuring Attributes of Rigor’, 3). They organised these patterns into eight attributes of shallow analysis and developed a metric based on them to evaluate the rigor of analysis in an intelligence report. Their research is relevant to the AIS rubric for three reasons. First, Zelik et al.’s metric is inferred from intelligence professionals with experience in evaluating the kind of reasoning encountered in intelligence products. Second, their metric also uses qualitative levels of satisfaction for each criterion (though they use three instead of four). Third, they investigated the robustness of their rigor metric by applying it in other contexts, such as accident investigation analyses (Patterson et al., ‘Insights from Applying Rigor’). We believe this is a promising avenue of research, but we do not pursue it any further in this paper.

6. For example, the 2004 Senate Select Committee on Intelligence and President Bush’s Commission on the Intelligence Capabilities of the United States Regarding Weapons of Mass Destruction (Silberman and Robb, Report to the President), among others. A detailed discussion of the main findings of these committees can be found in Phythian, ‘The Perfect Intelligence Failure?’.

8. A similar recommendation can be found in the recent Report of the Iraq Inquiry (Chilcot Inquiry) carried out in the UK, which recommended that authors of intelligence products should appreciate ‘[t]he importance of precision in describing [their] position’, especially in the executive summary of the report. Moreover, it recommended that analysts ‘need to identify and accurately describe the confidence and robustness of the evidence base’, ‘to be explicit about the likelihood of events’, ‘to be scrupulous in discriminating between facts and knowledge on the one hand and opinion, judgement or belief on the other’, and finally ‘to avoid unwittingly crossing the line from supposition to certainty, including by constant repetition of received wisdom’ (Section 4.2.900).

9. This position no longer exists.

10. Fingar’s belief in the truly transformative nature of the reforms put forward by ODNI can be seen in his address to the 2007 Analytic Transformation Symposium in Chicago, Illinois: ‘The kind of changes embodied in the term analytic transformation, if they’re going to be successful, will be revolutionary’. (https://www.dni.gov/files/documents/Newsroom/Speeches%20and%20Interviews/20070905_speech.pdf).

11. As Lowenthal puts it, ‘[t]hese are not groundbreaking nor are they especially remarkable standards. In fact, most of them are fairly commonsensical but still mandatory’. (Lowenthal, “Towards a Reasonable Standard,” 307).

12. ICD 203.

13. Fingar, Reducing Uncertainty.

14. ‘[T]he primary purpose of intelligence inputs into the decision-making process is to reduce uncertainty, identify risks and opportunities, and, by doing so, deepen understanding so that those with policymaking responsibilities will make “better” decisions’. (Ibid., 25). Cf. Friedman and Zeckhauser, who argue that this widespread belief about the aim of intelligence analysis is misguided, as it ‘can impair the accuracy, clarity, and utility of intelligence estimates. These problems frequently fall under one of two complementary categories. Consequence neglect occurs when collectors, analysts, and consumers of intelligence focus too much on the probability of each possible scenario and too little on the magnitude of those scenarios’ potential consequences. Probability neglect is the reverse problem, arising when intelligence focuses predominantly on the potential consequences of various possibilities while giving less attention to their respective likelihoods. When likelihoods and consequences are not identified separately and then considered together, estimative intelligence will be incomplete, unclear, and subject to misinterpretation’. (Friedman and Zeckhauser, “Assessing Uncertainty in Intelligence,” 824–5).

15. Kent, “Words of Estimative Probability”.

16. Immerman, “Transforming Analysis,” 163.

17. See note 3 above.

18. Ibid., 43.

19. Gentry, “Has the ODNI Improved,” 641.

20. Betts, “Analysis, War, and Decision,” 61. This view seems to have been shared by Thomas Schelling (see quote above).

21. Gentry, “Intelligence Failure Reframed,” 266ff. A similar argument can be found in Immerman, “Intelligence and the Iraq,” 477.

22. Immerman, “Intelligence and the Iraq”.

23. Phythian, “The Perfect Intelligence Failure?”.

24. Ibid., 248.

25. Ibid., 249. A similar account can be found in Silberman and Robb, Report to the President, 100–105.

26. Zegart, “September 11,” 79–80. See also Zegart, “An Empirical Analysis,” 59.

27. Zegart, “An Empirical Analysis,” 54.

28. Gentry, “Has the ODNI Improved,” 638.

29. Gentry, “Has the ODNI Improved”; Marchio, “Analytic Tradecraft”.

30. Gentry, “Has the ODNI Improved”; Lowenthal, “A Disputation on Intelligence Reform”.

31. Gentry, “Has the ODNI Improved”.

32. Lowenthal, “A Disputation on Intelligence Reform,” 32. See also Gentry, “Has the ODNI Improved”.

33. Gentry, “Intelligence Failure Reframed,” 252.

34. Betts, “Analysis, War, and Decision,” 85.

35. See IRTPA and the discussion in Marrin, “Training and Educating”; and Artner et al., Assessing the Value.

36. Heuer and Pherson, Structured Analytic Techniques, 4.

37. Coulthart, “An Evidence-Based Evaluation”.

38. Ibid., 369.

39. Chang et al., “Restructuring Structured Analytic Techniques,” 1.

40. Ibid., 4.

41. Fingar, Keynote Address, 5.

42. Marrin, “Evaluating the Quality of Intelligence”.

43. Marchio, “Analytic Tradecraft”.

44. Friedman and Zeckhauser, “Why Assessing Estimative Accuracy”.

45. ‘While getting a judgment “right” is what ultimately matters most, the recipients of IC analytic products recognize that strong analytic tradecraft is more likely to result in assessments that are relevant and rigorous – what they need and value most’ (Marchio, “Analytic Tradecraft,” 182).

46. Tetlock and Mellers, “Intelligent Management of Intelligence Agencies,” 549.

47. Lowenthal, “Towards a Reasonable Standard”; Gentry, “Has the ODNI Improved”.

48. Fingar, Reducing Uncertainty, 109–111, 129–131; Friedman and Zeckhauser, “Assessing Uncertainty in Intelligence,” fn. 9.

49. Note, however, Immerman’s remark that ‘based on a small set of studies undertaken, there does seem to be a correlation between outstanding tradecraft and the accuracy of the product’. (Immerman, “Transforming Analysis,” 172). We did not have access to the studies Immerman is referring to.

50. Cited in Friedman and Zeckhauser, “Why Assessing Estimative Accuracy”.

51. Ibid., 185.

52. Ibid., 186.

53. Ibid., 187.

54. Ibid., 189.

55. Ibid., 191.

56. See note 42 above.

57. Marrin and Clemente, “Improving Intelligence Analysis”; Marrin and Clemente, “Modelling and Intelligence Analysis”.

58. Institutional approval for this study was gained on 20 September 2017 (reference: 17IC4179) and the study was preregistered on the Open Science Framework.

59. van Gelder and de Rozario, “Pursuing Fundamental Advances”.

61. We did not have access to the aggregation method that AIS uses in practice for the Rating Scale.

62. Lowenthal, “Towards a Reasonable Standard,” 307.

63. See Jonsson and Svingby, “The Use of Scoring Rubrics”; Judd et al., Being Confident about Results; Reddy and Andrade, “A Review of Rubric Use”; RiCharde, “The Humanity versus Interrater Reliability”; and Turbow and Evener, “Norming a VALUE Rubric”.

64. Due to no-shows, only 10 of the 13 reports used in Experiment 2 were evaluated by three raters.

65. The same selection of reports in Experiment 1 generated very low agreement for both the Equal Weights and the Weighted scoring systems, ICC ≈ 0 (95% CI [−1.750, 0.632]) and ICC ≈ 0 (95% CI [−1.989, 0.600]), respectively.
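
The paper does not report how these ICC estimates and confidence intervals were obtained. Purely for illustration, the sketch below shows one way such figures could be computed, assuming a two-way random-effects model and the Python pingouin library; the report identifiers and scores are hypothetical.

```python
# Minimal sketch of an inter-rater ICC with a 95% CI, as reported in note 65.
# The ICC variant and software used in the study are not stated; this example
# assumes a two-way random-effects model and uses pingouin with made-up scores.
import pandas as pd
import pingouin as pg

# Hypothetical long-format data: one row per (report, rater) score.
# Six reports, each scored by the same three raters.
ratings = pd.DataFrame({
    "report": [f"r{i}" for i in range(1, 7) for _ in range(3)],
    "rater": ["A", "B", "C"] * 6,
    "score": [62, 48, 71, 55, 60, 42, 70, 52, 66, 45, 58, 50, 80, 65, 72, 38, 47, 41],
})

# Two-way random-effects ICCs with 95% confidence intervals
# (ICC2 = single-rater agreement, ICC2k = agreement of the mean of k raters).
icc = pg.intraclass_corr(data=ratings, targets="report", raters="rater", ratings="score")
print(icc.set_index("Type").loc[["ICC2", "ICC2k"], ["ICC", "CI95%"]])
```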

66. Immerman, “Transforming Analysis,” 168.

Additional information

Funding

This research is based upon work supported by the Office of the Director of National Intelligence (ODNI), Intelligence Advanced Research Projects Activity (IARPA), under Contract [2017-16122000002]. The views and conclusions contained herein are those of the authors and should not be interpreted as necessarily representing the official policies, either expressed or implied, of ODNI, IARPA, or the US Government. The US Government is authorized to reproduce and distribute reprints for governmental purposes notwithstanding any copyright annotation therein.

Notes on contributors

Alexandru Marcoci

Dr Alexandru Marcoci is Research Assistant Professor in Philosophy at the University of North Carolina at Chapel Hill and a core faculty member in the joint UNC-Duke PPE program. He is also a Research Associate in the Centre for Philosophy of Natural and Social Science at the London School of Economics. He received an MSc in Logic from the Institute for Logic, Language and Computation at the University of Amsterdam, and an MSc in Philosophy of Science and a PhD in Philosophy from the Department of Philosophy, Logic and Scientific Method at the London School of Economics. Dr Marcoci is currently collaborating with the SWARM project (part of IARPA’s CREATE program) on measuring the quality of reasoning in intelligence reports.

Ans Vercammen

Dr Ans Vercammen is a Research Associate at the Centre for Environmental Policy at Imperial College London and a Research Fellow in the School of Biosciences at the University of Melbourne. She holds an MSc in Psychology, a PhD in Behavioural and Cognitive Neuroscience, and an MSc in Conservation Science. Dr Vercammen is currently collaborating with the SWARM project (part of IARPA’s CREATE program) on measuring the quality of reasoning in intelligence reports.

Mark Burgman

Prof. Mark Burgman is Director of the Centre for Environmental Policy at Imperial College London and PI of the SWARM project. He works on expert judgement, ecological modelling, conservation biology and risk assessment. He has written models for biosecurity, medicine regulation, marine fisheries, forestry, irrigation, electrical power utilities, mining and national park planning. He received a BSc from the University of New South Wales, an MSc from Macquarie University, Sydney, and a PhD from the State University of New York at Stony Brook.
