Cumulating evidence in environmental governance, policy and planning research: towards a research reform agenda

Pages 667-681 | Received 31 Jan 2020, Accepted 16 Apr 2020, Published online: 20 May 2020

ABSTRACT

This paper suggests that the field of environmental governance, policy and planning (EGPP) may be seen as an (emerging) scientific field, which can be characterised as ‘fragmented adhocracy’, explaining the widespread failure to produce robust and cumulative knowledge. We argue that in order to produce reliable knowledge and to become credible in the realm of policy and planning praxis, EGPP research needs a major reform impetus. To this end, we propose three areas for reform, which cover (1) an agreed canon of definitions shared within the community, while being open to reinterpretations and novel concepts; (2) the stronger use of meta-analytical methods such as the case survey methodology, or systematic reviews, to cumulate published case-based evidence; (3) a systematic recognition of the institutional, political and social context of governance interventions, which becomes increasingly important to the extent that meta-analyses reveal general patterns and trends which nonetheless vary with context. For each agenda item, we briefly formulate the motivating problem and an ideal-typical vision to strive for, and sketch out the pragmatic, epistemological and normative limits to its realisation. We close with overall reflections on our research reform agenda and suggest pathways for implementation.

As a consultant designing participatory processes, I seldom draw on social science research, because I am often questioning the validity of research results. My previous experience with research projects has been rather sobering – questionable methods, shaped by dubious assumptions. I did not feel that one should base decisions on such results. Therefore, I tend to view research as often being quite remote from praxis, disregarding potentially important aspects. Hence, I mostly rely on other practical experiences rather than research “results”. Such distrust is of course unfortunate for research which does produce reliable results which we could learn and profit from.

– E-mail from a consultant to the first author, June 2019 (own translation from German).

1. Introduction

There appears to be a growing unease among scholars of environmental governance, policy and planning that their research is hardly informing policy-making, despite a generally continued interest in the use of evidence by policy-makers (Nutley et al., 2019). Arguably, this is at least partly due to the limited ability of environmental social science to provide robust knowledge on the mechanisms through which policy and planning work towards environmental sustainability: We still do not know how and under what conditions governance interventions work towards effectively addressing urgent issues of environmental sustainability (Lange et al., 2019). This, our paper argues, is due to two main tendencies in the field: First, empirical ‘evidence’ is spread over a myriad of mostly individual case studies; while these are useful and necessary, little effort is made to cumulate knowledge and to integrate case-based evidence through meta-analytical research. Second, the literature is beset with a proliferation of incompatible and unclear concepts and a lack of consistently applied theories, which make knowledge cumulation a challenging task (see e.g. Tacconi, 2011; van der Heijden et al., 2019). As has recently been demanded for sustainability science more broadly (Pauliuk, 2019), we argue here that research in environmental governance, policy and planning needs to become more of a cumulative effort in order to ultimately inform policy.

This paper starts out from the assumption that environmental governance, policy and planning (EGPP) ultimately serve to improve environmental sustainability. Examples are legion: Participatory planning as mandated by the European Water Framework Directive aims to improve the ecological status of Europe’s waters (Newig & Koontz, 2014); collaborative governance aims to improve environmental conditions (Scott, 2015); the REDD+ mechanism encompasses policies aiming to reduce carbon emissions due to deforestation in developing countries (Zelli et al., 2017). While some governance interventions are found at the national or international level, many are implemented at relatively local levels – hence the myriad of individual case studies on adaptive management, participatory planning, collaborative, multi-level, scale adapted, polycentric, networked or hybrid governance. But do they deliver in terms of environmental sustainability? And under what circumstances?

Some preliminary remarks before we dig deeper into these issues: This article is not about evidence-based governance. It is about the provision of robust, reliable social-science evidence that bears the potential for being used by policy-makers. Moreover, we make no apodictic claims regarding our diagnoses or reform agenda, which are largely based on our experiences, thoughts, and collegial exchanges in the field. We do not advocate a one-and-only way of doing EGPP research, and we acknowledge and appreciate the diversity of EGPP research approaches out there, which should not be threatened by any pseudo-hegemonic methodological discourse. All the more, we would like to facilitate a fruitful debate on evidence cumulation and policy relevance in EGPP research. We invite readers to challenge, discuss and refine our suggestions in the best tradition of academic debate.

We start our analysis in the subsequent Section 2 by sketching a diagnosis of the lack of evidence cumulation in the research area of EGPP. We suggest that EGPP may be seen as an (emerging) scientific field, which can be characterised as ‘fragmented adhocracy’ in the sense of Whitley (2006), explaining the widespread failure to produce robust and cumulative knowledge.

In Section 3, we argue that in order to produce reliable knowledge and to become credible in the realm of policy and planning praxis, EGPP needs a major reform impetus (see Pauliuk, 2019 for a recent related call in sustainability science). To this end, we propose a research reform agenda covering three areas, which are presented and discussed in Sections 4, 5 and 6, respectively.

  • First, we argue in favour of an agreed canon of definitions shared within the community, while being open to reinterpretations and novel concepts. This could ideally be realised through wiki-supported common dictionaries.

  • Second, we advocate the stronger use of meta-analytical methods such as the case survey methodology, or systematic reviews, to cumulate published case-based evidence – drawing on both ‘successful’ and ‘unsuccessful’ cases. This may serve to distil overarching patterns (‘the intellectual gold’ in the sense of Jensen & Rodgers, 2001) from case-based research.

  • Third, we argue for a systematic recognition of the institutional, political and social context of governance interventions. This becomes increasingly important to the extent that meta-analyses reveal general patterns and trends that nonetheless vary with context. Here, we elaborate on what constitutes a ‘case’ of governance interventions as opposed to its ‘context’, and discuss challenges and opportunities arising in meta-analysis of integrating published case-based insights with knowledge on the respective context (which is currently seldom done).

For each agenda item, we briefly formulate the motivating problem and an ideal-typical vision to strive for, and sketch out the pragmatic, epistemological and normative limits to its realisation. We close with overall reflections on our agenda and suggest pathways for implementation.

2. Environmental governance, policy and planning as a scientific field of ‘fragmented adhocracy’

In scientific fields such as physics, medicine and epidemiology, scholars are subject to standardised definitions, concepts, methods and scientific practices. This enables the knowledge produced to be aggregated and transferred into the political realm, informing policies and regulatory agencies. By contrast, the field of environmental governance, policy and planning (EGPP) research appears highly dispersed (Plummer et al., 2013; Reed & Bruyneel, 2010; Tacconi, 2011; Visseren-Hamakers, 2015).

In our perception, the EGPP field resembles what has been called a ‘fragmented adhocracy’ in the sociology of science (Whitley, 2006). Fragmented adhocracies are characterised by high task uncertainty and low mutual dependence. Research, therefore, is rather idiosyncratic and lacks strong coordinating mechanisms across research institutions to systematically link strategies and results. There is no single reputational organisation that could enforce common standards, so scientists do not have to make contributions that unambiguously fit into an existing research corpus. Goals that scientists contribute to tend to be fluid, broad, and contingent upon external pressures and local requirements. As the level of scientific professionalisation in terms of standardised competence criteria, work procedures and significance criteria across research institutions is relatively low, the field is more permeable to professional and non-professional outsiders than, for example, the natural sciences, both regarding its contributors and its audience. Accordingly, standards are fairly volatile and can be interpreted differently. The fragmentation discourages integrative, standardising and coherent theoretical frameworks and promotes empirical diversity. In fragmented adhocracies, theoretical frameworks and syntheses for overarching goals are produced nonetheless. They are unlikely to come to dominate the entire field, though, as the field’s small groups sustain their strength by reproducing themselves as legitimate reputational systems, perpetuating their own common concepts, research objects and methodological approaches (Whitley, 2006). However, communities related to EGPP, for example sustainability transitions (Markard et al., 2012; Smith et al., 2010) or earth systems governance (Biermann et al., 2010), are slowly consolidating.

As different audiences and decentralised resources are available to the individual researchers, scholarly differences do not have to be resolved, but can be used to show their own originality. As Whitley (2006) puts it,

Rather than co-ordinating their research with one another, or combating the ideas and results of opponents, practitioners [i.e. researchers] in these fields develop highly individual research strategies around distinct topics and problems often with idiosyncratic methods – or at least highly tacit and non-comparable ones – in order to obtain high reputations for originality. Differentiation of contributions is a higher priority here than co-ordination of results and contribution to the collective enterprise. [… .] The proliferation of case studies in the human sciences with the expansion of practitioners can be seen as part of this process preferring differentiation and security to co-ordination and challenge.

Of course, there are many serious epistemological reasons for the differences between fields like physics, medicine or public health on the one hand and EGPP on the other, which make this comparison somewhat unfair. One of the axiomatic differences is that the former deal mostly with quantifiable phenomena including natural laws, whereas the latter addresses per se nondeterministic phenomena such as human behaviour, institutions and human-environment relations, producing results which are less generalisable. In social sciences, research problems and cognitive objects tend to be rather specific and context-sensitive (Whitley, 2006). Moreover, EGPP can hardly be regarded as a discipline in its own right with a unified framework of theoretical approaches, methods and quality criteria. Instead, the field is cultivated on the one hand by scientists with very different disciplinary backgrounds such as political science, administrative science, social sciences, planning, engineering, ecology, geography and economics, who are loosely held together by a common research topic (i.e. the human-environment system). On the other hand, there are scholars with an interdisciplinary background in environmental studies, an interdisciplinary field increasingly gaining ground in university study programmes and only slowly developing common references, heuristic concepts and (to a lesser extent) theoretical approaches.

3. Knowledge cumulation

In addition to the supposed absolute priority of the value of originality in science (Merton, 1957), what researchers in the scientific field of EGPP share with their colleagues in all other fields is the growing incentive to publish as much as possible in order to pursue an academic career (Hammarfelt, 2017). Journals first and foremost call for ‘originality’, fostering the trend towards idiosyncratic research described above. The pressure to publish cuts the time available for a deeper engagement with the works of others.

The literature referred to in the introduction as well as our own experience from researching, publishing and reviewing in the EGPP field are strong indicators that EGPP research, overall, hardly cumulates. It will, however, be important to gather robust and nuanced data to scrutinise how research in the field does (not) build on the work of others, how (un)ambiguously specific concepts are used, and so forth. This would allow more targeted ‘responses’ than we are able to provide here.

If we are, for now, to accept the diagnosis of EGPP as a fragmented adhocracy, and if we want to alleviate our field’s fragmentation for the sake of producing evidence for better science and policy, we would need to invest effort not only in coordinating our substantive research practices, methodological standards and key concepts; we would also have to address incentive structures and suitable institutions of scientific knowledge coordination, cumulation and transfer in the institutional realm (see also Pauliuk, 2019).

In this paper, we focus on what we need in order to provide robust knowledge on how EGPP can work for (and against) the benefit of ecological sustainability. We thereby discuss the cumulation of evidence as a necessary (yet not sufficient) condition for both inner-scientific progress and, ultimately, for evidence-informed environmental policy-making. Evidence is defined by the Oxford English Dictionary as ‘the available body of facts or information indicating whether a belief or proposition is true or valid’. Here in particular, we refer to the best available knowledge on either the state of an EGPP system or – more importantly – on how and under what circumstances EGPP interventions work. Best available knowledge means knowledge that, at a given point in time, is regarded as such by the EGPP community of scholars. Following Popper, we assume that EGPP evidence can never be proved but only contested and falsified. Evidence cumulates when research builds on findings of older research such that the understanding of EGPP advances. Technically, evidence cumulation can occur by either challenging (‘falsifying’) or by confirming – hence strengthening the validity of – existing research, or by adding nuances to existing research (e.g. by specifying context factors under which a previously studied EGPP intervention works). In a broader sense, knowledge cumulation refers to both cumulation of empirical evidence and of theoretical advances.

Knowledge cumulation comes with a broad spectrum and many challenges, and EGPP is by no means the only field in which a lack of cumulative research has been identified and denounced. Recently, Goyal and Howlett (2018) have found for the field of policy learning (which is of considerable importance to, and partly overlaps with, EGPP) that conceptual fragmentation and stretching prevail, and that despite decades of research, ‘the nature of research findings in this field have not been sufficiently cumulative to constitute an active research programme generating generalisable findings’ (p. 28), and ‘scholars continue to work in silos without much cross-fertilisation, or even conceptual and empirical sharing, of data, knowledge and insights’ (p. 29).

Quilley and Loyal (2005) contrast knowledge cumulation in the scientific discipline of biology with ‘cumulative disarray’ in the established social science discipline of sociology. Not buying into reductionist tendencies in biology, the authors argue that the re-emergent holism in biology and its reference to the objective non-human world allows the discipline to be inherently cumulative in its knowledge production: ‘As science, evolutionary biology is cumulative. (…) There will always be new syntheses, but these will still be syntheses of cumulative perspectives and vantage points, in relation to a natural world with which we are becoming increasingly familiar’ (Quilley & Loyal, 2005). In contrast, after sociology turned away from its early erroneously deterministic socio-biological understanding of the social world and parts of the discipline bought into a radical socio-constructivist world view, ‘the illusion of any kind of paradigmatic consensus has been shattered’, the authors state (Quilley & Loyal, 2005). Quoting Dunning and Mannell (2003, p. 1), Quilley and Loyal (2005) find that ‘[s]ociology remains ‘a multi-paradigmatic or multi-perspectival subject … conflict ridden … [and without any] overall consensus … regarding concepts, theories and methods’’, and that a majority of sociologists abandoned ‘the very idea that the investigation of social processes can be scientific, and by implication (…) the idea that it should be possible to build up, over time, a social-stock of reality-congruent ideas about the operation of social processes’ (original emphasis). This discussion is mirrored in the debates and substantial problems of modern sociology when encountering the issue of the generalisation and replication of (necessarily contingent) research findings (see e.g. Freese & Peterson, 2017; Larsson, 2009; Payne & Williams, 2005).

Against this background, many have argued that social science needs to become more ‘scientific’. Much of this refers to methodology, and unified methodological frameworks have already been proposed (see e.g. Gerring, 2015). We will not discuss social science methodology as such, which is treated in numerous books, of which King et al.’s (1994) ‘Designing Social Inquiry’ is just one – perhaps the most prominent – example. Of course, sound methodology is a precondition for cumulating evidence, both regarding the very cumulation and the studies (or evidence) that are to be cumulated. However, even if we as a scientific community were to rigorously apply sound social science methods, cumulation would still be obstructed by the lack of shared common concepts. This is what we are going to focus on in the next section.

4. Develop common concepts and frameworks

The incentive structure of the current academic publication system in the field of EGPP rewards the development of novel concepts at the expense of applications of existing concepts. Many of our key concepts are rather vague, or – at least – have different meanings in different contexts. Take, for example, the key term ‘governance’. Its (implied) meanings and connotations – what someone means when they use the term ‘governance’ in a certain context – range from ‘governance as opposed to government’ (e.g. Rhodes, 1997), to ‘good governance’ as mostly used in the development context (e.g. Weiss, 2000) and to broadly referring to political steering with or without non-state actors (e.g. Kooiman, 2003). This inconsistent use of the term ‘has limited the scope for developing a cumulative body of knowledge’ (Jordan, 2008, p. 28).

To give another example with particularly stark differences of meaning, the concept and research practice of ‘transdisciplinarity’ is intensively discussed in the scientific community. To some, notably from Anglo-America, ‘transdisciplinarity’ refers to a strongly integrated form of interdisciplinary research (e.g. Klein, 2004). To others, mostly from a European context, the term refers to research aiming to address societally relevant problems and to produce ‘socially robust’ knowledge by involving relevant scientific disciplines and non-academic actors in the research (Hirsch Hadorn et al., 2006).

Similar to the term ‘governance’, many concepts have been conflated with often normatively positive connotations. These include ‘social learning’ (which, often implicitly, assumes pro-environmental behaviour, as diagnosed by Reed et al., 2010) or ‘adaptive management/governance’, which not only is often conflated with notions of stakeholder participation (Stringer et al., 2006), but also has been confused with ‘climate change adaptation’. While the former mostly refers to adapting interventions following close monitoring of success, the latter refers to measures alleviating the (local) consequences of climate change.

In turn, new terms are created or well-known terms are given new meanings. For example,

the term ‘experiment’ makes for a highly pliable catch-all term used by academics to address the testing, piloting, and demonstrating of novel policy-designs by policymakers and practitioners. These are processes that have for long been at the core of policymaking. The term ‘experiment’ may however give the illusion of a scientific approach to these efforts. (van der Heijden, 2015)

The diagnosed problems with terms and concepts also occur with theoretical approaches on causal mechanisms. Several studies analysed how certain theoretical approaches have been used in the field, e.g. Punctuated Equilibrium Theory and Institutional Isomorphism (van der Heijden & Kuhlmann, 2018) as well as the Multiple Streams Approach, the Advocacy Coalition Framework, the Narrative Framework Theory, and the Institutional Analysis and Development Framework (van der Heijden et al., 2019). The authors find that these theoretical approaches, when applied in the field, are often used rather selectively, lack sufficient operationalisation and are short of rigorous causal analysis. Moreover, they are usually applied in small-n studies in Western democracies and suffer from conceptual stretching. Again, a more integrated take on these approaches and a focus on operationalisation and methodology may help to resolve the mentioned issues in the field and to harvest and further develop explanatory power.

There are, however, counterexamples. Take, for instance, the terminology on ‘type I’ and ‘type II’ multi-level governance, as defined by Hooghe and Marks (2003), a paper which has been cited more than 1,000 times in Scopus, with more than one fourth of the citations coming from environmental science journals. While these two types are certainly imprecise to a certain degree, one should expect only little ambiguity in their usage because of the distinct reference to Hooghe and Marks (2003) (Footnote 1).

Arguably, the EGPP field’s inherent interdisciplinarity and the fuzziness of its boundaries also contribute to conceptual ambiguity and inconsistency. If boundaries of the research field were clearly defined (the same being true for any research field), it would be easier to delimit the ‘reach’ of certain concepts. To give an extreme example: The concept of ‘power’ is entirely different in physics and in political science. The fact that this does not lead to any conceptual confusion is due to the sharp difference between the two scientific disciplines. The question of what exactly defines the boundaries of EGPP is almost impossible to answer at this stage. Like sustainability science in general, the field of EGPP is inspired by different established disciplines and hard to delimit.

The widespread lack of consistent terminology within the field of EGPP is impeding the actual challenging of ideas and – in particular – empirical findings, and hence the cumulation of knowledge, as described by Whitley (2006) for fragmented adhocracies. We argue that what is needed is an agreed canon of definitions shared within the community – while still being open to useful reinterpretations and novel concepts. We are well aware that an entire standardisation of concepts may never be attainable, nor desirable. Still, we find it useful to explore the extreme case of an ideal-typical standardisation of concepts as common in the hard sciences – before then discussing the limits from different perspectives.

Ideally, EGPP concepts would be unanimously shared by the community with as little ambiguity as possible. At the lowest level of standardisation, stark semantic differences would be resolved. Hence, the term ‘transdisciplinarity’ would either refer to a strong form of interdisciplinarity or to a research mode involving extra-academic knowledge – but not to both, depending on the usage. A higher level of standardisation would imply commonly shared definitions and concepts that bridge definition and operationalisation. According to the seminal work of Goertz (2011), these concepts may consist of dimensions and indicators that make them directly operationalisable and therefore empirically applicable. In a field where the same term is used by different researchers, or schools, with different meanings, and, vice versa, where multiple terms exist for essentially the same phenomenon, procedures will be needed to determine valid definitions of terms.

This function could be taken on by dictionaries, which, regularly updated, would represent the current state of the art in definitions of terms. Currently, only two dictionaries exist in the field, namely the ‘Dictionary and introduction to global environmental governance’ (Saunier & Meganck, 2009); and ‘A Dictionary of Environmental Economics, Science, and Policy’ (Grafton et al., 2001). Neither of them has sufficient standing in the community to truly guide the usage of terms (neither of the dictionaries reaches 100 citations in Google Scholar, which is a rather poor record). In the wider field, there are the ‘Dictionary of environment and sustainable development’, including planning and management (Gilpin, 1996; cited 132 times in Google Scholar); and ‘The concise Oxford dictionary of politics’ with a few entries regarding the environment or sustainability (McLean & McMillan, 2009, first edition 1996; cited 594 times in Google Scholar). Also, a few encyclopaedias are available, but many more closely resemble handbooks with articles on basic topics rather than definitions of concepts, and, as with the dictionaries, none of them is cited even close to 100 times.

Hence, a widely accepted dictionary would be needed, with a clear procedure to include the latest state of thinking on key concepts in the EGPP field, being essentially open to every scholar in the field, potentially involving a wiki platform for discussions on definitions (see, e.g. https://encyclopedia.pub for a recent attempt in this direction). Such a carefully crafted institutionalised procedure would have to ensure fair access to and rational deliberation on collective decision-making regarding definitions of concepts. It would have to involve professional societies, journal editors and of course us as individual researchers. Definitions would then need to be agreed upon until the next edition is released. Shared procedures will need to ensure that reinterpretations of existing concepts and possibly new concepts are regularly introduced into updated versions of the proposed dictionaries. Criteria for selection and definition of commonly agreed terms should be usefulness and compatibility with existing concepts as well as a low degree of semantic overlap with other concepts.

Such standardisation would allow for a direct comparison of individual research findings, such that one study either confirms, challenges or adds nuances to existing research – all of which are currently hardly possible because the different usage of terms implies that many studies speak past each other. While it may seem that an agreed canon of definitions would restrict research liberty, it could in fact be the opposite. We believe that commonly shared definitions will relieve researchers – in particular those working empirically – of the perpetual need to define how they understand a particular concept and hence release capacities for actual (empirical) research.

Nonetheless, such a shared dictionary procedure should not slow down academic progress. For conceptual innovations put forward in academic articles, we propose what one might call a reversal of the burden of proof: Whereas currently, authors are incentivised to create new concepts (because this is ‘innovative’ and therefore an argument for publishability), we suggest that authors who introduce new concepts or re-interpret existing ones are expected to justify – more clearly than is common usage now – what is added by these as compared to existing concepts and definitions.

Beyond shared concepts and variables, more overarching frameworks will be of importance. As Ostrom (2009) has put it: ‘Without a common framework to organize findings, isolated knowledge does not cumulate’ (p. 419). Frameworks include standardised sets of variables and their operationalisations, to be usefully employed by all research addressing a common research object. On the one hand, common research frameworks and protocols may guide specific research practices. On the other hand, applying them allows for the development of case databases for comparative and cumulative research (see e.g. the emerging Collaborative Governance Database, www.successfulpublicgovernance.com/succesful).

Certainly, there are limits to standardisation of terms, concepts and variables. Epistemologically, social sciences are different from ‘hard’ sciences, as mentioned above in Section 2. Fundamentally, many concepts are bound to a societal context – and hence to geographically and temporally varying circumstances. This is possibly one of the most crucial and contested questions to be discussed (see e.g. Reed & Meagher, 2019).

Procedurally, shared dictionaries and similar approaches would require enormous efforts and face huge challenges. How can we prevent the biases inherent to the current academic system – dominated by Western democracies – from multiplying when standardisation ultimately restricts researchers in their ‘free’ usage of terms? The implementation of a dictionary and respective procedures would place high demands on the dictionary’s organisational and scientific management, the scientific community and each individual concept user. Nobody can or should be forced to subscribe and commit to a dictionary’s development and use. Discussions on such issues should therefore include a broad spectrum of scholars within the field of EGPP research.

5. Evidence cumulation through meta-analytical and comparative research

Cumulation of research means that individual research evidence builds on other research such that the state of scientific knowledge progresses – in the EGPP field regarding what governance interventions work under what circumstances. In order to obtain strong evidence for science and policy, a first step is to synthesise the already existent evidence which is dispersed across many individual (case) studies (Grönlund & Åström, 2009; Parkhurst, 2017, p. 120). Ideally, this would require the first agenda item – common concepts and research practices – already to be resolved, allowing for comparability of results and methods.

Several approaches and methods exist to synthesise individual studies, depending on both the nature of the individual studies (qualitative or quantitative) and on the method of aggregation (qualitative/narrative or structured/quantitative) (Newig & Fritsch, 2009a). The most widespread form of knowledge cumulation from individual studies is the systematic review. While being fully transparent (and systematic) about the inclusion and exclusion of studies, systematic reviews use a qualitative or narrative form of research synthesis. Ideally, a systematic review serves to distil key insights from a clearly defined set of studies. The ‘Preferred Reporting Items for Systematic Reviews and Meta-Analyses’ (PRISMA), developed originally for the field of medicine and health care, provides an established standard for reporting inclusion and exclusion of studies (Moher et al., 2009).

In the field of medicine, the Cochrane Collaboration has set important standards and provides tens of thousands of systematic reviews on the effectiveness of medical treatments (https://www.cochrane.org). The Collaboration for Environmental Evidence (https://www.environmentalevidence.org) may be seen as a step in this direction for the EGPP field, but as of now, governance, policy and/or planning issues have not yet been addressed. The Cochrane and Campbell Collaboration standards, however, when applied to social sciences, are criticised for their supposedly inadequate way of proving causation by looking for repeated occurrences of outcomes (Pawson, 2006). In ‘realist’ systematic reviews, the reviewer therefore looks for the causal mechanisms and eventually theories that are examined or assumed in the literature and searches for related evidence in order to refine them, allowing for more diverse sets of answers to the question of what works (Pawson et al., 2005, see also next section).

If quantitative studies are synthesised and this is done in a quantitative way (usually drawing on statistical tools), then we are dealing with a meta-analysis in the strict sense. While adhering to similar standards on inclusion of studies as systematic reviews, meta-analyses seek to rigorously synthesise previous studies, often by averaging standardised correlation coefficients or effect size measures, followed by significance tests and regression analyses on effect sizes (e.g. Glass, 1977; Hunter & Schmidt, 2004). By combining findings from different studies, meta-analyses allow researchers to generalise to a larger population, enhance statistical power, and consider the role of different contexts in shaping the relationship between intervention and outcome variables. However, such meta-analyses are virtually non-existent in the EGPP field simply because few statistical studies exist in the first place. As we have not yet seen multiple quantitative studies on the same subject, meta-analyses in this strict sense will, if they are used at all, remain the exception in the field (Fritsch & Newig, in prep.).
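To make the averaging step concrete, the following is a minimal sketch of a fixed-effect meta-analysis of correlation coefficients. It assumes one common approach (Fisher's z transform with inverse-variance weights), which is not necessarily the procedure of the works cited above; the function name and the three studies are hypothetical and purely illustrative.

```python
import math

def pooled_correlation(studies):
    """Fixed-effect pooling of correlations via Fisher's z transform.

    `studies` is a list of (r, n) pairs: a correlation and its sample size.
    Returns (pooled_r, z_statistic).
    """
    weighted_sum = 0.0
    total_weight = 0.0
    for r, n in studies:
        z = math.atanh(r)   # Fisher z transform of the correlation
        w = n - 3           # inverse of the sampling variance 1 / (n - 3)
        weighted_sum += w * z
        total_weight += w
    mean_z = weighted_sum / total_weight
    standard_error = 1.0 / math.sqrt(total_weight)
    # Back-transform the weighted mean to the correlation scale
    return math.tanh(mean_z), mean_z / standard_error

# Hypothetical studies (assumption): correlation between an intervention
# and an outcome variable, with the sample size of each study
hypothetical_studies = [(0.30, 40), (0.10, 120), (0.25, 60)]
pooled_r, z_stat = pooled_correlation(hypothetical_studies)
print(f"pooled r = {pooled_r:.3f}, z = {z_stat:.2f}")
```

A fixed-effect model assumes one true effect shared by all studies. Where effects are expected to vary with context, as argued throughout this paper, a random-effects model or a regression on effect sizes would be the more plausible choice.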

The most common form of empirical EGPP research is the qualitative case study. As a way to synthesise multiple case studies on a common topic in a structured way, the case-survey method – or case-based meta-analysis – has been developed (Jensen & Rodgers, 2001; Larsson, 1993; Lucas, 1974; Yin & Heald, 1975). Although the method has been around for some time now, it has relatively seldom been used, despite its great potential for the field (Fritsch & Newig, in prep., identified 31 case surveys until 2018 in the broader area of public policy).

Notably, the case survey method makes it possible to compensate for a lack of consistent terminology by developing a coding scheme through which individual case studies (which are typically already part of the published record) with varying terminology are processed and systematically compared – provided that the included studies offer enough detail to infer how terms and concepts are in fact understood. To this end, a coding scheme has to be developed that contains definitions that are as unequivocal as possible and allow for similar coding results independent of the person who is conducting the coding. This way, qualitative case narratives can be transformed into quantitative data (see Newig et al., 2013, 2019 for an example of such an effort). Using averages over multiple coders’ assessments helps to strengthen the reliability of such interpretative exercises. Quantitative data allow for structured evaluation with statistical or otherwise structured methods such as Qualitative Comparative Analysis.
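The two reliability devices mentioned here, agreement between coders and averaging over coders, can be sketched as follows. This is an illustrative sketch only: Cohen's kappa is one of several possible agreement measures and is an assumption here, not the measure used in the cited case surveys, and all codes and ratings are hypothetical.

```python
from statistics import mean

def cohen_kappa(codes_a, codes_b):
    """Cohen's kappa for two coders assigning categorical codes to the same cases.

    Undefined (division by zero) if chance agreement is exactly 1,
    i.e. both coders use one and the same category throughout.
    """
    n = len(codes_a)
    observed = sum(a == b for a, b in zip(codes_a, codes_b)) / n
    categories = set(codes_a) | set(codes_b)
    # Agreement expected by chance, from each coder's category frequencies
    expected = sum(
        (codes_a.count(c) / n) * (codes_b.count(c) / n) for c in categories
    )
    return (observed - expected) / (1 - expected)

def average_ratings(ratings_by_coder):
    """Average each case's rating over all coders (one list of ratings per coder)."""
    return [mean(case_ratings) for case_ratings in zip(*ratings_by_coder)]

# Hypothetical binary codes for eight cases: public involved (1) or not (0)
coder_1 = [1, 1, 0, 1, 0, 0, 1, 0]
coder_2 = [1, 0, 0, 1, 0, 1, 1, 0]
kappa = cohen_kappa(coder_1, coder_2)

# Hypothetical 0-4 ratings of three cases by three coders
averaged = average_ratings([[3, 1, 4], [4, 1, 3], [3, 2, 4]])
print(kappa, averaged)
```

Averaging is only meaningful for ordinal or interval-type codes. For categorical codes, disagreements would instead have to be resolved by discussion or a decision rule.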

There are of course many limits to meta-analyses. The more idiosyncratic and non-generalisable the original research is, the more difficult and risky is its meta-analysis. Apart from literature reviews and theory development, cumulating evidence usually requires either quantitative source material or the quantification of qualitative source material in order to conduct a meta-analysis. The necessity of quantification limits both the types of data that can be processed and the types of results that can be produced through meta-analyses. Meta-analyses in general and quantification in particular always strongly reduce information. The cases are stripped of their richness and context and are in danger of being reduced to a level that no longer does justice to them. This increases the probability of misinterpretations. In some cases, the meta-analytical researcher may even investigate a different research question than the authors of the included studies did. In these cases, the data the meta-analytical researcher is looking for may not be there, or is at risk of merely being read into the respective study. Alternatively, the data may be strongly selective and skewed, as it is only a side product of the original research and was never intended to be a research focus. Decent case surveys try to reduce these risks by a comprehensive coding scheme and multiple coders for each case, aiming for a high inter-coder reliability. However, strong biases present in the data and translation losses regarding both substantial concepts and data transformations cannot be fully prevented. Meta-analyses and their results therefore require cautious and reflective interpretation as well.

Now that we have sketched out possibilities and limits to cumulate already existing research with meta-analytical methods, a second step would be to encourage new research that is comparative in nature and that strives for causal inference from the very beginning. Comparative research includes everything from qualitative comparative case studies (at least two) to experiments to large-n quantitative studies. Whereas small-n case studies allow for ‘deep causality’ and multi-faceted, context-sensitive descriptions and analysis, they usually fail to establish overall causal patterns and generalisable results. Large-n studies, on the other hand, are able to produce generalisable results and identify correlations, but fall short on exploring deeper causal mechanisms, especially since panel data is often not available. With Qualitative Comparative Analysis (QCA), there is a set-theoretic methodological approach suitable for mid-n studies. Looking for conjunctions of necessary and sufficient conditions that cause an outcome, QCA accounts for causal complexity (Ragin, 2000; Schneider & Wagemann, 2012). What we need are studies with research designs that methodologically allow for triangulation, combining qualitative and quantitative research approaches. Single case studies will always play an important part in establishing thick descriptions, critical perspectives or complex causality via process tracing (Bennett & Checkel, 2015). But purposefully designed comparative research, even when conducted with only a small set of cases, is better suited to explain what works, why, and how (Blatter & Haverland, 2014).
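The set-theoretic logic of QCA can be illustrated with a small sketch of crisp-set consistency scores for sufficiency and necessity. This covers only one building block and is not a full QCA, which would also involve a truth table and logical minimisation; the condition names and the five cases are hypothetical.

```python
def sufficiency_consistency(cases, conditions, outcome):
    """Share of cases showing all `conditions` that also show `outcome`."""
    with_conditions = [c for c in cases if all(c[name] for name in conditions)]
    if not with_conditions:
        return None  # the conjunction is not observed in any case
    return sum(c[outcome] for c in with_conditions) / len(with_conditions)

def necessity_consistency(cases, condition, outcome):
    """Share of cases showing `outcome` that also show `condition`."""
    with_outcome = [c for c in cases if c[outcome]]
    if not with_outcome:
        return None  # the outcome is not observed in any case
    return sum(c[condition] for c in with_outcome) / len(with_outcome)

# Hypothetical crisp-set data (1 = present, 0 = absent); names are assumptions
cases = [
    {"participation": 1, "legal_mandate": 1, "effective": 1},
    {"participation": 1, "legal_mandate": 1, "effective": 1},
    {"participation": 1, "legal_mandate": 0, "effective": 0},
    {"participation": 0, "legal_mandate": 1, "effective": 0},
    {"participation": 0, "legal_mandate": 1, "effective": 1},
]

# Is the conjunction of both conditions sufficient for the outcome?
suff = sufficiency_consistency(cases, ["participation", "legal_mandate"], "effective")
# Is each single condition necessary for the outcome?
nec_mandate = necessity_consistency(cases, "legal_mandate", "effective")
nec_participation = necessity_consistency(cases, "participation", "effective")
print(suff, nec_mandate, nec_participation)
```

In this toy data set, the conjunction of both conditions is fully consistent with sufficiency, while only the legal mandate is fully consistent with necessity. This is the kind of conjunctural pattern that correlational methods tend to miss.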

Apart from purely methodological improvements, there are two additional key factors to facilitate meta-analytical and comparative research: First, for causal inference, not only better research designs, but also explanatory theoretical approaches are required for interpreting data relations as causal relations. This also allows for cumulating knowledge on a theoretical and conceptual level. Purely empirical research often is too shallow (and descriptive) in this regard and may be complemented with theory-driven empirical analysis of EGPP. Second, comparative and meta-analytical research can benefit from developing and using joint databases and standardised repositories for storing and accessing case study data. For example, the platform Participedia.net allows entering and searching case studies of public participation (with currently 200+ environment-related cases out of 2000+ cases in total). Using a wiki system, cases are entered in a somewhat structured way, allowing for easy access but not for quantitative analysis.

6. Mind the context: towards a multi-level framework of governance interventions

Scholars and practitioners with a strong instrumental policy orientation are particularly interested in ‘what works’, which of course depends on the context: What works for whom under what circumstances, and how and why does it work (Sanderson, 2002)? Elinor Ostrom’s famous research on how self-organised governance ‘works’ towards the sustainable use of natural resources (Ostrom, 1990) has been path-breaking not least because it is prudently limited to a rather specific context: that of relatively local settings with clearly defined boundaries, involving not too many actors. Repeated attempts to either ‘upscale’ her findings to larger contexts, or to transfer them to different sectors have remained relatively unsuccessful (see, e.g. Cashore & Bernstein, 2020). Context matters, and context-insensitive ‘panaceas’ likely fail.

One strand of methodology which shares our goal of doing scientifically rigorous, context-sensitive and at the same time policy-relevant research is so-called ‘realist evaluation’ (Pawson, 2013; Pawson & Tilley, 1997). Realist evaluation addresses similar questions and employs similar methods and scientific standards as explanatory EGPP research that examines policy interventions and their contexts in a cumulative manner. We can learn from the school of realistic evaluation that context-mechanism-outcome (CMO) configurations [note 2] are suitable policy-relevant key entities of analysis, enabling evidence cumulation and theory building, testing and refining (Pawson & Tilley, 1997). Pawson and Tilley speak of CMO configuration focussing to describe the process of adapting the mechanisms to a local context and of CMO configuration abstraction to describe the creation of middle-range theories and analytical frameworks that enable meaningful comparisons and families of answers to the initial question. In realistic evaluation, these processes – when being informed not only by the respective case, but also by already existing research and data on other relevant cases and CMOs – eventually lead to cumulation by theory building (in contrast to cumulation by purely empirical generalisation and replication, which usually do not work in social sciences) [note 3]. In CMOs, the context conditions, i.e. enables or disables, the intended change mechanism, which – in its causation – is therefore contingent on the contextual variance within and between programmes (Pawson, 2013; Pawson & Tilley, 1997).

We will see that the discussion of context is inextricably linked to the question of what is a ‘case’ in EGPP, and hence, where leverage points for EGPP interventions are located. What counts as a case and what counts as a circumstance or context depends very much on the research object. For example, a particular national environmental policy can be either a case for a study with a national focus, or a key context variable for a study focusing on the subnational (regional or local) level. Working towards an integrative, multi-level framework of EGPP interventions, the following leverage points could be identified – which broadly collapse governance levels from top to bottom with institutional levels of constitutional choice, collective choice and operational choice in the sense of Kiser and Ostrom (1982) [note 4]:

  • The overall institutional system (typically of a country, but supra-national structures such as the European Union or international regimes will likewise be important). It comprises the polycentricity, institutional fragmentation and multi-layeredness of decision-making systems, including its dynamics such as decentralisation, spatial scaling and institutional fit; policy ‘streams’ (Kingdon, 1999) and ‘landscape’ developments (Geels, 2002); administrative culture including policy experimentation and systematic learning (Newig et al., 2016). As a context factor, it is important to study its impact on policy change and local governance processes. As a leverage point for interventions (essentially the other side of the coin), the question is how to design institutional systems that best allow for effective EGPP mechanisms at national and sub-national levels.

  • Major policy change (including policy mixes), typically on a national level (but also on supra- or subnational level). Major policy decisions serve to trigger, guide and shape transformation through enabling and fostering (niche) innovation (Raven, 2012), through fundamentally re-structuring a sector (e.g. mandated phase-out of nuclear energy), or through major infrastructure or other investment programmes, and often require sub-national or local implementation.

  • Local EGPP processes, including implementation of higher-level policies. They determine how decisions are made, often implementing major policy decisions. Here in particular, different modes of governance (Driessen et al., 2012) can be considered. For example, in what stages and to what extent are private sector and civil society organisations, or even broader sections of the public, involved (Emerson & Nabatchi, 2015; Newig & Fritsch, 2009b)?

Depending on the focus of analysis, the former two can both figure as interventions (i.e. changing the political system, enacting or changing grand policies) and as context (political system as context for policies and their implementation; policies as context for local implementation decisions).

These distinctions will be particularly relevant when it comes to integrating case-based evidence through meta-analytical methods. Many if not most EGPP case studies are available on relatively local(ised) interventions. Arguably, much of the effectiveness of interventions depends on the context: political and institutional conditions as described above, as well as cultural norms, customs and practices that vary with time and space. How should a case survey of – say – local adaptive governance processes pay attention to these contextual factors? (1) One obvious source would be the original studies included in the case survey themselves. However, very often they will only report on specific circumstances (such as the environmental problem at stake, the prehistory of governance attempts before introducing adaptive governance etc.) but usually not treat the broader political and cultural system, current environmental policies, experimentalist traditions or aspects of meta-governance, which for many readers may be taken for granted. But it is precisely these contextual conditions that matter when comparing case studies from very different geographic locations, or across larger time spans. (2) Another source of contextual knowledge could be academic publications on these contexts. If, for example, there were a recent policy stipulating experimentation and adaptive governance in a particular country, a published analysis of this would provide important contextual knowledge for the local adaptive governance studies within this country. Pursuing this path would result in a sort of ‘multi-level case survey analysis’, in which local case studies are embedded within studies of (national) policies and/or institutional systems. To our knowledge, this has not been attempted so far. Indeed, this procedure would risk including imbalanced context information, which may vary greatly from country to country. (3) A third source of contextual knowledge would be databases on country characteristics (see Rose, 2020 for an overview). However, these do not cover all countries to an equal extent, with a bias towards reliable statistics available mostly for countries of the Global North. Hence, information from databases could be usefully combined with insights from academic publications.
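As an illustration of how such a ‘multi-level case survey analysis’ might be organised as data, the following sketch embeds coded local cases in country-level context records and flags cases for which no context information is available. All field names, country labels and values are hypothetical assumptions; the sketch does not describe an existing database or procedure.

```python
def attach_context(local_cases, country_context):
    """Embed each coded local case in its country-level context record."""
    merged = []
    for case in local_cases:
        context = country_context.get(case["country"])
        merged.append({
            **case,
            "context": context,                  # None if no data are available
            "context_missing": context is None,  # flag gaps instead of hiding them
        })
    return merged

# Hypothetical local case codes from a case survey
local_cases = [
    {"case_id": "A1", "country": "Country X", "adaptive_governance": 3},
    {"case_id": "B7", "country": "Country Y", "adaptive_governance": 1},
]

# Hypothetical context records, e.g. from country databases or published
# policy analyses; Country Y is deliberately left uncovered
country_context = {
    "Country X": {"experimentation_policy": True, "decentralised": True},
}

merged_cases = attach_context(local_cases, country_context)
uncovered = [c["case_id"] for c in merged_cases if c["context_missing"]]
print(uncovered)
```

Keeping an explicit flag for missing context makes the uneven coverage discussed above visible in the data set itself, so that it can be reported or modelled rather than silently ignored.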

One of the most challenging tasks will be to find the ‘optimal’ scale for contextualisation, or generalisability. Neither do ‘universal laws’ exist, nor can we rely for policy advice on highly contextualised knowledge applicable only for one unique case. What appears most helpful, therefore, is a medium degree of ‘contextualisation’ of evidence that allows researchers to transfer abstracted case knowledge and to contribute to middle-range theories. Ideally, access to aggregated empirical research results should allow practitioner-analysts to adjust the ‘scale’ of universality or specificity themselves.

Following up our excursus on realist evaluation, ‘realist synthesis’ suggests programme theories (more abstract versions of policy mechanisms such as target setting or legal restrictions) as points of departure for meta-analyses that take context-dependencies systematically into account. In this kind of inquiry, sets of programmes with family resemblance (i.e. programme types) are analysed. Once a programme theory is chosen, reviews can be conducted to analyse the theory in different programme domains and eventually single programmes, thereby identifying context-sensitive sub-theories and eventually context-mechanism-outcome configurations. Suggested principles for context-sensitive, cumulating, synthesising inquiries include: (programme) theory as unit of analysis; conceptual abstraction to enable a common language and differentiated analyses; use of reusable conceptual platforms related to sets of programme theories to avoid starting from scratch and to allow for cumulation; model building to specify the conditions under which each programme theory works; and adjudicating between rival hypotheses on why a programme works (Pawson, 2013). Since in the real (social) world – as opposed to the laboratory – contexts are constantly changing, they will, however, always set limits to cumulability (Pawson & Tilley, 1997).

7. Conclusions

Where is the field heading? Will EGPP research be prone to ‘death’ if not integrated more strongly (see Banville & Landry, 1989 on a different research field)? Given the structure of the field, we daresay that this is not likely. It will just continue to produce research that is largely not policy-relevant.

With the reform agenda put forward here, we have considered areas that individual researchers can pursue. In addition, we also need stronger institutions (Pauliuk, 2019). As van der Hel and Biermann (2017) note, many existing science institutions in the field of sustainability governance – such as Future Earth – fail to deliver on their claims and are lacking the required political authority. Moreover, funding agencies should require applicants to clearly lay out in what way their research builds on that of others and how they contribute to cumulating research, to building a common body of knowledge and evidence. Hence, cumulation of (case-based) research ought to be more strongly rewarded by the funding system – but also by the publishing system, addressing publishers and editors. For example, journals such as JEPP could encourage special issues, or institutionalise special sections, particularly devoted to meta-analytical research [note 5].

Certainly, science-policy interfaces are needed which facilitate the exchange between the two societal spheres and the uptake of EGPP findings. And yet – to return to the initial quote preceding this article – the EGPP community first needs to ‘deliver’, lest research results continue to be disregarded by decision-makers in policy, planning and public administration, or used selectively for instrumental political reasons only (Nutley et al., 2019).

While the proposed research reform agenda certainly has a positivist tone, we readily acknowledge that interpretive research is needed to deeply understand and criticise EGPP. Our aim is not to replace or abandon critical interpretive and more holistic ‘thick description’ research, but to transform the more ‘explanatory’ one to allow for evidence cumulation.

An additional aspect we should reflect on even more is the interdisciplinarity of the field of EGPP. Is it appropriate to demand common concepts, research practices and designs from a scientific field that is as of now strongly interdisciplinary? Are we imposing disciplinary standards on an interdisciplinary field, do we discipline the interdisciplinary? And doesn't it also have its benefits to work in a ‘fragmented adhocracy’? Our provisional answer is probably a hesitant ‘yes’ to all of these questions. On the one hand, it is a good thing to have open borders, allowing for many different inter-, trans- and disciplinary perspectives on a broad common research object. As we have tentatively diagnosed in Section 2, EGPP is inter- and multidisciplinary, loosely held together by a common topic. This allows for mutual learning and problem-driven research. It accounts for the interconnectedness of the human-environmental system and in principle enables a holistic perspective that is often lost in specialist disciplinary research. On the other hand, the field of EGPP is both expanding and consolidating, as is indicated, for example, by the increasing number of relevant journals and the forming of a scientific community around EGPP. This is an opportunity to actively shape these transitions. From our point of view, further fragmentation will not help us. Building the foundations for cumulating evidence, in contrast, can promote both scientific and political progress in the long run. To reach this goal we would not have to give up our interdisciplinarity, but we would need to work more closely together.

Acknowledgements

We would like to thank Nicolas Jager, Jeroen van der Heijden, the participants of the JEPP@21 workshop and two anonymous reviewers for constructive feedback on earlier versions of this article.

Disclosure statement

No potential conflict of interest was reported by the author(s).

Notes on contributors

Jens Newig is professor of governance and sustainability and head of the Institute of Sustainability Governance (INSUGO) at Leuphana University Lüneburg, Germany.

Michael Rose is a post-doctoral researcher and lecturer at the Institute of Sustainability Governance, Leuphana University of Lüneburg, Germany. He holds a Dr. phil. (equivalent to PhD) and a diploma (equivalent to a master's degree) in political science.

Notes

1 The actual usage and application of the typology by authors citing Hooghe and Marks (2003) would warrant an empirical analysis. For other highly cited works, van der Heijden and Kuhlmann (2017) find that usage of unambiguously referenced concepts may still turn out ambiguous.

2 Note that in original realistic evaluation, these mechanisms refer to the theories of change (drawing on stakeholders’ choices/reasoning, capacities/resources and embeddedness and therefore the interplay of agency and structure) that practitioners, stakeholders and policy makers have in mind when developing and implementing a programme.

3 See Section 2 for related problems in the field of EGPP.

4 Hill and Hupe (2003) warn against confusing geographical ‘layers’ and institutional ‘scales’. While there is a point in this, in the practice of multi-level governance systems, fundamental policy decisions are typically made on higher jurisdictional levels, while implementation typically occurs locally.

5 We thank one anonymous reviewer for explicitly suggesting this point.

References