Letters

The need for open science practices and well-conducted replications in the field of gambling studies

Michael J. A. Wohl, Nassim Tabri & John M. Zelenski
Pages 369-376 | Received 04 Sep 2019, Accepted 22 Sep 2019, Published online: 29 Sep 2019

A field of study is only as strong as the research it produces. Although catchy and surprising results may generate acclaim, research has real value only to the extent that its findings are true – that is, replicable (Popper, Citation1959; Schmidt, Citation2009). In the field of gambling studies, the scientific merit of published research has added importance given its ability to influence the regulation of gambling as well as efforts to prevent and treat disordered gambling. Unfortunately, there is reason to believe that many fields in the social sciences have a weak rate of replicability (Camerer et al., Citation2016, Citation2018; Open Science Collaboration, Citation2015), a revelation that has led to the so-called ‘replication crisis’ or ‘credibility revolution’. As a result of this crisis, attention has been directed at why many published findings fail to replicate. Central to these discussions have been: 1) low statistical power – the probability that a study will reject the null hypothesis (H0) when a specific alternative hypothesis (H1) is true – in published empirical research, and 2) the lack of open science practices (e.g. being explicit about how p-value(s) were obtained, posting data and materials to an online public repository). Low statistical power and the lack of open science practices decrease the replicability and credibility of reported findings. The net result has been decreased confidence in the results of published research and reforms aimed at improving the credibility of new reports.

Regrettably, the field of gambling studies has lagged behind other fields in recognizing this replication crisis and taking appropriate action. To our knowledge, as of 2019, not a single session of any academic gambling-oriented conference has been dedicated to the replication crisis, and not a single paper (until this issue of International Gambling Studies) has been published on the relevance of the replication crisis for the field of gambling studies. A positive sign is International Gambling Studies’ move toward encouraging contributors to fully disclose their research practices (i.e. open science). We laud the Journal for this change. We also laud International Gambling Studies for opening its pages to replication-based reports. This is a much-needed change – one that will help all stakeholders (e.g. researchers, policymakers, industry-related professionals) assess the replicability of past work in gambling studies. That said, we caution stakeholders that not all replications are created equal. It is important that all interested parties come to understand what constitutes a convincing replication, and why there is a need for a credibility revolution.

In the current commentary, we discuss why, regardless of journal mandates, it is imperative that contributors and reviewers attend to a priori statistical power analyses and effect sizes, and ensure full and open disclosure of research practices. Sound research practices are essential to the vitality, credibility, and replicability of any field of research. A deep dive into all of these issues is, however, beyond the scope of the current commentary; instead, we have two goals. First, we hope to initiate a conversation that focuses researchers’ attention on the replicability of previously published research in the field of gambling studies. Second, we want to shine a light on how, as a field, we can do better to ensure that the replicability rate of future research allows all stakeholders to have a high degree of confidence in gambling-focused research.

Trouble in the henhouse: is the field of gambling studies heading toward a replication crisis?

It has long been known that statistical power is important to conducting informative research, and that the sample size employed should provide at least 80% power to detect the effect of interest if it exists (see Cohen, Citation1965). Nonetheless, many studies in the psychological literature, which includes much of the field of gambling studies, are statistically underpowered (Bakker, van Dijk, & Wicherts, Citation2012; Cohen, Citation1990; Maxwell, Citation2004). In fact, the power of a typical two-group between-participant design has been estimated to be lower than 50% (Bakker et al., Citation2012; Stanley, Carter, & Doucouliagos, Citation2018)! Yet, the vast majority of published studies (around 90%) report p-values below the typical threshold for statistical significance (i.e. α = .05; Fanelli, Citation2010; Sterling, Rosenbaum, & Weinkam, Citation1995). Taken together, this means that many of the findings published in social science journals (including this very journal) may be due to pure chance (or other shenanigans); not unlike the outcomes of electronic gaming machines. How could this be?
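To put numbers on the power problem before turning to its likely culprits, the sketch below works through a minimal, hypothetical example: an independent-samples comparison of two groups of 30 participants each with a true standardized mean difference of d = 0.4. These values are illustrative choices on our part, not estimates drawn from any particular gambling study.

```python
# A minimal power illustration (assumed values, not from any particular
# gambling study): an independent-samples t-test comparing two groups of
# 30 participants each, with a true effect of d = 0.4.
from statsmodels.stats.power import TTestIndPower

analysis = TTestIndPower()

# Achieved power at alpha = .05 (two-sided) with n = 30 per group.
power = analysis.power(effect_size=0.4, nobs1=30, alpha=0.05,
                       ratio=1.0, alternative='two-sided')
print(f"Power with 30 participants per group: {power:.2f}")  # ~0.34, far below the 80% target
```

Under these (quite common) conditions, roughly two out of three studies would fail to detect an effect that is actually there, which makes the preponderance of significant published results all the more puzzling.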

One culprit is publication bias (Rosenthal, Citation1979). Editors and reviewers have a strong tendency to accept for publication research that confirms reported hypotheses. That is, they tend to like a ‘clean story’ in which the tests of the hypotheses are statistically significant (ps < .05). The result is that large effect sizes with p-values that hover just under .05 are overrepresented in the published literature (Simmons, Nelson, & Simonsohn, Citation2011). A second culprit, owing perhaps to the first culprit, is that researchers tend to put studies that are not statistically significant and the corresponding materials (experimental conditions or variables) in a (now digital) file-drawer; even more perniciously, many analyses get left in the proverbial file drawer such that only a misleading subset of conditions, measures, and covariates are reported (John, Loewenstein, & Prelec, Citation2012; Simmons et al., Citation2011). Such selectivity can result in an inflated number of false positives appearing on the published record. A third culprit is that researchers have been trained explicitly (by supervisors and/or by reading Bem, Citation1987) or implicitly (via the review process) to HARK (Hypothesizing After the Results are Known). The result is that research reports are not necessarily written in line with the researcher’s original hypothesis. HARK-ing, akin to the other culprits in the replication crisis, has led to intellectually dishonest research reports. These culprits also raise the rate of false positives that appear in the published literature, thus undermining confidence in that literature (Kerr, Citation1998).

We want to be clear: we are not implying that researchers are consciously acting in bad faith (although, unfortunately, some do). It is easy for researchers to convince themselves that a secondary hypothesis or exploratory analysis was, in fact, what they had originally intended to study (see hindsight bias; Fischhoff, Citation1975), or that something unexpected has been ‘discovered’, without considering that such exploration invalidates the interpretation of p-values. Additionally, norms that develop in a field of study (e.g. HARK-ing) can skew the research practices of researchers who frequently publish in that field. A collective effort is needed to improve research practices. However, a special burden is placed on the gatekeepers (editors and reviewers) to spur change.

Improving research practices (and the replication rate)

To improve the field of gambling studies, researchers must come to embrace open science practices (either on their own or via journal mandate). Open science begins and ends with full and accurate reporting of all aspects of the reported research (Cumming, Citation2014). To this end, current best practice is for researchers to, in advance of running the study, determine the hypothesis or hypotheses that will be tested and then pre-register their idea(s) as well as the plans for analysis on an open access research repository (e.g. Open Science Framework or AsPredicted; for pre-registration resources see https://cos.io/prereg/). This reduces HARK-ing by decreasing researchers’ degrees of freedom. Exploration is still possible, but pre-registration more clearly distinguishes exploration from confirmatory tests. A link to the pre-registration document should be included in the text of the article.

Included in the pre-registration document (as well as in the text of the paper) should be a report of an a priori power analysis. Despite the centrality of power in null-hypothesis significance testing (NHST; Gigerenzer, Citation2004), formal power analyses are rarely reported in the gambling literature. As previously stated, research in the social sciences (and likely the field of gambling studies) has been underpowered. The problem with underpowered (low N) studies is that the observed p-value is very unstable when small samples are drawn from the population (i.e. sampling variability; Cumming, Citation2014; Kline, Citation2004). The solution, however, is not simply a matter of running large-N studies (which has been made easier in the wake of some gambling operators’ willingness to allow researchers access to player-account data, as well as crowdsourced data collection; e.g. Amazon’s Mechanical Turk). This is because the probability of observing a statistically significant effect increases as the sample size increases, such that even trivially small effects can reach statistical significance (Cohen, Citation1994; Meehl, Citation1978). Indeed, with very large samples it is often more informative when a p-value is not significant than when it is significant.

Given the relation between the p-value and sample size, some researchers suggest the p-value should be either abandoned entirely (Ioannidis, Citation2005) or only interpreted alongside an effect size estimation (Cumming, Citation2012). Regardless of the direction the field of gambling studies takes, researchers should determine what constitutes a meaningful effect size before they conduct a study. For example, is a Cohen’s d of 0.01 sufficient to claim that a responsible gambling tool has a meaningful effect on limit adherence? What about d = 0.10? (see Funder & Ozer, Citation2019 for a discussion about what constitutes a reasonable effect size in psychological research). Once this determination is made, statistical programs like G*Power (Faul, Erdfelder, Lang, & Buchner, Citation2007) or Monte Carlo simulations can be used to establish the required N to detect that effect size. At present, the field of gambling studies is not engaging with such issues.
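To illustrate what such an a priori calculation involves, the sketch below performs the same kind of computation that G*Power automates, here using Python’s statsmodels package. The smallest effect size of interest (d = 0.20) and the 80% power criterion are illustrative assumptions on our part, not recommendations tied to any particular gambling outcome.

```python
# A priori sample size calculation: how many participants per group are
# needed for 80% power to detect the smallest effect deemed meaningful?
# The smallest effect size of interest (d = 0.20) is an illustrative choice.
import math
from statsmodels.stats.power import TTestIndPower

analysis = TTestIndPower()
n_per_group = analysis.solve_power(effect_size=0.20, power=0.80, alpha=0.05,
                                   ratio=1.0, alternative='two-sided')
print(f"Required sample size: {math.ceil(n_per_group)} per group")  # ~394 per group, ~788 in total
```

Reporting the inputs to such a calculation (the assumed smallest effect size of interest, alpha, and the desired power) in the pre-registration document makes the rationale for the chosen sample size transparent to reviewers and readers.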

Researchers also need to be open about the method they employed. Simmons, Nelson, and Simonsohn (Citation2014) suggested that all published papers should explicitly note how the sample size was determined and report all data exclusions (if any; i.e. how many participants were excluded from the analysis and why). Additionally, researchers should note all manipulations and measures in the study, even if they were not included in the reported analyses. This will allow readers of a given research report to get a complete understanding of what participants experienced. Such explicit disclosures signal quality and rigour, an advantage over papers that omit them. Moreover, researchers are learning that open science practices facilitate replication – an issue to which we now turn.

A replication by any other name?

Given the weak replication rate in other fields of study, it is high time for the field of gambling studies to engage in some replication efforts. For those new to the credibility revolution (or for those who have not thought deeply about replication), it is important to underscore that a completely exact replication of an original study is impossible (see Rosenthal, Citation1991; Tsang & Kwan, Citation1999). Instead, the best a researcher can hope for is a replication study that differs from the original only in terms of the participants who take part (see Schmidt, Citation2009). However, such replications are difficult (if not impossible) to conduct due to contextual differences between the time, culture, and space in which the original study was embedded and those of the replication study. For example, there have been substantial changes in the gambling landscape in Ontario – Ontario Lottery and Gaming (OLG) has sold all casinos in the province to private industry. For this reason, among others, the release of a responsible gambling tool developed by OLG and tested by Wohl, Davis, and Hollingshead (Citation2017) is uncertain. The tool in question asks players to estimate how much money they spent playing EGMs at casinos and slot machine venues in Ontario and then provides real-time feedback about their actual expenditures. Wohl et al. (Citation2017) observed that the feedback resulted in a downregulation of gambling expenditures during a three-month follow-up period among those who underestimated their losses. Given the aforementioned changes to the gambling landscape in Ontario, a close replication of Wohl et al. (Citation2017) may be a difficult undertaking. This reality raises the question: What makes a good and convincing replication?

According to Brandt et al.'s (Citation2014) ‘replication recipe’, researchers who want to conduct a convincing close replication must first explicitly determine the effect of interest. If the original study was a 2 × 2 between-participants design, the researcher must decide whether the effect of interest is one of the main effects or the interaction (or perhaps a simple slope). This should be pre-registered. The researcher must then follow the method of the original study as closely as possible (e.g. participant recruitment, materials, analyses). In the best-case scenario, the researcher discusses the replication attempt with the researcher(s) who conducted the original study and receives the materials necessary to conduct a close replication. Of course, this is not always possible: the original researchers may be unwilling to cooperate, the original materials may have been lost, or the researcher(s) who conducted the original study may have left academia or be deceased. Close replications become more difficult to conduct as the number of deviations mounts, because deviations introduce researcher degrees of freedom into the replication (i.e. judgement calls; see Simons, Shoda, & Lindsay, Citation2017).

Where close replications are not possible, researchers could attempt a conceptual replication. For example, if a replication of Wohl et al. (Citation2017) was attempted, researchers could test the idea that many players are extraordinarily poor estimators of their gambling expenditures (time and money), and that making underestimates of expenditure salient to players can reduce their subsequent expenditures. Regardless of whether the replication is ‘close’ or ‘conceptual’, any deviation from the original study should be noted in text, and the results of the replication interpreted in light of those differences. The original researcher can help minimize the degrees of freedom of those who wish to conduct a replication by openly archiving materials and being explicit about the conditions under which they expect their results to generalize or not (Simons et al., Citation2017).

Next, the replication must have high statistical power. Simonsohn (Citation2015) suggested that replication studies should have 2.5 times the original number of participants. This ensures good power to detect the original effect (if it is true) and, importantly, a sample large enough to demonstrate that the original effect is implausible if the replication yields null results (notwithstanding alternative explanations around method or sample differences). This guideline is useful for ‘lab sized’ studies, but less applicable in large-N studies, where the critical issue is less about low statistical power and more about whether the effect size in question is large enough to be of theoretical and practical interest. Ultimately, what matters is that the researcher uses a sample size that allows for the detection of the effect size of interest; the 2.5 times guideline for smaller-N studies assumes that the effect size of interest is one the original study had a reasonable chance of detecting. In very large-N studies, a more theoretical or practical approach to effect size and power must be taken: statistically ‘significant’ effects may be trivially small and not practically significant (and such tiny effects are also more plausibly due to idiosyncratic causes or confounds). Very large-N (original) studies do have the benefit of estimating effect sizes very accurately, so replication studies are on firmer ground using the original effect size to power a replication effort.

A replication of a large-N study's effect may therefore justifiably use a smaller sample, either because the large original study provides a firm and precise estimate of a large effect size, or because the replication is still well powered to detect the smallest effect size of theoretical or practical importance. For example, if a study of 50,000 participants found a large correlation between trait impulsivity and gambling losses, it would not take a similarly large-N study to persuasively replicate this finding. If that same large effect had been found in a small-N study, the 2.5 times sample size guideline would apply instead.
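The sketch below contrasts the two sizing strategies just described. The specific numbers (an original ‘lab sized’ study with 80 participants, and r = .10 as the smallest correlation of practical interest) are illustrative assumptions on our part, not values taken from any published gambling study.

```python
# Two ways to size a replication sample (illustrative values only).
import math
from scipy.stats import norm

# (a) Simonsohn's (2015) heuristic for replicating a small original study:
#     collect 2.5 times the original sample.
original_n = 80                    # hypothetical 'lab sized' original study
print(f"2.5x heuristic: {math.ceil(2.5 * original_n)} participants")  # 200

# (b) For an effect from a very large-N original study (e.g. 50,000 player
#     accounts), power the replication for the smallest correlation of
#     practical interest instead. Fisher z approximation, two-sided alpha = .05:
smallest_r = 0.10                  # assumed smallest effect worth detecting
z_alpha = norm.ppf(1 - 0.05 / 2)
z_power = norm.ppf(0.80)
n_needed = math.ceil(((z_alpha + z_power) / math.atanh(smallest_r)) ** 2 + 3)
print(f"N for 80% power to detect r = .10: {n_needed}")               # ~783
```

Either way, the sample size decision (and its rationale) belongs in the replication's pre-registration document.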

In the spirit of open science, the researcher who conducts a replication (akin to those who conducted the original study) should make all details of the replication open access (i.e. freely and publicly available). Although we also encourage researchers of original data to do that, open access takes on a special importance in replications, especially where there is a failure to replicate. All researchers should be able to evaluate the ‘closeness’ of the replication, and how convincing the outcome is, for themselves.

The reaction of the research community: a cautionary conclusion

From a positivist perspective, with enough sound empirical research, the ‘truth’ can be approximated. Getting closer to ‘truth’ starts with sound, original, open science. Researchers in the social sciences (and the Editors of International Gambling Studies) are coming to understand that the path to truth is also paved with replications. There will be bumps along the way. Theory and research findings that are now sacred cows may need to be revised or discarded. It is important that, when that happens, the research community refrains from ad hominem attacks on the researcher(s) who conducted the original research and/or the replicators. It is worth repeating that no replication is direct or exact. Science is a cumulative endeavour. As the field of gambling studies opens the door to replications, those who work in the field should keep an open mind as replications come online. The research conducted in this field aims to understand and prevent disordered gambling and to help those who have a gambling disorder recover. This should remain the focus. It will be heartening to see research replicate. However, some of the findings in the field of gambling studies will fail to replicate. When this occurs, the field should reflect on why the original finding failed to replicate, whether the effect is real (or not), and how to improve our science. Real players (present and past) are affected by the work the field of gambling studies produces.

We hope this commentary initiates a needed conversation about the state of the literature in the field of gambling studies, why replication and transparency are important, and how the results of attempts to replicate are framed. Perhaps more importantly, however, we hope this commentary serves as a call to researchers in the field of gambling studies to adopt open science practices. We suggest researchers read the Open Science Collaboration (Citation2017) chapter about workflow as a pragmatic place to learn about how to open their science. A move to accept replication studies is hollow without a parallel move toward open science.

Conflict of interest

Dr. Michael J. A. Wohl. Funding: Dr. Wohl has received gambling-related research funding from provincial granting agencies in Canada. He has also received direct and/or indirect research funds from the gambling industry in Canada, the United States of America, the United Kingdom, Australia, and Sweden. Additionally, he has served as a consultant for the gambling industry in Canada, the United States of America, New Zealand, and Australia. Funds were also awarded by federal granting agencies in Canada and Australia for research unrelated to gambling. A detailed list can be found in his curriculum vitae (http://carleton.ca/bettermentlabs/wp-content/uploads/CV.pdf). No funds were provided to Wohl for the preparation of the current paper. Constraints on publishing: There are no constraints on publishing. Competing interests: There are no competing interests to declare.

Dr. Nassim Tabri. Funding: Dr. Tabri has received gambling-related research funding from Gambling Research Exchange Ontario (GREO), a provincial granting agency in Canada. He has also received indirect research funds from the gambling industry in Canada, the United States, and the United Kingdom. Additionally, he has served as a consultant for the gambling industry in Canada and the United States. Funds were also awarded by federal granting agencies in Canada for research unrelated to gambling. No funds were provided to Tabri for the preparation of the current paper. Constraints on publishing: There are no constraints on publishing. Competing interests: There are no competing interests to declare.

Dr. John M. Zelenski. Funding: Dr. Zelenski has received no funding for addiction-related research, nor has he consulted on these issues with industry. Constraints on publishing: There are no constraints on publishing. Competing interests: There are no competing interests to declare.

All authors declare that there are no potential conflicts of interest with respect to the research, authorship, and/or publication of this article. Correspondence concerning this article should be addressed to Michael J. A. Wohl, Department of Psychology, Carleton University, 1125 Colonel By Drive, Ottawa, Ontario, Canada, K1S 5B6. Tel: (902) 520-2600 x 2908, E-mail: [email protected]

References

  • Bakker, M., van Dijk, A., & Wicherts, J. M. (2012). The rules of the game called psychological science. Perspectives on Psychological Science, 7, 543–554.
  • Bem, D. (1987). Writing the empirical journal article. In M. P. Zanna & J. M. Darley (Eds.), The compleat academic: A practical guide for the beginning social scientist (pp. 171–204). Mahwah, NJ: Lawrence Erlbaum Associates, Inc.
  • Brandt, M. J., IJzerman, H., Dijksterhuis, A., Farach, F. J., Geller, J., Giner-Sorolla, R., … van 't Veer, A. (2014). The replication recipe: What makes for a convincing replication? Journal of Experimental Social Psychology, 50, 217–224.
  • Camerer, C. F., Dreber, A., Forsell, E., Ho, T.-H., Huber, J., Johannesson, M., … Wu, H. (2016). Evaluating replicability of laboratory experiments in economics. Science, 351(6280), 1433–1436.
  • Camerer, C. F., Dreber, A., Holzmeister, F., Ho, T.-H., Huber, J., Johannesson, M., … Wu, H. (2018). Evaluating the replicability of social science experiments in Nature and Science between 2010 and 2015. Nature Human Behaviour, 2, 637–644.
  • Cohen, J. (1965). Some statistical issues in psychological research. In B. B. Wolman (Ed.), Handbook of clinical psychology (pp. 95–121). New York: McGraw-Hill.
  • Cohen, J. (1990). Things I have learned (thus far). American Psychologist, 45, 1304–1312.
  • Cohen, J. (1994). The earth is round (p < .05). American Psychologist, 49, 997–1003.
  • Cumming, G. (2012). Understanding the new statistics: Effect sizes, confidence intervals, and meta-analysis. New York: Routledge.
  • Cumming, G. (2014). The new statistics: Why and how. Psychological Science, 25, 7–29.
  • Fanelli, D. (2010). “Positive” results increase down the hierarchy of the sciences. PLoS ONE, 5(4), Article e10068.
  • Faul, F., Erdfelder, E., Lang, A.-G., & Buchner, A. (2007). G*Power 3: A flexible statistical power analysis program for the social, behavioral, and biomedical sciences. Behavior Research Methods, 39, 175–191.
  • Fischhoff, B. (1975). Hindsight ≠ foresight: The effect of outcome knowledge on judgment under uncertainty. Journal of Experimental Psychology: Human Perception and Performance, 1, 288–299.
  • Funder, D. C., & Ozer, D. J. (2019). Evaluating effect size in psychological research: Sense and nonsense. Advances in Methods and Practices in Psychological Science, 2, 156–168.
  • Gigerenzer, G. (2004). Mindless statistics. The Journal of Socio-Economics, 33, 587–606.
  • Ioannidis, J. P. A. (2005). Why most published research findings are false. PLoS Medicine, 2, e124.
  • John, L. K., Loewenstein, G., & Prelec, D. (2012). Measuring the prevalence of questionable research practices with incentives for truth telling. Psychological Science, 23, 524–532.
  • Kerr, N. L. (1998). HARKing: Hypothesizing after the results are known. Personality and Social Psychology Review, 2, 196–217.
  • Kline, R. B. (2004). Beyond significance testing: Reforming data analysis methods in behavioral research. Washington, DC: American Psychological Association.
  • Maxwell, S. E. (2004). The persistence of underpowered studies in psychological research: Causes, consequences, and remedies. Psychological Methods, 9, 147–163.
  • Meehl, P. E. (1978). Theoretical risks and tabular asterisks: Sir Karl, Sir Ronald, and the slow progress of soft psychology. Journal of Consulting and Clinical Psychology, 46, 806–834.
  • Open Science Collaboration. (2015). Estimating the reproducibility of psychological science. Science, 349, aac4716.
  • Open Science Collaboration. (2017). Maximizing the reproducibility of your research. In S. O. Lilienfeld & I. D. Waldman (Eds.), Psychological science under scrutiny: Recent challenges and proposed solutions (pp. 1–21). New York: Wiley.
  • Popper, K. R. (1959). The logic of scientific discovery. London: Hutchinson.
  • Rosenthal, R. (1979). The file drawer problem and tolerance for null results. Psychological Bulletin, 86, 638–641.
  • Rosenthal, R. (1991). Replication in behavioral research. In J. W. Neuliep (Ed.), Replication research in the social sciences (pp. 1–30). Newbury Park, CA: Sage.
  • Schmidt, S. (2009). Shall we really do it again? The powerful concept of replication is neglected in the social sciences. Review of General Psychology, 13, 90–100.
  • Simmons, J. P., Nelson, L. D., & Simonsohn, U. (2011). False-positive psychology: Undisclosed flexibility in data collection and analysis allows presenting anything as significant. Psychological Science, 22, 1359–1366.
  • Simmons, J. P., Nelson, L. D., & Simonsohn, U. (2014). P-curve and effect size: Correcting for publication bias using only significant results. Perspectives on Psychological Science, 9, 666–681.
  • Simons, D. J., Shoda, Y., & Lindsay, D. S. (2017). Constraints on generality (COG): A proposed addition to all empirical papers. Perspectives on Psychological Science, 12, 1123–1128.
  • Simonsohn, U. (2015). Small telescopes: Detectability and the evaluation of replication results. Psychological Science, 26, 559–569.
  • Stanley, T. D., Carter, E. C., & Doucouliagos, H. (2018). What meta-analyses reveal about the replicability of psychological research. Psychological Bulletin, 144, 1325–1346.
  • Sterling, T. D., Rosenbaum, W. L., & Weinkam, J. J. (1995). Publication decisions revisited: The effect of the outcome of statistical tests on the decision to publish and vice versa. The American Statistician, 49, 108–112.
  • Tsang, E. W., & Kwan, K. M. (1999). Replication and theory development in organizational science: A critical realist perspective. Academy of Management Review, 24, 759–780.
  • Wohl, M. J. A., Davis, C. G., & Hollingshead, S. J. (2017). How much have you won or lost? Personalized behavioral feedback about gambling expenditures regulates play. Computers in Human Behavior, 70, 437–445.
