Editorial

The ASA's Statement on p-Values: Context, Process, and Purpose

Ronald L. Wasserstein and Nicole A. Lazar

In February 2014, George Cobb, Professor Emeritus of Mathematics and Statistics at Mount Holyoke College, posed these questions to an ASA discussion forum:

Q: Why do so many colleges and grad schools teach p = 0.05?

A: Because that's still what the scientific community and journal editors use.

Q: Why do so many people still use p = 0.05?

A: Because that's what they were taught in college or grad school.

Cobb's concern was a long-worrisome circularity in the sociology of science based on the use of bright lines such as p < 0.05: “We teach it because it's what we do; we do it because it's what we teach.” This concern was brought to the attention of the ASA Board.

The ASA Board was also stimulated by highly visible discussions over the last few years. For example, ScienceNews (Siegfried 2010) wrote: “It's science's dirtiest secret: The ‘scientific method’ of testing hypotheses by statistical analysis stands on a flimsy foundation.” A November 2013 article in Phys.org Science News Wire (2013) cited “numerous deep flaws” in null hypothesis significance testing. A ScienceNews article (Siegfried 2014) on February 7, 2014, said “statistical techniques for testing hypotheses…have more flaws than Facebook's privacy policies.” A week later, statistician and “Simply Statistics” blogger Jeff Leek responded. “The problem is not that people use P-values poorly,” Leek wrote, “it is that the vast majority of data analysis is not performed by people properly trained to perform data analysis” (Leek 2014). That same week, statistician and science writer Regina Nuzzo published an article in Nature entitled “Scientific Method: Statistical Errors” (Nuzzo 2014). That article is now one of the most highly viewed Nature articles, as reported by altmetric.com (http://www.altmetric.com/details/2115792#score).

Of course, it was not simply a matter of responding to some articles in print. The statistical community has been deeply concerned about issues of reproducibility and replicability of scientific conclusions. Without getting into definitions and distinctions of these terms, we observe that much confusion and even doubt about the validity of science is arising. Such doubt can lead to radical choices, such as the one taken by the editors of Basic and Applied Social Psychology, who decided to ban p-values (null hypothesis significance testing) (Trafimow and Marks 2015). Misunderstanding or misuse of statistical inference is only one cause of the “reproducibility crisis” (Peng 2015), but to our community, it is an important one.

When the ASA Board decided to take up the challenge of developing a policy statement on p-values and statistical significance, it did so recognizing this was not a lightly taken step. The ASA has not previously taken positions on specific matters of statistical practice. The closest the association has come to this is a statement on the use of value-added models (VAM) for educational assessment (Morganstein and Wasserstein 2014) and a statement on risk-limiting post-election audits (American Statistical Association 2010). However, these were truly policy-related statements. The VAM statement addressed a key educational policy issue, acknowledging the complexity of the issues involved, citing limitations of VAMs as effective performance models, and urging that they be developed and interpreted with the involvement of statisticians. The statement on election auditing was also in response to a major but specific policy issue (close elections in 2008), and said that statistically based election audits should become a routine part of election processes.

By contrast, the Board envisioned that the ASA statement on p-values and statistical significance would shed light on an aspect of our field that is too often misunderstood and misused in the broader research community, and, in the process, provide the community a service. The intended audience would be researchers, practitioners, and science writers who are not primarily statisticians. Thus, this statement would be quite different from anything previously attempted.

The Board tasked Wasserstein with assembling a group of experts representing a wide variety of points of view. On behalf of the Board, he reached out to more than two dozen such people, all of whom said they would be happy to be involved. Several expressed doubt about whether agreement could be reached, but those who did said, in effect, that if there was going to be a discussion, they wanted to be involved.

Over the course of many months, group members discussed what format the statement should take, tried to more concretely visualize the audience for the statement, and began to find points of agreement. That turned out to be relatively easy to do, but it was just as easy to find points of intense disagreement.

The time came for the group to sit down together to hash out these points, and so in October 2015, 20 members of the group met at the ASA Office in Alexandria, Virginia. The two-day meeting was facilitated by Regina Nuzzo, and by the end of it the group had developed a good set of points around which the statement could be built.

The next 3 months saw multiple drafts of the statement, reviewed by group members, by Board members (in a lengthy discussion at the November 2015 ASA Board meeting), and by members of the target audience. Finally, on January 29, 2016, the Executive Committee of the ASA approved the statement.

The statement development process was lengthier and more controversial than anticipated. For example, there was considerable discussion about how best to address the issue of multiple potential comparisons (Gelman and Loken 2014). We debated at some length the issues behind the words “a p-value near 0.05 taken by itself offers only weak evidence against the null hypothesis” (Johnson 2013). There were differing perspectives about how to characterize various alternatives to the p-value and in how much detail to address them. To keep the statement reasonably simple, we did not address alternative hypotheses, error types, or power (among other things), and not everyone agreed with that approach.

As the end of the statement development process neared, Wasserstein contacted Lazar and asked if the policy statement might be appropriate for publication in The American Statistician (TAS). After consideration, Lazar decided that TAS would provide a good platform to reach a broad and general statistical readership. Together, we decided that the addition of an online discussion would heighten the interest level for the TAS audience, giving an opportunity to reflect the aforementioned controversy.

To that end, a group of discussants was contacted to provide comments on the statement. You can read their statements in the online supplement, and a guide to those statements appears at the end of this editorial. We thank Naomi Altman, Douglas Altman, Daniel J. Benjamin, Yoav Benjamini, Jim Berger, Don Berry, John Carlin, George Cobb, Andrew Gelman, Steve Goodman, Sander Greenland, John Ioannidis, Joseph Horowitz, Valen Johnson, Michael Lavine, Michael Lew, Rod Little, Deborah Mayo, Michele Millar, Charles Poole, Ken Rothman, Stephen Senn, Dalene Stangl, Philip Stark and Steve Ziliak for sharing their insightful perspectives.

Of special note is the following article, which is a significant contribution to the literature about p-values and statistical significance.

Greenland, S., Senn, S.J., Rothman, K.J., Carlin, J.B., Poole, C., Goodman, S.N., and Altman, D.G.: “Statistical Tests, P-values, Confidence Intervals, and Power: A Guide to Misinterpretations.”

Though there was disagreement on exactly what the statement should say, there was high agreement that the ASA should be speaking out about these matters.

Let us be clear. Nothing in the ASA statement is new. Statisticians and others have been sounding the alarm about these matters for decades, to little avail. We hoped that a statement from the world's largest professional association of statisticians would open a fresh discussion and draw renewed and vigorous attention to changing the practice of science with regard to the use of statistical inference.

Guide to the Online Supplemental Material to the ASA Statement on P-Values and Statistical Significance

Many of the participants in the development of the ASA statement contributed commentary about the statement or matters related to it. Their comments are posted as online supplements to the statement. We provide here a list of the supplemental articles.

Supplemental Material to the ASA Statement on P-Values and Statistical Significance

  • Altman, Naomi: Ideas from multiple testing of high dimensional data provide insights about reproducibility and false discovery rates of hypotheses supported by p-values

  • Benjamin, Daniel J, and Berger, James O: A simple alternative to p-values

  • Benjamini, Yoav: It's not the p-values’ fault

  • Berry, Donald A: P-values are not what they're cracked up to be

  • Carlin, John B: Comment: Is reform possible without a paradigm shift?

  • Cobb, George: ASA statement on p-values: Two consequences we can hope for

  • Gelman, Andrew: The problems with p-values are not just with p-values

  • Goodman, Steven N: The next questions: Who, what, when, where, and why?

  • Greenland, Sander: The ASA guidelines and null bias in current teaching and practice

  • Ioannidis, John P.A.: Fit-for-purpose inferential methods: abandoning/changing P-values versus abandoning/changing research

  • Johnson, Valen E.: Comments on the “ASA Statement on Statistical Significance and P-values” and marginally significant p-values

  • Lavine, Michael, and Horowitz, Joseph: Comment

  • Lew, Michael J: Three inferential questions, two types of P-value

  • Little, Roderick J: Discussion

  • Mayo, Deborah G: Don't throw out the error control baby with the bad statistics bathwater

  • Millar, Michele: ASA statement on p-values: some implications for education

  • Rothman, Kenneth J: Disengaging from statistical significance

  • Senn, Stephen: Are P-Values the Problem?

  • Stangl, Dalene: Comment

  • Stark, P.B.: The value of p-values

  • Ziliak, Stephen T: The significance of the ASA statement on statistical significance and p-values

Ronald L. Wasserstein and Nicole A. Lazar

[email protected]

American Statistical Association, 732 North Washington Street,

Alexandria, VA 22314-1943.



ASA Statement on Statistical Significance and P-Values


1. Introduction

Increased quantification of scientific research and a proliferation of large, complex datasets in recent years have expanded the scope of applications of statistical methods. This has created new avenues for scientific progress, but it also brings concerns about conclusions drawn from research data. The validity of scientific conclusions, including their reproducibility, depends on more than the statistical methods themselves. Appropriately chosen techniques, properly conducted analyses and correct interpretation of statistical results also play a key role in ensuring that conclusions are sound and that uncertainty surrounding them is represented properly.

Underpinning many published scientific conclusions is the concept of “statistical significance,” typically assessed with an index called the p-value. While the p-value can be a useful statistical measure, it is commonly misused and misinterpreted. This has led to some scientific journals discouraging the use of p-values, and some scientists and statisticians recommending their abandonment, with some arguments essentially unchanged since p-values were first introduced.

In this context, the American Statistical Association (ASA) believes that the scientific community could benefit from a formal statement clarifying several widely agreed upon principles underlying the proper use and interpretation of the p-value. The issues touched on here affect not only research, but research funding, journal practices, career advancement, scientific education, public policy, journalism, and law. This statement does not seek to resolve all the issues relating to sound statistical practice, nor to settle foundational controversies. Rather, the statement articulates in nontechnical terms a few select principles that could improve the conduct or interpretation of quantitative science, according to widespread consensus in the statistical community.

2. What is a p-Value?

Informally, a p-value is the probability under a specified statistical model that a statistical summary of the data (e.g., the sample mean difference between two compared groups) would be equal to or more extreme than its observed value.
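To make this definition concrete, here is a minimal sketch (ours, not part of the statement) that estimates a p-value by permutation for simulated two-group data. The assumed “specified statistical model” is that group labels are exchangeable under the null hypothesis; all numbers and names are illustrative.

```python
# Illustrative only: estimate a p-value by permutation, mirroring the
# definition above. The "specified statistical model" here is that group
# labels are exchangeable under the null hypothesis of no difference.
import numpy as np

rng = np.random.default_rng(0)
group_a = rng.normal(loc=0.0, scale=1.0, size=30)  # simulated data
group_b = rng.normal(loc=0.5, scale=1.0, size=30)

observed = group_b.mean() - group_a.mean()
pooled = np.concatenate([group_a, group_b])

n_perm = 10_000
more_extreme = 0
for _ in range(n_perm):
    rng.shuffle(pooled)
    diff = pooled[30:].mean() - pooled[:30].mean()
    if abs(diff) >= abs(observed):                 # "equal to or more extreme"
        more_extreme += 1

print(f"observed difference: {observed:.3f}")
print(f"permutation p-value: {more_extreme / n_perm:.4f}")
```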

3. Principles

1. P-values can indicate how incompatible the data are with a specified statistical model.

A p-value provides one approach to summarizing the incompatibility between a particular set of data and a proposed model for the data. The most common context is a model, constructed under a set of assumptions, together with a so-called “null hypothesis.” Often the null hypothesis postulates the absence of an effect, such as no difference between two groups, or the absence of a relationship between a factor and an outcome. The smaller the p-value, the greater the statistical incompatibility of the data with the null hypothesis, if the underlying assumptions used to calculate the p-value hold. This incompatibility can be interpreted as casting doubt on or providing evidence against the null hypothesis or the underlying assumptions.

2. P-values do not measure the probability that the studied hypothesis is true, or the probability that the data were produced by random chance alone.

Researchers often wish to turn a p-value into a statement about the truth of a null hypothesis, or about the probability that random chance produced the observed data. The p-value is neither. It is a statement about data in relation to a specified hypothetical explanation, and is not a statement about the explanation itself.
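The distinction can be seen in a simulation (ours, with illustrative parameters: half the nulls true, an effect of 0.5 standard deviations otherwise, and small samples). Among results with p < 0.05, the proportion coming from a true null is far from 5%.

```python
# Illustrative simulation of Principle 2: "p < 0.05" does not mean there is
# a 5% chance the null hypothesis is true. Half the simulated studies have a
# true null; the rest have a modest real effect studied with a small sample.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
n_studies, n = 20_000, 10
null_true = rng.random(n_studies) < 0.5        # half the nulls are true
effect = np.where(null_true, 0.0, 0.5)         # effect size when null is false

p_values = np.empty(n_studies)
for i in range(n_studies):
    a = rng.normal(0.0, 1.0, n)
    b = rng.normal(effect[i], 1.0, n)
    p_values[i] = stats.ttest_ind(a, b).pvalue

significant = p_values < 0.05
print(f"P(null is true | p < 0.05) = {null_true[significant].mean():.2f}")
# Typically around 0.2 in this setup -- four times the naive reading of 0.05.
```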

3. Scientific conclusions and business or policy decisions should not be based only on whether a p-value passes a specific threshold.

Practices that reduce data analysis or scientific inference to mechanical “bright-line” rules (such as “p < 0.05”) for justifying scientific claims or conclusions can lead to erroneous beliefs and poor decision making. A conclusion does not immediately become “true” on one side of the divide and “false” on the other. Researchers should bring many contextual factors into play to derive scientific inferences, including the design of a study, the quality of the measurements, the external evidence for the phenomenon under study, and the validity of assumptions that underlie the data analysis. Pragmatic considerations often require binary, “yes-no” decisions, but this does not mean that p-values alone can ensure that a decision is correct or incorrect. The widespread use of “statistical significance” (generally interpreted as “p ≤ 0.05”) as a license for making a claim of a scientific finding (or implied truth) leads to considerable distortion of the scientific process.
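A small numerical sketch (hypothetical estimates, ours) shows why the divide is arbitrary: two studies with nearly identical results can fall on opposite sides of 0.05 (cf. Gelman and Stern 2006 in the reference list below).

```python
# Hypothetical numbers illustrating the "bright line" problem: two studies
# with nearly identical effect estimates land on opposite sides of p = 0.05,
# yet the data are essentially equivalent.
from scipy import stats

def two_sided_p(estimate, se):
    z = estimate / se
    return 2 * stats.norm.sf(abs(z))           # normal approximation

for name, est, se in [("Study A", 2.0, 1.0), ("Study B", 1.9, 1.0)]:
    p = two_sided_p(est, se)
    verdict = "significant" if p < 0.05 else "not significant"
    print(f"{name}: estimate = {est}, p = {p:.3f} -> {verdict}")
# Study A: p = 0.046 ("significant"); Study B: p = 0.057 ("not significant"),
# although the two estimates differ by only 5%.
```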

4. Proper inference requires full reporting and transparency.

P-values and related analyses should not be reported selectively. Conducting multiple analyses of the data and reporting only those with certain p-values (typically those passing a significance threshold) renders the reported p-values essentially uninterpretable. Cherry-picking promising findings, also known by such terms as data dredging, significance chasing, significance questing, selective inference, and “p-hacking,” leads to a spurious excess of statistically significant results in the published literature and should be vigorously avoided. One need not formally carry out multiple statistical tests for this problem to arise: Whenever a researcher chooses what to present based on statistical results, valid interpretation of those results is severely compromised if the reader is not informed of the choice and its basis. Researchers should disclose the number of hypotheses explored during the study, all data collection decisions, all statistical analyses conducted, and all p-values computed. Valid scientific conclusions based on p-values and related statistics cannot be drawn without at least knowing how many and which analyses were conducted, and how those analyses (including p-values) were selected for reporting.
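The inflation is easy to reproduce. In the sketch below (ours; all parameters illustrative), a hypothetical researcher tests 20 independent outcomes, all pure noise, and reports only the smallest p-value.

```python
# Illustrative simulation of selective reporting: run 20 independent tests
# on pure-noise data and report only the smallest p-value. Every null is
# true, yet "p < 0.05" is reported far more often than 5% of the time.
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
n_studies, n_tests, n = 5_000, 20, 30

false_claims = 0
for _ in range(n_studies):
    data = rng.normal(0.0, 1.0, size=(n_tests, n))   # all nulls true
    p_min = min(stats.ttest_1samp(row, 0.0).pvalue for row in data)
    if p_min < 0.05:
        false_claims += 1

print(f"'Significant' finding reported in {false_claims / n_studies:.0%} "
      f"of studies")   # about 1 - 0.95**20, i.e., 64%, versus the nominal 5%
```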

5. A p-value, or statistical significance, does not measure the size of an effect or the importance of a result.

Statistical significance is not equivalent to scientific, human, or economic significance. Smaller p-values do not necessarily imply the presence of larger or more important effects, and larger p-values do not imply a lack of importance or even lack of effect. Any effect, no matter how tiny, can produce a small p-value if the sample size or measurement precision is high enough, and large effects may produce unimpressive p-values if the sample size is small or measurements are imprecise. Similarly, identical estimated effects will have different p-values if the precision of the estimates differs.
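Both directions of this disconnect are easy to demonstrate (a sketch with simulated data; the effect sizes and sample sizes are ours).

```python
# Illustrative only: statistical significance is not practical importance.
# A negligible effect with a huge sample yields a tiny p-value, while a
# large effect with a small sample often does not reach p < 0.05.
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)

tiny_effect = rng.normal(0.02, 1.0, size=200_000)  # 0.02 SD, n = 200,000
large_effect = rng.normal(0.80, 1.0, size=8)       # 0.80 SD, n = 8

print("tiny effect, huge n:  p =", stats.ttest_1samp(tiny_effect, 0.0).pvalue)
print("large effect, tiny n: p =", stats.ttest_1samp(large_effect, 0.0).pvalue)
# The first p-value is minuscule despite a scientifically trivial effect;
# the second is often above 0.05 despite a large effect.
```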

6. By itself, a p-value does not provide a good measure of evidence regarding a model or hypothesis.

Researchers should recognize that a p-value without context or other evidence provides limited information. For example, a p-value near 0.05 taken by itself offers only weak evidence against the null hypothesis. Likewise, a relatively large p-value does not imply evidence in favor of the null hypothesis; many other hypotheses may be equally or more consistent with the observed data. For these reasons, data analysis should not end with the calculation of a p-value when other approaches are appropriate and feasible.
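One way to quantify “weak evidence,” drawn from the literature rather than from the statement itself, is the bound of Sellke, Bayarri, and Berger (2001), which the Benjamin and Berger supplement builds on: for p < 1/e, the Bayes factor against the null can be no larger than 1/(−e·p·ln p).

```python
# The Sellke-Bayarri-Berger bound (2001): for p < 1/e, the Bayes factor in
# favor of the alternative is at most 1 / (-e * p * ln p). At p = 0.05 the
# ceiling is about 2.5 -- modest evidence by any conventional scale.
import math

for p in (0.05, 0.01, 0.005):
    bound = 1.0 / (-math.e * p * math.log(p))
    print(f"p = {p}: Bayes factor against H0 is at most {bound:.1f}")
```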

4. Other Approaches

In view of the prevalent misuses of and misconceptions concerning p-values, some statisticians prefer to supplement or even replace p-values with other approaches. These include methods that emphasize estimation over testing, such as confidence, credibility, or prediction intervals; Bayesian methods; alternative measures of evidence, such as likelihood ratios or Bayes Factors; and other approaches such as decision-theoretic modeling and false discovery rates. All these measures and approaches rely on further assumptions, but they may more directly address the size of an effect (and its associated uncertainty) or whether the hypothesis is correct.
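For instance, an estimation-oriented report might accompany or replace the p-value with an effect estimate and interval, as in this sketch (simulated data, approximate degrees of freedom; ours, not a prescription of the statement).

```python
# Illustrative only: report an effect estimate with a 95% confidence
# interval, conveying both the size of the effect and its uncertainty.
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)
a = rng.normal(0.0, 1.0, 40)
b = rng.normal(0.4, 1.0, 40)

diff = b.mean() - a.mean()
se = np.sqrt(a.var(ddof=1) / len(a) + b.var(ddof=1) / len(b))
t_crit = stats.t.ppf(0.975, df=len(a) + len(b) - 2)   # approximate df
print(f"estimated difference = {diff:.2f}, "
      f"95% CI = ({diff - t_crit * se:.2f}, {diff + t_crit * se:.2f})")
```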

5. Conclusion

Good statistical practice, as an essential component of good scientific practice, emphasizes principles of good study design and conduct, a variety of numerical and graphical summaries of data, understanding of the phenomenon under study, interpretation of results in context, complete reporting and proper logical and quantitative understanding of what data summaries mean. No single index should substitute for scientific reasoning.

Edited by Ronald L. Wasserstein, Executive Director

On behalf of the American Statistical Association

Board of Directors


Acknowledgments

The ASA Board of Directors thanks the following people for sharing their expertise and perspectives during the development of the statement. The statement does not necessarily reflect the viewpoint of all these people, and in fact some have views that are in opposition to all or part of the statement. Nonetheless, we are deeply grateful for their contributions. Naomi Altman, Jim Berger, Yoav Benjamini, Don Berry, Brad Carlin, John Carlin, George Cobb, Marie Davidian, Steve Fienberg, Andrew Gelman, Steve Goodman, Sander Greenland, Guido Imbens, John Ioannidis, Valen Johnson, Michael Lavine, Michael Lew, Rod Little, Deborah Mayo, Chuck McCulloch, Michele Millar, Sally Morton, Regina Nuzzo, Hilary Parker, Kenneth Rothman, Don Rubin, Stephen Senn, Uri Simonsohn, Dalene Stangl, Philip Stark, Steve Ziliak.

A Brief p-Values and Statistical Significance Reference List

  • Altman, D.G., and Bland, J.M. (1995), “Absence of Evidence is not Evidence of Absence,” British Medical Journal, 311, 485.
  • Altman, D.G., Machin, D., Bryant, T.N., and Gardner, M.J. (eds.) (2000), Statistics with Confidence (2nd ed.), London: BMJ Books.
  • Berger, J.O., and Delampady, M. (1987), “Testing Precise Hypotheses,” Statistical Science, 2, 317–335.
  • Berry, D. (2012), “Multiplicities in Cancer Research: Ubiquitous and Necessary Evils,” Journal of the National Cancer Institute, 104, 1124–1132.
  • Christensen, R. (2005), “Testing Fisher, Neyman, Pearson, and Bayes,” The American Statistician, 59, 121–126.
  • Cox, D.R. (1982), “Statistical Significance Tests,” British Journal of Clinical Pharmacology, 14, 325–331.
  • Edwards, W., Lindman, H., and Savage, L.J. (1963), “Bayesian Statistical Inference for Psychological Research,” Psychological Review, 70, 193–242.
  • Gelman, A., and Loken, E. (2014), “The Statistical Crisis in Science [online],” American Scientist, 102. Available at http://www.americanscientist.org/issues/feature/2014/6/the-statistical-crisis-in-science
  • Gelman, A., and Stern, H.S. (2006), “The Difference Between ‘Significant’ and ‘Not Significant’ is not Itself Statistically Significant,” The American Statistician, 60, 328–331.
  • Gigerenzer, G. (2004), “Mindless Statistics,” The Journal of Socio-Economics, 33, 567–606.
  • Goodman, S.N. (1999a), “Toward Evidence-Based Medical Statistics 1: The P-Value Fallacy,” Annals of Internal Medicine, 130, 995–1004.
  • ——— (1999b), “Toward Evidence-Based Medical Statistics. 2: The Bayes Factor,” Annals of Internal Medicine, 130, 1005–1013.
  • ——— (2008), “A Dirty Dozen: Twelve P-Value Misconceptions,” Seminars in Hematology, 45, 135–140.
  • Greenland, S. (2011), “Null Misinterpretation in Statistical Testing and its Impact on Health Risk Assessment,” Preventive Medicine, 53, 225–228.
  • ——— (2012), “Nonsignificance Plus High Power Does Not Imply Support for the Null Over the Alternative,” Annals of Epidemiology, 22, 364–368.
  • Greenland, S., and Poole, C. (2011), “Problems in Common Interpretations of Statistics in Scientific Articles, Expert Reports, and Testimony,” Jurimetrics, 51, 113–129.
  • Hoenig, J.M., and Heisey, D.M. (2001), “The Abuse of Power: The Pervasive Fallacy of Power Calculations for Data Analysis,” The American Statistician, 55, 19–24.
  • Ioannidis, J.P. (2005), “Contradicted and Initially Stronger Effects in Highly Cited Clinical Research,” Journal of the American Medical Association, 294, 218–228.
  • ——— (2008), “Why Most Discovered True Associations are Inflated” (with discussion), Epidemiology, 19, 640–658.
  • Johnson, V.E. (2013), “Revised Standards for Statistical Evidence,” Proceedings of the National Academy of Sciences, 110(48), 19313–19317.
  • ——— (2013), “Uniformly Most Powerful Bayesian Tests,” Annals of Statistics, 41, 1716–1741.
  • Lang, J., Rothman, K.J., and Cann, C.I. (1998), “That Confounded P-value” (editorial), Epidemiology, 9, 7–8.
  • Lavine, M. (1999), “What is Bayesian Statistics and Why Everything Else is Wrong,” UMAP Journal, 20, 2.
  • Lew, M.J. (2012), “Bad Statistical Practice in Pharmacology (and Other Basic Biomedical Disciplines): You Probably Don't Know P,” British Journal of Pharmacology, 166, 5, 1559–1567.
  • Phillips, C.V. (2004), “Publication Bias In Situ,” BMC Medical Research Methodology, 4, 20.
  • Poole, C. (1987), “Beyond the Confidence Interval,” American Journal of Public Health, 77, 195–199.
  • ——— (2001), “Low P-values or Narrow Confidence Intervals: Which are More Durable?” Epidemiology, 12, 291–294.
  • Rothman, K.J. (1978), “A Show of Confidence” (editorial), New England Journal of Medicine, 299, 1362–1363.
  • ——— (1986), “Significance Questing” (editorial), Annals of Internal Medicine, 105, 445–447.
  • ——— (2010), “Curbing Type I and Type II Errors,” European Journal of Epidemiology, 25, 223–224.
  • Rothman, K.J., Weiss, N.S., Robins, J., Neutra, R., and Stellman, S. (1992), “Amicus Curiae Brief for the U. S. Supreme Court, Daubert v. Merrell Dow Pharmaceuticals, Petition for Writ of Certiorari to the United States Court of Appeals for the Ninth Circuit,” No. 92-102, October Term, 1992.
  • Rozeboom, W.M. (1960), “The Fallacy of the Null-Hypothesis Significance Test,” Psychological Bulletin, 57, 416–428.
  • Schervish, M.J. (1996), “P-Values: What They Are and What They Are Not,” The American Statistician, 50, 203–206.
  • Simmons, J.P., Nelson, L.D., and Simonsohn, U. (2011), “False-Positive Psychology: Undisclosed Flexibility in Data Collection and Analysis Allows Presenting Anything as Significant,” Psychological Science, 22, 1359–1366.
  • Stang, A., and Rothman, K.J. (2011), “That Confounded P-value Revisited,” Journal of Clinical Epidemiology, 64, 1047–1048.
  • Stang, A., Poole, C., and Kuss, O. (2010), “The Ongoing Tyranny of Statistical Significance Testing in Biomedical Research,” European Journal of Epidemiology, 25, 225–230.
  • Sterne, J. A. C. (2002). “Teaching Hypothesis Tests—Time for Significant Change?” Statistics in Medicine, 21, 985–994.
  • Sterne, J. A. C., and Smith, G. D. (2001), “Sifting the Evidence—What's Wrong with Significance Tests?” British Medical Journal, 322, 226–231.
  • Ziliak, S.T. (2010), “The Validus Medicus and a New Gold Standard,” The Lancet, 376, 9738, 324–325.
  • Ziliak, S.T., and McCloskey, D.N. (2008), The Cult of Statistical Significance: How the Standard Error Costs Us Jobs, Justice, and Lives, Ann Arbor, MI: University of Michigan Press.
