Abstract
Statistical conclusion validity refers to the degree to which we draw correct statistical inferences from the analysis of data. The conventional use of null hypothesis significance tests to evaluate scientific hypotheses, coupled with the editorial practices of scholarly journals, leads to predictable rates of error. The net result is “mixed support” in the literature for most hypotheses, regardless of their merit. This article outlines the mechanisms at play and offers helpful solutions.
An earlier version of this article was presented at the International Communication Association, 2010, Singapore.
Notes
Any significance test can be wrong in two ways. A Type 1 error is rejecting the null when the null is true (i.e., a result that is significant but should not be), and a Type 2 error is failing to reject a false null (i.e., a nonsignificant result that should have been significant). Type 1 errors are a function of the alpha level, and Type 2 errors are a function of statistical power. Interested readers are referred to Weber (2007) for guidance on when to correct for Type 1 error; to O'Keefe (2007) for a brief primer on power; and to Smith, Levine, Lachlan, and Fediuk (2002) for a discussion of how both types of error rates propagate over a number of tests.
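To make the compounding of Type 1 errors concrete, the following sketch (ours, not drawn from the cited sources) computes the familywise probability of at least one false rejection over m independent tests of true nulls; the choice of alpha = .05 and the particular values of m are illustrative assumptions:

```python
# Illustrative sketch: how the Type 1 error rate compounds across
# m independent significance tests of true nulls, each at alpha = .05.

def familywise_error(m: int, alpha: float = 0.05) -> float:
    """Probability of at least one Type 1 error over m independent
    tests of true null hypotheses, each conducted at level alpha."""
    return 1 - (1 - alpha) ** m

for m in (1, 5, 10, 20):
    print(m, round(familywise_error(m), 3))
```

With 20 independent tests of true nulls at alpha = .05, the chance of at least one spurious significant result is about .64, which is why corrections for multiple testing matter.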
Power analyses let us determine the probability of a Type 2 error for a given significance test, given the sample size, the alpha level, and some presumed nonzero effect size. The binomial distribution then describes how errors are distributed over a series of tests, provided the tests are independent and the probability of obtaining a significant result (i.e., power) is constant across tests.
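The binomial machinery described above can be sketched in a few lines of Python. The scenario below is hypothetical and chosen purely for illustration: 10 independent tests of a true effect, each with power of .50, and a "mixed support" outcome defined as between 3 and 7 significant results:

```python
from math import comb

def binom_pmf(k: int, n: int, p: float) -> float:
    """Probability of exactly k significant results in n independent
    tests when each test has constant power p."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

# Hypothetical scenario: 10 independent tests of a true effect,
# each with power = .50.
n, power = 10, 0.50

# Probability that between 3 and 7 of the 10 tests come out
# significant -- i.e., the literature shows "mixed support."
p_mixed = sum(binom_pmf(k, n, power) for k in range(3, 8))
print(round(p_mixed, 3))  # 0.891
```

Under these assumed numbers, roughly 89% of the time the published record would show a mix of significant and nonsignificant results, consistent with the article's thesis that "mixed support" is the expected outcome under typical power levels.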