Abstract
Researchers rarely report statistical power in teaching activity studies published in Teaching of Psychology. Underpowered tests create uncertainty in the decision to retain or reject the null hypothesis and cloud the interpretation of results. We analyzed the a priori power of statistical tests from 197 teaching activity effectiveness studies published from 1974 through 2006. Two-thirds of the studies had sufficient power to detect only large effects. Comparing observed sample sizes with expert recommendations, we found that studies typically used sample sizes that were too small. We discuss the limitations of underpowered statistical tests for evaluating teaching activity effectiveness and offer design recommendations for improving power.