Abstract
The Extended Bootstrap (EB) assessment approach was developed to examine the relationships among Type I error, power, sample size (n), and effect size (ES) for statistical tests of ecological data. The EB approach was applied to univariate and multivariate statistical analyses of a large data set collected from an ongoing, multiple stressor bioassessment study of watersheds in the Central Valley, San Francisco, and Central Coast areas of California. Benthic metrics were created that either increased or decreased monotonically with stress (toxicants or metrics indicative of habitat quality). Type I errors were stable for all statistical tests that were evaluated. The relationships between n and ES displayed patterns of “diminishing returns” for all statistical tests; i.e., progressively larger n was required to detect progressively smaller ES. Nonetheless, the n’s collected across the watersheds and within a selected watershed were sufficient to detect even small correlations between representative benthic metrics and potential stressors with high power. The power and robustness of a novel method using EB and previously described statistical techniques designed to address multicollinearity were shown to approach those of simpler univariate regressions. Potential applications of the EB approach for experimental design, data assessment and interpretation, and hypothesis testing are discussed.
Acknowledgements
The authors thank the Pyrethroid Working Group for sponsoring this research. We also thank the California Department of Fish and Wildlife for identification of benthic species and calculation of benthic metrics, Eurofins EAG Agroscience LLC for pyrethroid analysis, and Alpha Analytical Laboratories for TOC, grain size, and metals analysis. William Killen and Ronald Anderson from the University of Maryland Wye Research and Education Center are acknowledged for collection of field samples.
Notes
1 X may not need to be bootstrapped for univariate regressions, since the R*Y|X would be randomly paired with X anyway. However, bootstrapping X may be useful in multivariate analyses such as CCA: because the canonical variates are not necessarily orthogonal, bootstrapping each independent variable may be desirable to create independence between them in the null hypothesis state. O’Gorman[49] suggested that additional independent variables (Z) beyond the independent variable of interest (X) should be independently shuffled (“Z-shuffle” or “Z-permute”) in resampling for multiple regression analysis, particularly if outliers are involved and weighted regressions are employed in hypothesis testing. Thus, the bootstrapping of independent variables was included in the EB process, since it does little to increase total processing time.
2 All analyses were conducted on a standard “off the shelf” laptop computer, on which an EB assessment of CCAs of the large number of biological metrics and environmental variables would produce memory and storage issues. Thus, PCA was used for data reduction in these demonstration assessments. EB assessments of the CCAs of the original variables may be conducted on computers with greater memory and storage capacities.
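The idea in Note 1 of resampling variables independently to create a null hypothesis state can be illustrated with a minimal sketch. This is not the authors' EB procedure: the simulated data, the choice of the Pearson correlation as the test statistic, and all function names (`pearson_r`, `null_abs_r`, `critical_value`) are assumptions made for illustration only. The sketch bootstraps X and Y independently, builds a null distribution of |r| for a chosen n, and then checks how often fresh null resamples exceed the resulting critical value, which gives a rough empirical Type I error rate.

```python
import random


def pearson_r(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # degenerate resample (no variance)
    return sxy / (sxx * syy) ** 0.5


def null_abs_r(x, y, n, n_boot, rng):
    """|r| from independently bootstrapped X and Y samples of size n.

    Resampling X and Y separately breaks their pairing, so the
    resulting values approximate the null (no-association) state.
    """
    values = []
    for _ in range(n_boot):
        xs = rng.choices(x, k=n)  # bootstrap X with replacement
        ys = rng.choices(y, k=n)  # bootstrap Y independently of X
        values.append(abs(pearson_r(xs, ys)))
    return values


def critical_value(null_values, alpha=0.05):
    """Empirical (1 - alpha) quantile of the null distribution."""
    s = sorted(null_values)
    idx = min(len(s) - 1, int((1 - alpha) * len(s)))
    return s[idx]


# Hypothetical data: Y depends on X, but the null resampling ignores that.
rng = random.Random(0)
x = [rng.gauss(0, 1) for _ in range(200)]
y = [0.5 * a + rng.gauss(0, 1) for a in x]

null = null_abs_r(x, y, n=30, n_boot=2000, rng=rng)
crit = critical_value(null, alpha=0.05)

# Empirical Type I error: share of fresh null resamples exceeding crit.
fresh = null_abs_r(x, y, n=30, n_boot=2000, rng=rng)
type1 = sum(v > crit for v in fresh) / len(fresh)
print(f"critical |r| = {crit:.3f}, empirical Type I error = {type1:.3f}")
```

Repeating the same loop over a range of n values, and over paired (rather than independent) resamples, would be one way to trace how the critical value and detection rate change with sample size.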