Original Articles

Sample Size Planning for the Squared Multiple Correlation Coefficient: Accuracy in Parameter Estimation via Narrow Confidence Intervals

Pages 524-555 | Published online: 19 Dec 2008
 

Abstract

Methods of sample size planning are developed from the accuracy in parameter estimation (AIPE) approach in the multiple regression context in order to obtain a sufficiently narrow confidence interval for the population squared multiple correlation coefficient when regressors are random. Approximate and exact methods are developed that provide the necessary sample size so that the expected width of the confidence interval will be sufficiently narrow. Modifications of these methods are then developed so that the necessary sample size will lead to sufficiently narrow confidence intervals with no less than some desired degree of assurance. Computer routines have been developed and are included within the MBESS R package so that the methods discussed in the article can be implemented. The methods and computer routines are demonstrated using an empirical example linking innovation in the health services industry with previous innovation, personality factors, and group climate characteristics.

Notes

1 The use of the term “accuracy” in this context is the same as that used by Neyman (1937) in his seminal work on the theory of confidence interval construction: “The accuracy of estimation corresponding to a fixed value of 1−α may be measured by the length of the confidence interval” (p. 358; notation changed to reflect current usage).

2 The term “regressors” is used as a generic term to denote the K X variables. In other contexts the regressors are termed independent, explanatory, predictor, or concomitant variables. The term “criterion” is used as a generic term for the variable that is modeled as a function of the K regressors. In other contexts, the criterion variable is termed dependent, outcome, or predicted variable.

3 Tables of sample size comparisons between the AIPE and Algina and Olejnik (2000) methods have been developed and are available from Ken Kelley.

4 In the work of Kelley and Rausch (2006), where the AIPE approach was developed for the standardized mean difference, the population value of the standardized mean difference was used throughout the sample size procedures rather than the expected value of the sample standardized mean difference, even though the commonly used estimate is biased. This contrasts with the approach here, where E[R²] is used in place of P². As noted in Kelley and Rausch (2006, footnote 13), for even relatively small sample sizes the bias in the ordinary estimate of the standardized mean difference is minimal and leads to essentially no differences in planned sample size except in unrealistic situations (see also Hedges & Olkin, 1985, chap. 5). However, the bias between R² and P² can be large, relatively speaking, which would lead to differences in necessary sample sizes if the procedure were based on P² directly, as confidence intervals are based on the positively biased value of R².
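
To illustrate how large the bias mentioned in Note 4 can be, the following is a minimal Python sketch, not the MBESS implementation. It assumes the standard result from multivariate normal theory for random regressors, E[R²] = 1 − [(N − K − 1)/(N − 1)](1 − P²) ₂F₁(1, 1; (N + 1)/2; P²), which is not stated in the text above. The function names `hyp2f1_one_one` and `expected_r2` and the example values of N, K, and P² are hypothetical and chosen only for illustration.

```python
def hyp2f1_one_one(c, z, tol=1e-12, max_terms=10_000):
    """Power series for 2F1(1, 1; c; z) = sum_{n>=0} n! / (c)_n * z**n, for |z| < 1."""
    term = 1.0
    total = 1.0
    for n in range(max_terms):
        # Ratio of consecutive series terms is (n + 1) * z / (c + n).
        term *= (n + 1) * z / (c + n)
        total += term
        if abs(term) < tol:
            break
    return total


def expected_r2(pop_r2, n, k):
    """Expected sample R^2 under multivariate normality (assumed formula, see lead-in)."""
    if not 0.0 <= pop_r2 < 1.0:
        raise ValueError("pop_r2 must lie in [0, 1)")
    if n <= k + 1:
        raise ValueError("n must exceed k + 1")
    shrink = (n - k - 1) / (n - 1)
    return 1.0 - shrink * (1.0 - pop_r2) * hyp2f1_one_one((n + 1) / 2, pop_r2)


# Hypothetical illustration: K = 5 regressors, P^2 = 0.30, several sample sizes.
for n in (20, 50, 200):
    e_r2 = expected_r2(0.30, n, 5)
    print(f"N={n:4d}  E[R^2]={e_r2:.4f}  bias={e_r2 - 0.30:.4f}")
```

Under this assumed formula the bias shrinks as N grows, which is consistent with the note's point that planning from P² directly and planning from E[R²] can give different sample sizes when N is modest.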

5 Actually, it is desirable to first use a much smaller number of replications (e.g., 1,000) to home in on the necessary sample size. After an approximate sample size is determined in this manner, a large number of replications (e.g., 10,000) is then used to find the exact value.
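
The two-stage strategy in Note 5 can be sketched in Python as follows. This is an illustrative sketch only: it swaps in a much simpler statistic (the width of a normal-theory confidence interval for a mean) as a stand-in for the confidence interval for P², and it uses far fewer replications than the note suggests so that it runs quickly. All function names, the step size, and the target width are hypothetical.

```python
import math
import random


def ci_width_for_mean(n, rng, z=1.96):
    """Width of a normal-theory CI for a mean from one simulated N(0, 1) sample.

    Stand-in statistic only; the article's interval is for P^2, not a mean.
    """
    sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
    mean = sum(sample) / n
    sd = math.sqrt(sum((x - mean) ** 2 for x in sample) / (n - 1))
    return 2.0 * z * sd / math.sqrt(n)


def mean_width(n, reps, rng):
    """Monte Carlo estimate of the expected CI width at sample size n."""
    return sum(ci_width_for_mean(n, rng) for _ in range(reps)) / reps


def two_stage_sample_size(target_width, rng, n_start=10, coarse_step=5,
                          coarse_reps=200, fine_reps=2000):
    # Stage 1: few replications and large steps to home in on the region.
    n = n_start
    while mean_width(n, coarse_reps, rng) > target_width:
        n += coarse_step
    # Stage 2: back up one coarse step, then advance one unit at a time
    # with many replications to refine the answer.
    n = max(n_start, n - coarse_step)
    while mean_width(n, fine_reps, rng) > target_width:
        n += 1
    return n


rng = random.Random(2008)  # fixed seed so the run is reproducible
planned_n = two_stage_sample_size(0.5, rng)
print("planned sample size:", planned_n)
```

The cheap first stage avoids spending many replications on sample sizes that are far from the answer; only the final few candidate values are evaluated with the larger replication count.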

6 Similar issues also arise and have been reported in the context of AIPE for the standardized mean difference, where a small increase in N can have a large impact on γ_E (elaboration on this issue is given in Kelley & Rausch, 2006). Although not often discussed, issues similar to those discussed here occur in power analysis, where increasing sample size by whole numbers almost always leads to an empirical power greater than the desired power.

7 R and MBESS are both available from the Comprehensive R Archive Network (CRAN) at www.cran.r-project.com and http://cran.r-project.org/src/contrib/Descriptions/MBESS.html, respectively. On Macintosh and Windows systems, MBESS can be installed from within R directly using the Package Installation feature, which connects to CRAN, where the software is housed. Source code for Macintosh, Windows, and Linux/Unix is available on CRAN.

8 A sensitivity analysis is often beneficial, where a variety of values for parameters are used to assess the effect of misspecifying the parameters on the desired outcome (in this case the confidence interval width). The function ss.aipe.R2.sensitivity() from the MBESS R package can be used to assess the effects of misspecified parameter values in a variety of situations.
