Measurement, Statistics, and Research Design

Number of Response Categories and Sample Size Requirements in Polytomous IRT Models


Abstract

Applications of polytomous IRT models in applied fields (e.g., health, education, psychology) abound. However, little is known about how the number of response categories and the sample size affect the precision of parameter recovery. In a simulation study, we investigated the impact of the number of response categories and the sample size required for successful parameter recovery, and we also examined model fit. Our results suggested that sample sizes of approximately 250 or larger are typically sufficient for reasonable recovery of either person or item parameters (in particular, discrimination). Further, in many cases, conditions with 10 items yielded more favorable results than conditions with only 5 items. With respect to the number of response categories, results were inconsistent, suggesting that recommendations for selecting the number of categories when developing scales are more complex. We discuss limitations of the current study, identify future directions, and provide recommendations for applied researchers who engage in scale development.

Notes

1 For example, Muis et al. (2009) and Cordier et al. (2019) used sample sizes of 217 and 342, respectively, in applications of polytomous IRT models, while Cole et al. (2019) used 333.

2 This was particularly true when missing data were present.

3 Developments in FA methods now allow for the inclusion and modeling of ordered and categorical variables.

4 This two-step estimation process is why the GRM is sometimes called an indirect model (e.g., Embretson & Reise, 2000) or a cumulative model (Penfield, 2014).

5 Note that “1” here represents P(X ≥ 0), as it corresponds to the probability of responding in the lowest category or above.

6 Note that “0” here represents P(X_ij > 3 | a_j, b_j, θ).

7 We did not examine the item parameter estimates under model-misspecification conditions because they are not directly comparable. Item parameters are modeled differently in the GRM (a cumulative model) and the GPCM (an adjacent-category model); see Embretson and Reise (2000) and Penfield (2014) for more details on the differences between polytomous models.

8 These results can be found in Appendix B.

9 Out of 768 conditions, only 28 had imperfect convergence; the lowest convergence rate (.91) was observed in the condition with N = 100 and K = 3, in which the GRM was fit to GRM-generated data. See Appendix B2 as well as the tabulated convergence rates at https://doi.org/10.6084/m9.figshare.21667547
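
Notes 4 to 7 contrast the cumulative (GRM) and adjacent-category (GPCM) formulations. The following Python sketch, which is not taken from the article, may help make that contrast concrete. It assumes the standard logistic forms of both models without a scaling constant; the function names and the parameter values in the usage lines are hypothetical and are not the simulation conditions of the study.

```python
import math


def grm_category_probs(theta, a, thresholds):
    """Category probabilities for one item under the graded response model.

    theta: person parameter; a: item discrimination (assumed logistic form);
    thresholds: ordered b_1 < ... < b_{K-1} for categories 0..K-1.
    """
    # Step 1: cumulative probabilities P(X >= k). The first entry is fixed
    # at 1 (P(X >= 0)) and the last at 0 (no response above the top category).
    cumulative = [1.0]
    cumulative += [1.0 / (1.0 + math.exp(-a * (theta - b))) for b in thresholds]
    cumulative.append(0.0)
    # Step 2: category probabilities are differences of adjacent cumulatives.
    return [cumulative[k] - cumulative[k + 1] for k in range(len(thresholds) + 1)]


def gpcm_category_probs(theta, a, steps):
    """Category probabilities under the generalized partial credit model.

    steps: step parameters b_1..b_{K-1}; category 0 has an empty sum.
    """
    # Running sums of a * (theta - b_v) give the log-numerators directly,
    # so no intermediate cumulative probabilities are needed.
    logits = [0.0]
    for b in steps:
        logits.append(logits[-1] + a * (theta - b))
    largest = max(logits)  # subtract the maximum for numerical stability
    weights = [math.exp(z - largest) for z in logits]
    total = sum(weights)
    return [w / total for w in weights]


# Illustrative values only (hypothetical, four categories scored 0..3).
grm = grm_category_probs(theta=0.0, a=1.5, thresholds=[-1.0, 0.0, 1.0])
gpcm = gpcm_category_probs(theta=0.0, a=1.5, steps=[-1.0, 0.0, 1.0])
print([round(p, 3) for p in grm])
print([round(p, 3) for p in gpcm])
```

With identical numeric inputs the two functions return different category probabilities, which is one way to see why item parameter estimates from the two models are not directly comparable.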
