Use of Generalizability Theory Within K–12 School-Based Assessment: A Critical Review and Analysis of the Empirical Literature

References

  • American Educational Research Association. (2006). Standards for reporting on empirical social science research in AERA publications: American Educational Research Association. Educational Researcher, 35, 33–40. doi:10.3102/0013189X035006033
  • APA Publications and Communications Board Working Group on Journal Article Reporting Standards. (2008). Reporting standards for research in psychology: Why do we need them? What might they be? American Psychologist, 63, 839–851. doi:10.1037/0003-066X.63.9.839
  • *Baker, E. L., Abedi, J., Linn, R. L., & Niemi, D. (1996). Dimensionality and generalizability of domain-independent performance assessments. The Journal of Educational Research, 89, 197–205. doi:10.1080/00220671.1996.9941205
  • *Baxter, G. P., Shavelson, R. J., Herman, S. J., Brown, K. A., & Valdez, J. R. (1993). Mathematics performance assessment: Technical quality and diverse student impact. Journal for Research in Mathematics Education, 24, 190–216. doi:10.2307/749344
  • *Bergeron, R., Floyd, R. G., McCormack, A. C., & Farmer, W. L. (2008). The generalizability of externalizing behavior composites and subscale scores across time, rater, and instrument. School Psychology Review, 37, 91–108.
  • Brennan, R. L. (2000). (Mis)conceptions about generalizability theory. Educational Measurement: Issues and Practice, 19, 5–10. doi:10.1111/j.1745-3992.2000.tb00017.x
  • Brennan, R. L. (2001). Generalizability theory. New York, NY: Springer-Verlag.
  • *Briesch, A. M., Chafouleas, S. M., & Riley-Tillman, T. C. (2010). Generalizability and dependability of behavioral assessment methods: A comparison of systematic direct observation and direct behavior rating. School Psychology Review, 39, 408–421.
  • Briesch, A. M., Swaminathan, H., Welsh, M., & Chafouleas, S. M. (2014). Generalizability theory: A practical guide to study design, implementation, and interpretation. Journal of School Psychology, 52, 13–35. doi:10.1016/j.jsp.2013.11.008
  • *Brown-Chidsey, R., Davis, L., & Maya, C. (2003). Sources of variance in curriculum-based measures of silent reading. Psychology in the Schools, 40, 363–377. doi:10.1002/(ISSN)1520-6807
  • *Bruckner, C. T., Yoder, P. J., & McWilliam, R. A. (2006). Generalizability and decision studies: An example using conversational language samples. Journal of Early Intervention, 28, 139–153. doi:10.1177/105381510602800205
  • Cardinet, J., Tourneur, Y., & Allal, L. (1976). The symmetry of generalizability theory: Applications to educational measurement. Journal of Educational Measurement, 13, 119–135. doi:10.1111/j.1745-3984.1976.tb00003.x
  • *Chafouleas, S. M., Briesch, A. M., Riley-Tillman, T. C., Christ, T. J., Black, A., & Kilgus, S. P. (2010). An investigation of the generalizability and dependability of Direct Behavior Rating Single Item Scales (DBR-SIS) to measure academic engagement and disruptive behavior of middle school students. Journal of School Psychology, 48, 219–246. doi:10.1016/j.jsp.2010.02.001
  • *Chafouleas, S. M., Christ, T. J., Riley-Tillman, T. C., Briesch, A. M., & Chanese, J. A. M. (2007). Generalizability and dependability of direct behavior ratings to assess social behavior of preschoolers. School Psychology Review, 36, 63–79.
  • Chafouleas, S. M., Volpe, R. J., Gresham, F. M., & Cook, C. (2010). School-based behavioral assessment within problem-solving models: Current status and future directions. School Psychology Review, 39, 343–349.
  • *Christ, T. J., & Ardoin, S. P. (2009). Curriculum-based measurement of oral reading: Passage equivalence and probe-set development. Journal of School Psychology, 47, 55–75. doi:10.1016/j.jsp.2008.09.004
  • *Christ, T. J., Johnson-Gros, K. N., & Hintze, J. M. (2005). An examination of alternate assessment durations when assessing multiple-skill computational fluency: The generalizability and dependability of curriculum-based outcomes within the context of educational decisions. Psychology in the Schools, 42, 615–622. doi:10.1002/(ISSN)1520-6807
  • *Christ, T. J., & Vining, O. (2006). Curriculum-based measurement procedures to develop multiple-skill mathematics computation probes: Evaluation of random and stratified stimulus-set arrangements. School Psychology Review, 35, 387–400.
  • *Conger, A. J., Conger, J. C., Wallander, J., Ward, D., & Dygdon, J. (1983). A generalizability study of the Conners’ Teacher Rating Scale-Revised. Educational and Psychological Measurement, 43, 1019–1031. doi:10.1177/001316448304300410
  • Cronbach, L. J., Gleser, G. C., Nanda, H., & Rajaratnam, N. (1972). The dependability of behavioral measurements. New York, NY: Wiley.
  • Cronbach, L. J., Rajaratnam, N., & Gleser, G. C. (1963). Theory of generalizability: A liberalization of reliability theory. British Journal of Mathematical and Statistical Psychology, 16, 137–163. doi:10.1111/j.2044-8317.1963.tb00206.x
  • *Crowley, S. L., Thompson, B., & Worchel, F. (1994). Validity studies. The Children’s Depression Inventory: A comparison of generalizability and classical test theory analyses. Educational and Psychological Measurement, 54, 705–713. doi:10.1177/0013164494054003017
  • Dedrick, R. F., Ferron, J. M., Hess, M. R., Hogarty, K. Y., Kromrey, J. D., Lang, T. R., … Lee, R. S. (2009). Multilevel modeling: A review of methodological issues and applications. Review of Educational Research, 79, 69–102. doi:10.3102/0034654308325581
  • *Fawson, P. C., Reutzel, D. R., Smith, J. A., Ludlow, B. C., & Sudweeks, R. (2006). Examining the reliability of running records: Attaining generalizable results. The Journal of Educational Research, 100, 113–126. doi:10.3200/JOER.100.2.113-126
  • *Gao, X., Shavelson, R. J., & Baxter, G. P. (1994). Generalizability of large-scale performance assessments in science: Promises and problems. Applied Measurement in Education, 7, 323–342. doi:10.1207/s15324818ame0704_4
  • *Gierl, M. (1998). Generalizability of written-response scores for the Alberta Education English 30 Diploma Examination. The Alberta Journal of Educational Research, 44, 94–97.
  • Hartmann, D. P. (1982). Assessing the dependability of observational data. In D. P. Hartmann (Ed.), Using observers to study behavior (pp. 51–65). San Francisco, CA: Jossey-Bass.
  • Hendrickson, A., & Yin, P. (2010). Generalizability theory. In G. R. Hancock & R. O. Mueller (Eds.), The reviewer’s guide to quantitative methods in the social sciences (pp. 115–122). New York, NY: Routledge.
  • *Hintze, J. M., Christ, T. J., & Keller, L. A. (2002). The generalizability of CBM survey-level mathematics assessments: Just how many samples do we need? School Psychology Review, 31, 514–528.
  • *Hintze, J. M., & Matthews, W. J. (2004). The generalizability of systematic direct observations across time and setting: A preliminary investigation of the psychometrics of behavioral observation. School Psychology Review, 33, 258–270.
  • *Hintze, J. M., Owen, S. V., Shapiro, E. S., & Daly, E. J. (2000). Generalizability of oral reading fluency measures: Application of G theory to curriculum-based measurement. School Psychology Quarterly, 15, 52–68. doi:10.1037/h0088778
  • *Hintze, J. M., & Pelle Pettite, H. A. (2001). The generalizability of CBM oral reading fluency measures across general and special education. Journal of Psychoeducational Assessment, 19, 158–170. doi:10.1177/073428290101900205
  • *Huang, J. (2008). How accurate are ESL students’ holistic writing scores on large-scale assessments? A generalizability theory approach. Assessing Writing, 13, 201–218. doi:10.1016/j.asw.2008.10.002
  • *Kan, A. (2007). Effects of using a scoring guide on essay scores: Generalizability theory. Perceptual and Motor Skills, 105, 891–905.
  • Kane, M. (2002). Inferences about variance components and reliability-generalizability coefficients in the absence of random sampling. Journal of Educational Measurement, 39, 165–181. doi:10.1111/jedm.2002.39.issue-2
  • *Klein, S. P., McCaffrey, D., Stecher, B., & Koretz, D. (1995). The reliability of mathematics portfolio scores: Lessons from the Vermont experience. Applied Measurement in Education, 8, 243–260. doi:10.1207/s15324818ame0803_4
  • *Lane, S., Liu, M., Ankenmann, R. D., & Stone, C. A. (1996). Generalizability and validity of a mathematics performance assessment. Journal of Educational Measurement, 33, 71–92. doi:10.1111/j.1745-3984.1996.tb00480.x
  • *Lane, S., & Sabers, D. (1989). Use of generalizability theory for estimating the dependability of a scoring system for sample essays. Applied Measurement in Education, 2, 195–205. doi:10.1207/s15324818ame0203_1
  • *Lee, G. (2002). The influence of several factors on reliability for complex reading comprehension tests. Journal of Educational Measurement, 39, 149–164. doi:10.1111/jedm.2002.39.issue-2
  • *Lomax, R. G. (1982). An application of generalizability theory to observational research. The Journal of Experimental Education, 51, 22–30. doi:10.1080/00220973.1982.11011835
  • Marcoulides, G. A. (1989). The application of generalizability analysis to observational studies. Quality & Quantity, 23, 115–127. doi:10.1007/BF00151898
  • Marcoulides, G. A. (1999). Generalizability theory: Picking up where Rasch IRT leaves off? In S. E. Embretson & S. L. Hershberger (Eds.), The new rules of measurement: What every psychologist and educator should know (pp. 129–152). Mahwah, NJ: Erlbaum.
  • *Martinez, J. F., Goldschmidt, P., Niemi, D., Baker, E. L., & Sylvester, R. M. (2007). Language arts performance assignments: Generalizability studies of local and central ratings. Educational Assessment, 12, 267–282.
  • *Marzano, R. J. (2002). A comparison of selected methods of scoring classroom assessments. Applied Measurement in Education, 15, 249–268. doi:10.1207/S15324818AME1503_2
  • *Mastergeorge, A. M., & Martinez, J. F. (2010). Rating performance assessments of students with disabilities: A study of reliability and bias. Journal of Psychoeducational Assessment, 28, 536–550. doi:10.1177/0734282909351022
  • *McBee, M. M., & Barnes, L. L. B. (1998). The generalizability of a performance assessment measuring achievement in eighth-grade mathematics. Applied Measurement in Education, 11, 179–194. doi:10.1207/s15324818ame1102_4
  • *McWilliam, R. A., & Ware, W. B. (1994). The reliability of observations of young children’s engagement: An application of generalizability theory. Journal of Early Intervention, 18, 34–47. doi:10.1177/105381519401800104
  • *Newton, X. A. (2010). Developing indicators of classroom practice to evaluate the impact of district mathematics reform initiative: A generalizability analysis. Studies in Educational Evaluation, 36, 1–13. doi:10.1016/j.stueduc.2010.10.002
  • *Novak, J. R., Herman, J. L., & Gearhart, M. (1996). Establishing validity for performance-based assessments: An illustration for collections of student writing. The Journal of Educational Research, 89, 220–233. doi:10.1080/00220671.1996.9941207
  • *Nussbaum, A. (1984). Multivariate generalizability theory in educational measurement: An empirical study. Applied Psychological Measurement, 8, 219–230. doi:10.1177/014662168400800211
  • Peugh, J. L., & Enders, C. K. (2004). Missing data in educational research: A review of reporting practices and suggestions for improvement. Review of Educational Research, 74, 525–556. doi:10.3102/00346543074004525
  • *Poncy, B. C., Skinner, C. H., & Axtell, P. K. (2005). An investigation of the reliability and standard error of measurement of words read correctly per minute using curriculum-based measurement. Journal of Psychoeducational Assessment, 23, 326–338. doi:10.1177/073428290502300403
  • *Ruiz-Primo, M. A., Baxter, G. P., & Shavelson, R. J. (1993). On the stability of performance assessments. Journal of Educational Measurement, 30, 41–53. doi:10.1111/jedm.1993.30.issue-1
  • Schreiber, J. B., Nora, A., Stage, F. K., Barlow, E. A., & King, J. (2006). Reporting structural equation modeling and confirmatory factor analysis results: A review. The Journal of Educational Research, 99, 323–338. doi:10.3200/JOER.99.6.323-338
  • *Shavelson, R. J., Baxter, G. P., & Gao, X. (1993). Sampling variability of performance assessments. Journal of Educational Measurement, 30, 215–232. doi:10.1111/jedm.1993.30.issue-3
  • Shavelson, R. J., Webb, N. M., & Rowley, G. L. (1989). Generalizability theory: New developments and novel applications. American Psychologist, 44, 922–932. doi:10.1037/0003-066X.44.6.922
  • Smith, P. L. (1981). Gaining accuracy in generalizability theory: Using multiple designs. Journal of Educational Measurement, 18, 147–154. doi:10.1111/jedm.1981.18.issue-3
  • *Solano-Flores, G., & Li, M. (2006). The use of generalizability theory in the testing of linguistic minorities. Educational Measurement: Issues and Practice, 25, 13–22. doi:10.1111/j.1745-3992.2006.00048.x
  • *Suen, H. K., Lu, C., Neisworth, J. T., & Bagnato, S. J. (1993). Measurement of team decision-making through generalizability theory. Journal of Psychoeducational Assessment, 11, 120–132. doi:10.1177/073428299301100202
  • *Sung, Y. T., Chang, K. E., Chang, T. H., & Yu, W. C. (2010). How many heads are better than one? The reliability and validity of teenagers’ self and peer assessments. Journal of Adolescence, 33, 135–145. doi:10.1016/j.adolescence.2009.04.004
  • *Swartz, C. W., Hooper, S. R., Montgomery, J. W., Wakely, M. B., De Kruif, R. E. L., Reed, M., … White, K. P. (1999). Using generalizability theory to estimate the reliability of writing scores derived from holistic and analytical scoring methods. Educational and Psychological Measurement, 59, 492–506. doi:10.1177/00131649921970008
  • *Tindal, G., Yovanoff, P., & Geller, J. P. (2010). Generalizability theory applied to reading assessments for students with significant cognitive disabilities. The Journal of Special Education, 44, 3–17. doi:10.1177/0022466908323008
  • Volpe, R. J., Briesch, A. M., & Gadow, K. D. (2011). The efficiency of behavior rating scales to assess inattentive-overactive and oppositional-defiant behaviors: Applying generalizability theory to streamline assessment. Journal of School Psychology, 49, 131–155. doi:10.1016/j.jsp.2010.09.005
  • *Volpe, R. J., McConaughy, S. H., & Hintze, J. M. (2009). Generalizability of classroom behavior problem and on-task scores from the direct observation form. School Psychology Review, 38, 382–401.
  • *Webb, N. M., Schlackman, J., & Sugrue, B. (2000). The dependability and interchangeability of assessment methods in science. Applied Measurement in Education, 13, 277–301. doi:10.1207/S15324818AME1303_4
