Abstract
Periodic assessment and scrutiny of the discipline's measurement practices, instruments, and research findings are necessary to provide clarity and direction by revealing what we know, how we know it, and where knowledge gaps exist. Reflective reviews have produced ample appraisals of the theory, research, and methods employed in instructional communication scholarship. The present effort is twofold: it describes and critically evaluates 21 instruments published after 2004 in journal outlets that traditionally feature instructional communication research. The reliability, validity, and usefulness of each measure are examined, measurement practices are critiqued, and problems are identified. The manuscript concludes by offering several recommendations for measurement in instructional communication research.
Notes
[1] We limited our selection and discussion of measures to those published after 2004 to focus on recently developed scales and to avoid repeating prior efforts that reviewed previously developed measures (see Kearney & Beatty, Citation1994; Rubin, Citation2009).
[2] The definitions and conceptualizations of reliability and validity referenced in this article are informed by the Standards for Educational and Psychological Testing (Citation2014), prepared by the American Educational Research Association (AERA), the American Psychological Association (APA), and the National Council on Measurement in Education (NCME). Three types of validity are identified: content, criterion-related, and construct validity. Content validity is concerned with representativeness: scale items are generated to represent the content domain of the construct of interest. Face validity, a subset of content validity, is a starting point for scale development; it relies on common agreement that, on its “face,” the measure appears to be a good translation of the construct. Criterion-related validity addresses prediction and outcomes, and involves assessing a measure against some external criterion. There are two common forms of criterion-related validity: predictive and concurrent validity. Predictive validity involves the future prediction of an outcome (i.e., criterion). Relatedly, concurrent validity is indicated when the criterion measure is obtained at the same time (i.e., concurrently) as the initial measurement of interest. Construct validity, the most important and most recent addition to measurement practice, links theory to measurement (Kerlinger & Lee, Citation2000): variables are deduced from theory and tested for expected relationships. If the measures perform in theoretically hypothesized ways, this constitutes a degree of construct validity and reflects on the theory, the measures constructed, and the method employed (Allen & Yen, Citation1979/2002). Construct validity takes four forms: convergent validity, discriminant validity, multitrait-multimethod validity, and factorial validity. Convergent validity addresses the degree to which theoretically related measures are statistically related to each other.
Discriminant validity (sometimes referred to as divergent validity) examines the degree to which theoretically unrelated measures are statistically unrelated. Multitrait-multimethod validity involves examining multiple traits, both related and unrelated, each measured by different methods. The resulting correlation matrix reveals relationships between the variables measured in different ways. The same trait should produce high correlations even when it is measured via different methods (i.e., convergent validity). Conversely, correlations between different and unrelated traits, measured via the same methods, should be low (i.e., discriminant validity). Measurement bias is suggested if correlations between different traits measured with the same method are higher than correlations for the same trait measured with different methods (Allen & Yen, Citation1979/2002). Factorial validity is established through factor analysis, a data-reduction technique that reveals interrelationships between and among scale items to produce meaningful and related factors.
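The multitrait-multimethod logic described above can be illustrated with a small simulation. The following sketch (in Python with NumPy; the traits, methods, and noise level are invented solely for illustration) builds two unrelated traits, each measured by two hypothetical methods, and then inspects the resulting correlation matrix for the convergent and discriminant patterns note [2] describes:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500  # hypothetical sample size

# Latent scores for two theoretically unrelated traits.
trait_a = rng.normal(size=n)
trait_b = rng.normal(size=n)

# Each trait measured by two different (hypothetical) methods;
# independent measurement noise is added for each method.
a_self_report = trait_a + 0.4 * rng.normal(size=n)
a_observer = trait_a + 0.4 * rng.normal(size=n)
b_self_report = trait_b + 0.4 * rng.normal(size=n)
b_observer = trait_b + 0.4 * rng.normal(size=n)

# MTMM correlation matrix: rows/columns are (trait, method) combinations.
measures = np.vstack([a_self_report, a_observer, b_self_report, b_observer])
r = np.corrcoef(measures)

# Convergent validity: same trait, different methods -> high correlation.
convergent = r[0, 1]
# Discriminant validity: different traits, same method -> low correlation.
discriminant = r[0, 2]

print(f"same trait, different methods: r = {convergent:.2f}")
print(f"different traits, same method: r = {discriminant:.2f}")
```

A discriminant correlation approaching or exceeding the convergent one would suggest the method, rather than the trait, is driving the scores — the measurement-bias pattern noted above.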
[3] Scale items crafted for single-use, without benefit of reliability and validity analysis, are not included in this review.
[4] Teachers’ verbal and written clarity can also be measured using Titsworth et al.’s (Citation2004) Clarity Behaviors Inventory. Although this measure appeared in a 2004 conference paper, it has gained significant scholarly traction in recent years and has been used in several studies (see Li, Mazer, & Ju, Citation2011; Mazer, Citation2013a, Citation2013b; Mazer et al., Citation2014; Titsworth et al., Citation2010, Citation2013).