Measurement: Interdisciplinary Research and Perspectives, Volume 15, 2017, Issue 2

Focus Article

Rethinking Traditional Methods of Survey Validation

Andrew Maul, Gevirtz Graduate School of Education, University of California, Santa Barbara

Pages 51-69 | Published online: 04 Aug 2017

https://doi.org/10.1080/15366367.2017.1348108

ABSTRACT

It is commonly believed that self-report, survey-based instruments can be used to measure a wide range of psychological attributes, such as self-control, growth mindsets, and grit. Increasingly, such instruments are being used not only for basic research but also for supporting decisions regarding educational policy and accountability. The validity of such instruments is typically investigated using a classic set of methods, including the examination of reliability coefficients, factor or principal components analyses, and correlations between scores on the instrument and other variables. However, these techniques may fall short of providing the kinds of rigorous, potentially falsifying tests of relevant hypotheses commonly expected in scientific research. This point is illustrated via a series of studies in which respondents were presented with survey items deliberately constructed to be uninterpretable, but the application of the aforementioned validation procedures nonetheless returned favorable-appearing results. In part, this disconnect may be traceable to the way in which operationalist modes of thinking in the social sciences have reinforced the perception that attributes do not need to be defined independently of particular sets of testing operations. It is argued that affairs might be improved via greater attention to the manner in which definitions of psychological attributes are articulated and greater openness to treating beliefs about the existence and measurability of psychological attributes as hypotheses rather than assumptions—in other words, as beliefs potentially subject to revision.

KEYWORDS:

noncognitive measurement
philosophy of measurement
survey research
validation
validity

Notes

1. See http://coredistricts.org/social-emotional-learning-efforts/.

2. In all analyses, the negatively worded items were reverse-coded (e.g., with strongly disagree receiving the maximum score rather than the minimum score).

3. Available at http://mindsetonline.com/thebook/buythebook/index.html.

4. All correlations reported in this paper are between raw scores; none have been disattenuated for measurement error. If a disattenuation formula were applied, the correlations reported in these three sections would appear larger.

5. Lorem ipsum text, which is commonly used as placeholder text in publishing and graphic design applications, is itself derived from sections 1.10.32 and 1.10.33 of “de Finibus Bonorum et Malorum” (The Extremes of Good and Evil) by Cicero, written in 45 BC. However, words are scrambled, added, removed, and altered, rendering the final text unintelligible even to someone well-versed in Latin. As a check, all eight items employed in this study were submitted to Google Translate, which failed to return any meaningful translations.

6. Clearly it does happen at least some of the time that these validation procedures do return results that are not entirely positive and that at least some of the time these results are in turn used to improve the quality of the instrument. The argument in this paper does not aim to establish that these procedures are without value nor that they are categorically incapable of returning negative results in the presence of low-quality instrumentation—only that they cannot be relied upon to do so.

7. Seen this way, the fact that the correlation between the two scores is anything less than perfect could be explained by changes over time in real perseverance, method effects, specific variance attributable to the two different measures, or invalidity or unreliability of either measure (or, of course, some combination of these).

8. For a vivid illustration of this point, see Arnulf et al. (2014).

9. For a more thorough review of operationalism than is possible here, see Chang (2009); for general critiques, see Green (1992) and Bickhard (2001); for critiques of operationalism in the social sciences in particular, see Markus and Borsboom (2013) and Michell (1990).

10. There are of course many other definitions of self-control that could have been used here; this one is chosen purely for illustrative reasons.

11. For example, yet another possibility is that “self-control” refers neither to a disposition nor a capacity of persons (perhaps not even an attribute of persons at all) but, rather, is a shorthand way of referring to an inductive summary of a (potentially very large or possibly even infinite) set of (actual and possible) behaviors on the part of the individual. Such an approach effectively circumvents any form of cognitive theory concerning the explanation for (variation in) such behaviors and is thus broadly inconsistent with the post-cognitive-revolution aims of psychological science, but may still be useful for purely descriptive purposes. Additionally, it would seem strained at best to refer to such an approach as measurement, especially if one understands measurement as a causal relation between (variation in) an attribute and (variation in) the observed outcomes of a procedure, à la Borsboom et al. (2004).

12. Arguably, this is exactly what happened in the literature on emotional intelligence; a considerable amount of energy was spent attempting to differentiate typical-behavior and maximum-performance models of EI (e.g., Maul, 2012).

13. One possible explanation for the pervasiveness of this fallacy in the social sciences is that accounts of psychological measurement often analogize from the physical sciences, and specifically from concatenable attributes such as height, in which it is demonstrably the case that the structures of interindividual and of intraindividual differences are identical.

14. Additional studies reported in the same manuscript demonstrate that scores on the grit scale can predict some portion (an average of 4%) of the variance in other outcomes, such as success in a spelling bee competition. It is not clear whether these studies are meant to provide further evidence of the validity of the instrument used to measure grit or to test hypotheses regarding the effects of grit (which would require presupposing that the instrument is validly measuring grit).

15. The belief that measurement is universally necessary is associated with a second, related belief, which is that measurement is universally possible, as reflected, for example, in the oft-repeated claims that “whatever exists at all exists in some amount” (Thorndike, 1918, p. 16) and “anything that exists in amount can be measured” (McCall, 1939, p. 15). This belief is highly debatable, and whether it can be considered credible depends on how one understands the concept of measurement and its requirements (cf. Michell, 2005; Markus & Borsboom, 2013).

16. Elsewhere, this belief is asserted even more directly as “you cannot study what you cannot measure” (http://angeladuckworth.com/qa/).
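The arithmetic behind Notes 2, 4, and 14 can be sketched briefly. The following is a minimal illustration, not the author's analysis code. It assumes that reverse-coding mirrors a response around the midpoint of the scale, that the "disattenuation formula" in Note 4 is Spearman's classical correction for attenuation, and that the 4% figure in Note 14 is read as a single-predictor proportion of variance. All function names and all numeric inputs other than the 4% figure are hypothetical.

```python
import math


def reverse_code(score, scale_min, scale_max):
    """Note 2: flip a Likert response so the lowest category scores highest."""
    if not scale_min <= score <= scale_max:
        raise ValueError("score lies outside the response scale")
    return scale_min + scale_max - score


def disattenuate(r_xy, rel_x, rel_y):
    """Note 4 (assumed formula): Spearman's correction, r_xy / sqrt(rel_x * rel_y)."""
    if not (0 < rel_x <= 1 and 0 < rel_y <= 1):
        raise ValueError("reliabilities must lie in (0, 1]")
    return r_xy / math.sqrt(rel_x * rel_y)


def correlation_magnitude(variance_explained):
    """Note 14: |r| implied by a proportion of variance, assuming one predictor."""
    if not 0 <= variance_explained <= 1:
        raise ValueError("proportion of variance must lie in [0, 1]")
    return math.sqrt(variance_explained)


# Hypothetical inputs, chosen only for illustration (not taken from the article).
reversed_items = [reverse_code(s, 1, 5) for s in [1, 2, 5]]  # assumed 5-point scale
corrected_r = disattenuate(0.30, rel_x=0.80, rel_y=0.70)     # assumed r and reliabilities
implied_r = correlation_magnitude(0.04)                      # the 4% figure from Note 14
```

Because a reliability cannot exceed 1, the corrected correlation is never smaller in magnitude than the raw one, which is the direction Note 4 describes. Under the single-predictor assumption, 4% of variance corresponds to a correlation of about 0.2 in magnitude.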

References

Arnulf, J. K., Larsen, K. R., Martinsen, Ø. L., & Bong, C. H. (2014). Predicting survey responses: How and why semantics shape survey statistics on organizational behaviour. PLoS ONE, 9(9), e106361. doi:10.1371/journal.pone.0106361

Green, C. D. (1992). Of immortal mythological beasts: Operationism in psychology. Theory and Psychology, 2(3), 291–320. doi:10.1177/0959354392023003

Bickhard, M. H. (2001). The tragedy of operationalism. Theory and Psychology, 11(1), 35–44. doi:10.1177/0959354301111002

Markus, K. A., & Borsboom, D. (2013). Frontiers of test validity theory: Measurement, causation, and meaning. Routledge.

Michell, J. (1990). An introduction to the logic of psychological measurement. New York: Psychology Press.

Borsboom, D., Mellenbergh, G. J., & Van Heerden, J. (2004). The concept of validity. Psychological Review, 111, 1061–1071. doi:10.1037/0033-295X.111.4.1061

Maul, A. (2012). The validity of the Mayer-Salovey-Caruso Emotional Intelligence Test (MSCEIT) as a measure of emotional intelligence. Emotion Review, 4, 394–402. doi:10.1177/1754073912445811

Thorndike, E. L. (1918). The nature, purposes, and general methods of measurements of educational products. Chapter II in G. M. Whipple (Ed.), The Seventeenth yearbook of the National Society for Study of Education. Part II. The Measurement of Educational Products. Bloomington, IL: Public School Publishing Co.

McCall, W. A. (1939). Measurement. New York: The Macmillan Company.

Michell, J. (2005). The logic of measurement: A realist overview. Measurement, 38(4), 285–294.
