
From Ontological Traits to Validity Challenges in Social Science: The Cases of Economic Experiments and Research Questionnaires

Pages 101-127 | Published online: 29 Oct 2019
 

ABSTRACT

This article examines how problems of validity in empirical social research differ from those in natural science. Specifically, we focus on how some ontological peculiarities of the object of study in social science bear on validity requirements. We consider these issues in experimental validity as well as in test validity because, while both fields hold large intellectual traditions, research tests or questionnaires are less closely connected to natural science methodology than experiments.

Acknowledgements

We are thankful to Daniel Kuchler for his very valuable comments on an earlier version of our paper. Thanks also to two anonymous referees of this journal for extremely helpful feedback on our work.

Notes

1 The acknowledgement of this partial independence between the micro and the macro levels underlies Durkheim’s vindication of sociology as an autonomous science, for, as Rosenberg observes (2015, 180–184), his vindication is based on the very postulation of the existence of social facts. In his classic Suicide, Durkheim illustrates this point by noting how the suicide rate rose 100% between 1856 and 1878 while the mix of individual-level causes of suicide, recorded as ‘presumptive causes’ in coroners’ reports, remained almost the same. From this, he argues that the latter cannot be causes of the former, applying the widely shared methodological principle ‘same cause, same effect’. The suicide rise therefore emerges as a social fact that transcends individuals’ reasons, a social fact to be explained by other social facts. However, the extent to which both stable empirical regularities in social facts and their epistemic value can be established remains a controversial issue. From a social ontology standpoint, Lawson (2003) contends that patterns in social events tend to be mere demi-regularities or demi-regs, hence encouraging social researchers to go beyond surface event correlations to uncover the underlying causal mechanisms. Even if we recognize that the incidence of the problems we have posed may differ across levels, digging into the relationships between these levels or contributing to the long-lasting ‘individualism versus holism’ debate far exceeds the scope of this article.

2 Here we speak of ‘naturalism’ in Rosenberg’s sense (2015, 30–31). According to him, ‘naturalists’ are those social scientists who believe that prediction is not at odds with interpretation and that, by adapting the methods from natural sciences, they can find regularities precise enough to predict human action. By contrast, ‘antinaturalism’ or ‘interpretative social science’ assumes that apprehending meaningfulness in human action is hardly compatible with a naturalistic and predictive approach to social science. We understand both economic experiments and research questionnaires to be linked to naturalism in that the former attempt to resemble natural science experiments and the latter rely heavily on standardized measurement of social traits.

3 The intellectual tradition of test validity comes from the fields of psychometrics and education sciences. It has its codified expression in the Standards for Educational and Psychological Testing (AERA, APA, and NCME 2014), where a test is defined on page 2 as ‘a device or procedure in which a sample of an examinee’s behaviour in a specific domain is obtained and subsequently evaluated and scored using a standardized process’. Although the focus of this tradition is on educational and psychological tests, there is a long-entrenched relationship between psychometrics and survey research, with a continuous exchange of concepts and methods and a highly similar approach to validity. Moreover, the Standards are often used as a framework to validate survey instruments or questionnaires. Hence, we adopt test validity here as an umbrella term that also covers validity assessment in survey research.

4 Although this raises several issues, the emphasis has traditionally been placed on achieving enough statistical power to avoid Type-II errors, that is, failures to detect existing correlations, and enough statistical significance to avoid Type-I errors, that is, apparent correlations that do not exist (García-Pérez 2012).

5 For a recent defence of the ‘social sciences as life sciences’ see Dupré (2016).

6 See above the introductory section and note 1 for nuances related to this statement.

7 By ‘high-order beliefs’ we mean beliefs about others’ beliefs, and about others’ beliefs about others’ beliefs, and so on.

8 There are other aspects of the experimenter’s behaviour that may also affect the subjects’ response once the latter have become aware of them. Among these aspects, Wiggins (1968, 400) mentions maturation (experimenter’s fatigue or boredom due to increasing familiarity with the experiment), and change in the degree of confidence in the hypothesis during the experiment as the experimenter analyses data between the beginning and the end of the data collection.

9 They also distinguish three levels of involvement in answering questions: optimizing, weak satisficing, and strong satisficing. Optimizing means thoroughly and unbiasedly performing the cognitive tasks involved in answering questions. Weak satisficing is to offer the first answer that seems acceptable after incompletely and/or biasedly performing such cognitive tasks. Strong satisficing consists of selecting an answer without performing such tasks, and without reference to any respondent-internal cues relevant to the question, either looking for a cue in the question wording or making an arbitrary choice.

10 Since doing without deception may be costly, McKenzie and Wixted (2001, 424) suggest a different way to avoid misinterpreting participants’ scepticism, induced by their fear of deception, as evidence of non-normative behaviour. Their proposal consists in deriving and testing normative models that do not assume full belief in key task parameters. This elegant approach, however, does not fully convince Hertwig and Ortmann (2001, 438), who object that it introduces a free parameter into the models, increasing the danger of data-fitting, and assume that it will often not be applicable because of an insufficient, case-sensitive understanding of the distrust effects.

11 This problem has also been extensively discussed under the label of ‘framing effects’. For a typology of valence framing effects, see Levin, Schneider, and Gaeth (1998).

12 A convincing argument for the relevance of operational meanings in science can be found in Chang (2004), where he analyses in detail the operational side of the concept of temperature.

13 Tal’s (2016) model-based account of contemporary time measurement provides an interesting framework for analysing how modifications to the way measurement standards are modelled result in changes in the mode of application of a quantity concept.

14 The non-uniform behavioural exercise of the same mental dispositions had already been noticed by Gilbert Ryle in his path-breaking work, The Concept of Mind: ‘Now the higher-grade dispositions of people with which this inquiry is largely concerned are, in general, not single-track dispositions, but dispositions the exercises of which are indefinitely-heterogeneous’ (Ryle [1949] 2009, 32).

15 Ultimately, according to Rosenberg, folk psychology may be summarized in the following general statement that he labels [L]: ‘If any person, agent, individual, wants some outcome, d, and believes that an action, a, is a mean to attain d under the circumstances, then x does a.’ He notes, however, that, in order to determine the initial conditions of an action, it is always necessary to take into account more desires and beliefs than the ones referred to in [L]. There are many cases in which widely held clusters of desires and beliefs can be confidently attributed to any agent and predictions can be made by applying [L], yet, when this happens, such predictions are usually uninteresting. Compared with ideal models within physics, like the ideal gas law, both [L] and the rational-agent model perform poorly in providing surprising as well as highly precise predictions at the level of individuals. Moreover, [L] proves less amenable than most physical models to empirical correction and subsequent refinement by way of improving the measurement of the initial conditions. In measuring initial beliefs and desires, social scientists confront ‘a regress problem’ that is hardly instantiated in natural science: they have to resort to the very model that is being tested, that is, they need to rely on [L] to be able to measure the beliefs and desires involved in the initial conditions.

16 Late logical empiricists like Nagel often referred to the inherited laws of a theory as those that, being mainly about experimental data or experimentally established phenomena, were assumed in the determination of the theory’s empirical content. The different elements involved in the empirical content of a theory, which are closely connected to theories presupposed in experimental practice, have been sorted out in Suppes’s hierarchy of theories, models and problems (1962, 259), where he distinguishes between different levels of models: theoretical, models of experiments, models of data, experimental design and ceteris paribus conditions. A more refined distinction between those elements was later provided by Mayo (1996).

17 As suggested in the previous footnote, the hierarchical aspect that we are emphasizing here is related to the presupposed laws involved in warranting the empirical application of scientific theories or concepts. This emphasis is not meant to imply that there is any ultimate, unifying empirical basis for science, since we are well aware of the dynamical nature of the empirical basis and the fragmentary, disjointed character of many areas of natural science. Our point is rather that the empirical soundness of concepts and theories in natural science has no analogue in some areas of social science, where hierarchies of empirical support in the form of nomological networks can seldom be identified as present to the same degree. If concepts like those of ether and phlogiston have been dismissed, it is because certain empirical properties (interference with the speed of light and weight, respectively) were unequivocally attributed to them and, in addition to this, the experimental determination of these properties was unequivocal too, partly by virtue of the well-established character of the (optical, mechanical) laws presupposed in the use of the corresponding experimental instruments (interferometer, balance).

18 Søberg (2005) and Cordeiro-dos-Santos (2006) independently formulate a set of auxiliary assumptions respectively based on Smith’s (1982) distinction between three ingredients of a lab experiment (environment, institution, design) and his distinction between four precepts in economic experimentation (nonsatiation, saliency, dominance, privacy).

19 A more flexible notion includes both semantic and serial order effects, the latter resulting from the location of a question in a sequence of items, for instance, the fatigue effects of the later items. In this vein, Billiet, Waterplas, and Loosveldt (1992, 131) define context effects as ‘response effects coming from one or more preceding questions (and answers) or from response scales belonging to previous questions’.

20 The same distinction is often established also in terms of ‘Galilean’ versus ‘non-Galilean’ assumptions.

Additional information

Funding

This research was financially supported by the research projects ‘Laws and Models in Physical, Chemical, Biological, and Social Sciences’ [PICT-2018-03454, ANPCyT, Argentina], ‘Stochastic Representations in the Natural Sciences: Conceptual Foundations and Applications (STOCREP)’ [PGC2018-099423-B-I00, Ministerio de Ciencia e Innovación], ‘Political and Economic Consequences of the Decentralization’ [CSO2013-47023-C2-2-R, Ministry of Economy and Competitiveness].
