Abstract
This paper proposes three criteria for evaluating advertising measurement practice: item validity, scale validity, and comparability. A review of the measurement literature is combined with an overview of research practice to identify harmful measurement practices. Applying the proposed criteria, the paper identifies four harmful measurement practices that should be eliminated: (1) the use of inappropriate numerical response scales; (2) the mixing of unipolar and bipolar response scales; (3) the use of antecedent and outcome items; and (4) the inconsistent use of response scale endpoint qualifiers. Eliminating these flawed practices would significantly improve advertising measurement.
Disclosure statement
No potential conflict of interest was reported by the authors.
Notes on contributors
Lars Bergkvist is a professor of marketing in the College of Business at Zayed University, Abu Dhabi, United Arab Emirates. His primary research interests are advertising, leveraged marketing communications, and research methodology.
Tobias Langner is a professor of marketing in the Schumpeter School of Business and Economics at the Bergische University Wuppertal, Wuppertal, Germany. His primary research interests are advertising, brand management, and construct measurement.
Correction Statement
This article has been republished with minor changes. These changes do not impact the academic content of the article.
Notes
1 This recommendation is also valid for most polar opposites that do not have a clear positive or negative connotation. For example, scale endpoints such as ‘short/tall’ or ‘quick/slow’ are neither solely negative nor solely positive. However, in these examples one opposite involves less of the considered property (e.g. height) than the other, and it is thus more natural for people to associate ‘short’ rather than ‘tall’ with negative values on an answer scale. Admittedly, there are a few polar opposites (e.g. feminine/masculine) that are not naturally linked to either negative or positive numbers. In such cases, the method for answer scale development suggested by Rossiter (2011b, p. 21) should be applied to develop adequate scale labels: for each pair of polar opposites, raters should be asked in open-ended questions which numbers they would put on the answer scale.
2 The examples in this and the following tables have been included as illustrations of common measurement practice. The intent is not to single out individual studies, and it should be kept in mind that the criticisms apply to the measurement practice in many studies (including studies by the authors).
3 There is a similar but less frequent tendency to use items that are consequences of the target construct. The same arguments that apply to antecedent items apply to consequent items. However, in the interest of brevity the present discussion focuses on antecedent items.
4 In most situations when people are asked for an object evaluation (e.g. when answering a brand attitude question), they will naturally make this assessment along a bipolar ‘bad-good’ axis. However, there may also be rare scenarios in which respondents consider an attitude object as possessing qualities from both polar opposites at the same time. (We are grateful to one of the anonymous reviewers for drawing this to our attention.) This would be the case, for example, when people consider a car as being ugly and beautiful or good and bad at the same time. In these cases, researchers could split the bipolar scale into two scales, each measuring one polar opposite on a unipolar scale. However, future research is needed to rule out the possibility that this procedure harms item validity.