ABSTRACT
Surveys play a key role in researching public perceptions of and attitudes toward science. Accordingly, a breadth of widely used survey instruments is available, many of which have also been adopted for segmentation analyses. Even though many of these segmentation solutions are similar in their aims, they often include a large number of variables, making it difficult for other researchers to build on them, as survey time is scarce. We therefore demonstrate how a large number of variables used for a comprehensive segmentation analysis can be reduced considerably without losing too much information. We develop and test a short survey instrument to segment populations according to their attitudes toward science. Results show that the segmentation can be replicated with over 90% accuracy when the instrument is reduced from 20 to 10 variables. This reduction does not significantly affect the predictive power of segment attribution on three dependent variables, which suggests that many segmentation analyses could be similarly optimized, helping researchers save survey time and further standardize segmentation analyses.
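The underlying logic of the reduction, comparing the segmentation obtained from a shortened item set against the one obtained from the full instrument, can be sketched on synthetic data. The clustering method (k-means), the agreement measure (adjusted Rand index), and all data below are illustrative stand-ins, not the authors' actual procedure:

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import adjusted_rand_score

rng = np.random.default_rng(3)
# Toy data: 20 "survey items", of which 10 carry the segment signal
segments = rng.integers(0, 4, size=600).astype(float)
informative = segments[:, None] * 1.5 + rng.normal(scale=0.5, size=(600, 10))
noise = rng.normal(size=(600, 10))
full_items = np.hstack([informative, noise])

# Segment once with the full 20-item battery and once with the reduced
# 10-item set, then measure how closely the two solutions agree.
seg_full = KMeans(n_clusters=4, n_init=10, random_state=0).fit_predict(full_items)
seg_short = KMeans(n_clusters=4, n_init=10, random_state=0).fit_predict(informative)
agreement = adjusted_rand_score(seg_full, seg_short)
```

If the dropped items carry little segment-relevant information, the agreement between the two solutions stays high, which is the intuition behind reducing the instrument without losing too much information.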
Disclosure statement
No potential conflict of interest was reported by the authors.
Notes
1 As this format has been criticized (e.g. Pardo & Calvo, Citation2002, Citation2004), it was adapted to include questions about the arts and humanities in addition to the (natural) sciences, to cover both textbook and applied scientific knowledge, and to add a question about the process of science. Easier and more difficult questions (according to the number of correct answers in previous surveys, where available) were mixed, and the often-used dichotomous “correct–false” answer format was replaced by one allowing respondents to indicate how certain they were of their answers (Pardo & Calvo, Citation2004, 223f.). These changes were crosschecked both with a recoded, traditional “right”/“wrong” scale for all 11 questions and with a recoded version containing only the five questions that were taken verbatim from earlier studies.
2 We ran the analysis with LatentGold 5.1 (Vermunt & Magidson, Citation2016). Five thousand random sets of starting values were entered into the algorithm to ensure validity and robustness of each solution.
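The multiple-random-starts strategy, running the estimation from many random starting values and keeping the best-fitting solution, can be sketched with a generic mixture model. Here scikit-learn's `GaussianMixture` stands in for LatentGold's latent class routine, and the data and settings are illustrative only:

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
# Toy data: two latent segments with different item-response profiles
data = np.vstack([rng.normal(0.0, 1.0, (200, 4)),
                  rng.normal(3.0, 1.0, (200, 4))])

# n_init runs EM from many random starting values and keeps the solution
# with the highest log-likelihood -- the same idea as entering thousands
# of random start sets to guard against local optima.
model = GaussianMixture(n_components=2, n_init=50, random_state=0)
labels = model.fit_predict(data)
```

Note that LatentGold fits latent class models for categorical indicators; the Gaussian mixture above only illustrates the restart logic, not the exact model family used in the paper.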
3 Each media source was measured via a single survey item. While we had a similar one-item variable available for overall “internet” contact with science, we took advantage of an additional question in which we asked respondents about their online use in more detail. This allowed us to build an online media index (α = 0.80) from the use of the following sources to get in contact with science and research: online outlets of newspapers and magazines; online archives of television and radio channels; institutional websites (scientific, government, organizations); Facebook; blogs or message boards; Wikipedia; YouTube or similar video platforms.
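The reported α is Cronbach's alpha, a standard internal-consistency measure for such an index. A minimal implementation, applied here to simulated data (the seven correlated items below are hypothetical, not the survey's actual responses):

```python
import numpy as np

def cronbach_alpha(items: np.ndarray) -> float:
    """Cronbach's alpha for an (n_respondents, n_items) response matrix."""
    k = items.shape[1]
    sum_item_vars = items.var(axis=0, ddof=1).sum()
    total_var = items.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1 - sum_item_vars / total_var)

# Toy data: seven items driven by one common "online use" trait
rng = np.random.default_rng(1)
trait = rng.normal(size=(300, 1))
items = trait + rng.normal(scale=0.8, size=(300, 7))
alpha = cronbach_alpha(items)
```

With strongly correlated items, as simulated here, alpha lands in the high range conventionally taken as acceptable for index construction.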
4 Except for the normal distribution of residuals in models one and four, all assumptions of linear regression were met in the other three models. Given the large sample size, however, the linear models should be robust enough to overcome these slight violations.
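One common way to check the residual-normality assumption mentioned here is to fit the model and inspect the skewness and excess kurtosis of the residuals (both near zero under normality). The sketch below uses simulated data and ordinary least squares via NumPy; it illustrates the diagnostic, not the paper's actual models:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 500
X = rng.normal(size=(n, 2))
y = 1.5 * X[:, 0] - 0.5 * X[:, 1] + rng.normal(size=n)

# Ordinary least squares fit, then residuals
A = np.column_stack([np.ones(n), X])
beta, *_ = np.linalg.lstsq(A, y, rcond=None)
resid = y - A @ beta

# Normality diagnostics on standardized residuals
z = (resid - resid.mean()) / resid.std()
skewness = (z ** 3).mean()
excess_kurtosis = (z ** 4).mean() - 3.0
```

Marked deviations of either statistic from zero would flag the kind of distortion noted for models one and four, though, as the note says, large samples make OLS estimates fairly robust to mild non-normality.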
5 We use “significant” to mean p < .05.