Child Neuropsychology
A Journal on Normal and Abnormal Development in Childhood and Adolescence
Research Article

Generalizability of the Swedish WISC-V to the Finland-Swedish minority – the FinSwed study

Received 28 Apr 2023, Accepted 11 Mar 2024, Published online: 25 Mar 2024

ABSTRACT

International guidelines highlight the importance of using appropriate and culturally fair test materials when conducting clinical psychological assessments. In the present study, the generalizability of the Swedish WISC-V with Scandinavian normative data was explored in 6–16-year-old Swedish-speaking children in Finland (N = 134), as no local test versions or norms are available for this minority. First, metric measurement invariance was established, i.e., the constructs measured were equivalent between the standardization data and the present sample. Second, the performance of this minority group on the Swedish WISC-V was compared to the Scandinavian normative mean. The findings showed that the Finland-Swedish children performed overall higher than the normative mean on the Swedish WISC-V, with an FSIQ of 103. The performance was significantly higher also in the indexes VSI, FRI, and WMI as well as in several subtests. However, in the subtest Vocabulary, the Finland-Swedish children achieved significantly lower scores than the Scandinavian mean. Further analyses showed significant associations between cognitive performance and age as well as parental education. For the VCI and the FSIQ, performance increased significantly with age, despite the use of age-standardized scaled scores. The general high performance was suggested to relate to the overall high educational level of the Finland-Swedes as well as to other cultural and test-related factors. The results have implications for clinicians conducting assessments with this minority, but also highlight the importance of establishing test fairness by validating tests when used in different cultural groups.

Cognitive tests are often used in clinical psychological settings to answer diagnostic questions, determine eligibility for educational support, or more generally assess the strengths and difficulties of individuals (e.g., Yeates & Donders, 2005). The Wechsler Intelligence Scale for Children (WISC) is among the most widely used tests in many countries for the assessment of general cognitive ability in school-aged children (e.g., Benson et al., 2019; Evers et al., 2012; Rabin et al., 2016). The newest version of the test, the WISC-V (Wechsler, 2014a), has been translated and adapted into many languages (van de Vijver et al., 2019).

Psychologists are advised to consider the appropriateness, fairness, and validity of a test and its norms when assessing individuals of different cultural backgrounds (American Educational Research Association et al., 2014; American Psychological Association, 2017; International Test Commission, 2001, 2013). However, test versions and normative data are not available for all cultural groups. According to a survey conducted in 82 countries, it is common for psychologists to use tests developed in a different country than that of the person being assessed (Oakland & Hu, 1991). This is also the case among the cultural minority group of Swedish-speaking children in Finland (Finland-Swedes). The Wechsler tests have not been standardized for this population. Instead, the Swedish test version with Scandinavian norms collected in Sweden, Denmark, and Norway is frequently used (Rosenqvist et al., 2022). This lack of standardized tests is understandable for practical and economic reasons, but problematic from a clinical perspective (see, e.g., Byrne, 2016; McGill et al., 2020). In the present study, we investigated the generalizability of the Swedish WISC-V to the Finland-Swedish minority.

International comparisons of WISC performance

Along with the standardization of the different versions of the WISC, cross-country comparisons of the normative data have been conducted, showing some, albeit small, differences between Western countries. Regarding the newest test version, the WISC-V, standardization data from Australia/New Zealand, English-speaking Canada, French-speaking Canada, France, Germany, Dutch-speaking regions (the Netherlands and Flanders), Spain, Scandinavia (Sweden, Denmark, Norway), the U.K., and the U.S. were compared and presented in the book “WISC-V: Clinical Use and Interpretation” (Weiss, Saklofske, et al., 2019). Some differences between the standardization data from these Western countries/areas in subtests and indexes were reported, but the effect sizes were negligible or small (van de Vijver et al., 2019). Similar findings were reported for the older test version WISC-III, when comparing an even larger number of countries (Georgas et al., 2003).

Several cross-cultural studies have compared the performance of a specific population to norms collected in a different culture. For instance, when the Canadian standardization sample of the WISC-V was scored using U.S. norms, significant differences emerged in the Full Scale IQ (FSIQ), as well as on several subtests and indexes (Babcock et al., 2018). Where the differences were significant, Canadian children mostly scored somewhat higher than U.S. children, with the exception of the Arithmetic subtest. The FSIQ was 101 in the Canadian sample. However, the effect sizes were overall small and the differences generally decreased when the samples were matched for ethnicity, parental education level, sex, and age (Babcock et al., 2018). Similarly, when scored using U.S. norms, Icelandic and Australian children obtained higher scores than the U.S. norms on the WISC-III, the FSIQ being 105 and 101–106 (for different age groups), respectively (Gudmundsson et al., 2005–2006; Kamieniecki & Lynd-Stevenson, 2002). However, the same trend was not seen in New Zealand (Rodriguez et al., 1998). Further, Roivainen (2019) presented a qualitative comparison where raw scores from European standardization samples on different versions of the WISC (Finnish and Norwegian WISC-R; Swedish, German, and French WISC-III) on the subtests Block Design, Coding, and Digit Span were scored using U.S. norms. He reported that the European samples scored higher in the Block Design subtest, and the U.S. samples scored higher than most European samples on Digit Span, while the difference in Coding was smaller (see Roivainen, 2019).

Table 1. Descriptive information of the participants.

Some differences have also been reported between European samples. When comparing children from Germany and Switzerland, derived from the standardization sample of the German WISC-IV and matched on age, sex, and parental education, the only significant difference was a higher score of the Swiss sample in the subtest Block Design and the corresponding Perceptual Reasoning Index (the index score being 105 for the Swiss children and 101 for the German children; Grob et al., 2008). In another study, the FSIQ of French-speaking Swiss children was on average 105 on the French WISC-IV (Reverte et al., 2015).

In all, some differences in performance have been shown on different versions of the WISC, including between Western countries and between children from neighboring countries with similar languages. However, the subtests and indexes affected have differed between studies. This plausibly reflects the fact that studies have been conducted in different settings, using different methodologies, with children from different countries, of different ages, and speaking different languages. While broad conclusions can be drawn from the previous studies, such as the general need for culture- or country-specific norms, the implications of the previous studies for assessments with other cultural groups are few.

Cognitive assessments in Swedish-speaking Finland

Finland is an officially and constitutionally bilingual (Finnish and Swedish) country, with the Swedish-speaking minority constituting approximately 5.2% of the population (Official Statistics of Finland, 2021). There are Swedish-speaking day care centers, schools, and university programs, as well as media (newspapers, radio channels, and TV programs) and organized hobbies. Health care and legal aid may also be received in Swedish. The Finland-Swedes have a generally high educational level (Official Statistics of Finland, 2017; Saarela, 2021). While there are no official statistics regarding the number of bilingual individuals in Finland, approximately 40% of the Swedish-speakers are estimated to be Swedish-Finnish bilingual (Saarela, 2021). The main form of bilingualism is simultaneous bilingualism, where both languages are spoken at home from birth. Children attend school in either Finnish or Swedish and study the other national language as a school subject. The children attending school in Swedish also study according to the Finnish curriculum. The Swedish spoken in Finland differs to some extent from the Swedish spoken in Sweden (e.g., Sjöholm, 2004), for example regarding vocabulary and pronunciation. No test versions of the Wechsler tests have been developed for the cultural minority group of Finland-Swedes.

A study conducted in the Finland-Swedish context showed that when using language subtests from a translated version of the Finnish WISC-IV with Finland-Swedish children, the level of difficulty and the order of the items in the verbal subtests were not correct for this sample (Delatte, 2015). The 10–11-year-old Swedish-speaking children scored 89 on the Verbal Comprehension Index (VCI; Delatte, 2015). Recently, it was shown within The FinSwed Study that the performance of 6–7-year-olds on the WISC-V was close to the Scandinavian normative mean (FSIQ 100), but this was not statistically investigated within the scope of the article (Salonen et al., 2023a). As such, no statistically tested information exists to date regarding Finland-Swedish children’s performance compared to the Scandinavian norms of the Swedish WISC-V.

Cognitive performance and background factors

Several background factors may relate to the Finland-Swedish children’s performance on cognitive tests. A well-established background factor relating to cognition is socioeconomic status (SES), often measured as parental education level and/or parental income. In the WISC-V standardization data from the U.S., higher test scores were obtained by children of more highly educated parents (Weiss, Locke, et al., 2019). In fact, about 22% of the variance in FSIQ was explained by a combination of parental education level and income (Weiss, Locke, et al., 2019), and a similar finding was previously shown for the U.S. standardization data of the WISC-IV (Weiss et al., 2006). The relationship between SES and test scores has also been shown with European samples and different versions of the WISC (Cianci et al., 2013; Eilertsen et al., 2016; Gienger et al., 2008; Hernández et al., 2017). Further, SES is also an explanatory factor for cultural differences in test performance, for instance in the study by Babcock et al. (2018), where differences between Canadian and U.S. children’s WISC-V performance were found to generally decrease when the groups were matched for parental education level and other demographic variables. Thus, it is important to consider SES when studying cognitive differences between cultural groups.

While bilingualism has been extensively studied in an international perspective, many previous studies have focused on sequential bilinguals or children below school age. Regarding simultaneous bilingualism (where children are exposed to and use both languages from infancy), previous studies conducted in the unique Finland-Swedish context have indicated a monolingual advantage on some tasks requiring expressive language in younger children (Korkman et al., 2012; Korpinen et al., 2023; Westman et al., 2008). Recently, when investigating rapid automatized naming in Finland-Swedish 7-year-olds, monolinguals were significantly better at naming objects and numbers compared to bilinguals, but no advantage was found for rapid naming of letters, nor for phonological skills or reading and writing tasks (Vataja et al., 2022). Similarly, no differences in language functions were found on the WISC-IV between Finland-Swedish school-aged monolinguals and simultaneous bilinguals (Karlsson et al., 2015). Still, in Finland-Swedish young adults, there is some evidence of a monolingual advantage in more complex language tasks (Leinonen & Tandefelt, 2007). In all, there seems to be some indication of a linguistic monolingual advantage, when the bilingual’s language skills are assessed in one language, but previous findings in the Finland-Swedish context are mixed and findings concerning school-aged children scarce.

Other societal factors, such as differences in the educational systems between countries, can also relate to cognition. Associations between performance on intelligence tests and student assessment tests in school subjects have been shown to be high (Rindermann, 2007). These relationships have also been shown for the WISC-V (Wechsler et al., 2014). One way of comparing educational achievement across countries is by looking at the results from the Programme for International Student Assessment (PISA; OECD, 2019). The PISA assessments are conducted every three years by the Organisation for Economic Co-operation and Development (OECD), which assesses 15-year-old children from numerous countries worldwide in reading, mathematics, and science (Leino et al., 2019). The PISA results from Finland are reported separately for Finnish- and Swedish-speaking schools. Figure 1, constructed based on information from the Finnish PISA 2018 report (Leino et al., 2019), shows the mean test results presented for Finland and the countries included in the Scandinavian normative data of the WISC-V. Overall, the Finland-Swedish schools produced somewhat higher results than Swedish, Norwegian, and Danish schools in mathematics and science, but not in reading, where the result was similar. Significant differences are shown in Figure 1. Generally, the Finnish as well as the Scandinavian children mostly performed above the mean of the OECD countries. The first results of the PISA 2022, published in December 2023, indicated fairly similar findings as the 2018 report when it comes to the relationships between Finland-Swedish schools and the Scandinavian countries (Hiltunen et al., 2023; Swedish National Agency for Education, 2023).

Figure 1. PISA 2018 average scores in reading, mathematics, and science for children from Finland, Sweden, Denmark, and Norway.

The scores are derived from the PISA 2018 as reported by the Finnish Ministry of Education and Culture (Leino et al., 2019). The results from Finland are presented separately for children from Finnish- and Swedish-speaking schools. The means of the OECD countries were M = 487 in Reading, M = 489 in Mathematics, and M = 489 in Science.
*Denotes significant differences as compared to the Swedish speaking schools in Finland (Finland Swedish).

Summary and aims

The view that clinical psychological assessments of individuals should be conducted with a test developed and normed in the cultural group of the individual is well established (American Educational Research Association et al., 2014; American Psychological Association, 2017; Gudmundsson, 2009; International Test Commission, 2001, 2013). However, for some cultural groups, no test versions are available (see also, Oakland & Hu, 1991) and it is not always realistic to adapt tests and collect norms for minority groups, as test development is an extensive and costly process. Previous cultural studies using different versions of the commonly used cognitive test WISC have differed in methodology and somewhat in results, and the findings are difficult to generalize. Further, few studies have, to date, conducted cross-cultural comparisons using the newest test version, the WISC-V. Overall, cross-cultural studies with the WISC often seem to generate at least some differences. This highlights the importance of validating test versions in cultural groups not included in the normative data, in order to establish test fairness when test development or adaptations are not available for the group.

This study constitutes a part of the research project The FinSwed Study – The Finland-Swedish Minority Study of Cognition in Children. The aim of the larger project is to investigate the generalizability of the Swedish versions of WISC-V and WPPSI-IV (Wechsler, 2014b), as well as parts of the neuropsychological test NEPSY-II (Korkman et al., 2011) when assessing 5–16-year-old Swedish-speaking children in Finland. The present study focuses specifically on the Swedish WISC-V.

The main aim of the current study was to explore how the minority group of Swedish-speaking Finns performs on the Swedish WISC-V as compared to the Scandinavian normative mean. Previous studies investigating cultural differences in cognitive performance of minority groups matched on language are scarce. Hence, this study was overall exploratory. However, as the general educational level of the Finland-Swedes is somewhat higher than the educational level in the Scandinavian countries (Eurostat, 2022; Official Statistics of Finland, 2017), we hypothesized that this would be reflected in a high performance on the Swedish WISC-V. Further, there is a well-known association between cognitive performance and academic achievement (e.g., Rindermann, 2007; Weiss, Locke, et al., 2019). Since the Finland-Swedes have performed fairly strongly in Mathematics and especially in Science in the PISA assessments (Hiltunen et al., 2023; Leino et al., 2019; see also Figure 1), we hypothesized that the performance of the Finland-Swedish children would be higher than the Scandinavian norms especially in tasks measuring mathematical and abstract reasoning skills. Still, we also considered it possible that linguistic tasks would behave differently in the comparison, as the Swedish language used in Finland differs to some extent from the language used in Sweden (e.g., Sjöholm, 2004) and, hence, in the standardization of the test. Accordingly, we also explored the need for minor cultural adaptations when using a test outside its country of origin.

Materials and methods

Participants

The 134 participants were typically developing Swedish-speaking children from Finland, aged 6 years 1 month to 16 years 4 months. These were drawn from The FinSwed Study data. In Table 1, background information is presented for the study sample as well as for subgroups divided by school grade. There were no significant differences between the age groups regarding background variables (see Table 1).

Only children who were monolingual Swedish or bilingual Swedish and Finnish speakers and attended a Swedish-speaking school or pre-primary school were included in the study. The children were randomly selected from schools and day care centers, located in 15 municipalities/cities in Swedish-speaking areas of Finland. Children were excluded from the study based on exclusion criteria constructed to match the criteria of the Scandinavian standardization of the WISC-V. The exclusion process of The FinSwed Study is described in Figure 2.

Figure 2. The inclusion and exclusion process for The FinSwed Study sample of 5–16-year-olds.

a. Other exclusion criteria were: the child attended a special school or class; had started school a year early or late or had repeated a grade; had been assessed with a Wechsler test during the last six months or was waiting for an assessment; was being assessed or treated at a specialist care unit (e.g., neurology, psychiatry, or phoniatrics); had a neurological diagnosis or illness, or a developmental, psychiatric, or learning-related diagnosis; was born prematurely (birth weight <2500 g or before gestational week 37); had a non-corrected visual or hearing impairment; or had medications for psychiatric or neurological reasons.

For this substudy specifically, and as shown in Figure 2, we further excluded 63 children who had at least one parent with a university Master’s degree or higher. This was done due to an overrepresentation of families in the highest education group, in order for the educational level of the sample to match the population census for 30–50-year-old Finland-Swedes (Official Statistics of Finland, 2017). Thus, children with at least one parent in the highest education group were randomly excluded, while also controlling for other background variables (age, sex, and bilingualism) until the correct percentage was met. The sex of the children was as evenly distributed as possible throughout the data. Bilingualism was kept at 44.0%, which was the proportion of bilinguals in the complete randomly collected FinSwed Study data (N = 276). The parental education level of the final sample was representative of the educational level in the census (N = 268 in the final sample; Low level 45.5% in the present data vs. 46.2% in the census, Medium level 31.3% vs. 28.9%, and High level 22.0% vs. 24.9%; χ2 (2) = 1.46, p = .486). Further, it should be noted that parents with only a nine-year compulsory primary education were also included in the low education level group (1.5% of mothers and 8.3% of fathers; total 4.9%, N = 268). In Finland, official statistics on individuals with only a primary school education are lacking. The lowest education group of registered Swedish-speaking individuals in the census (7.8%) includes both persons with only a primary education and individuals with an unknown educational level (including individuals who have received an education abroad). For this reason, the present lowest education group cannot reliably be compared to the census.
Finally, there were no significant differences in subtests, indexes, or the FSIQ between the children included in the stratified data and the children excluded from the data, when comparing children from the high parental education group (p = .342–.978).
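The comparison between the sample's parental education distribution and the census can be illustrated with a Pearson χ2 goodness-of-fit statistic. The following Python sketch is only an illustration, not the authors' procedure. The counts are an assumption: they are back-calculated from the reported percentages of N = 268, and since those percentages sum to 98.8%, the sketch assumes that the test was run on the 265 parents with a known education level. The function name is hypothetical.

```python
def chi_square_gof(observed_counts, expected_proportions):
    """Return the Pearson chi-square goodness-of-fit statistic and its df."""
    n = sum(observed_counts)
    expected_counts = [p * n for p in expected_proportions]
    statistic = sum(
        (obs - exp) ** 2 / exp
        for obs, exp in zip(observed_counts, expected_counts)
    )
    return statistic, len(observed_counts) - 1


# ASSUMPTION: counts reconstructed from 45.5%, 31.3%, and 22.0% of N = 268
# (Low, Medium, High); they are not taken from the study's raw data.
observed = [122, 84, 59]
# Census proportions as reported in the text (Low, Medium, High).
census = [0.462, 0.289, 0.249]

statistic, df = chi_square_gof(observed, census)
print(f"chi2({df}) = {statistic:.2f}")
```

Under these assumptions the statistic comes out close to the reported χ2 (2) = 1.46. The p-value is left out because it needs a χ2 distribution function that the Python standard library does not provide.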

Materials

WISC-V

All participants were assessed with the Swedish version of the newest Wechsler test for school-aged children, the WISC-V (Wechsler, 2016). In the present study, 13 subtests were administered: all primary subtests, as well as all secondary verbal subtests. The included subtests can be found in Table 2. All children were assessed in Swedish and their performance was scored using the Scandinavian norms of the Swedish test.

Table 2. Means, standard deviations, ranges, 95% confidence intervals, and comparison to the Scandinavian normative mean for The FinSwed Study sample (N = 134) on the WISC-V.

The Swedish version of the WISC-V has Scandinavian normative data collected in Sweden, Norway, and Denmark during 2015–2016 (Wechsler, 2016). The standardization sample represented the population census in the Scandinavian countries regarding sex and educational level. There were some differences in the inclusion criteria between the standardization and the present sample. The Scandinavian standardization sample included children (13%) with a language other than Swedish, Norwegian, or Danish, respectively, spoken in the home. This represented the number of Scandinavians born abroad (Wechsler, 2016). In the present study, only Swedish- or bilingual Swedish and Finnish-speakers were included. Children with mild intellectual disability (1.8%) and intellectual giftedness (2.3%) were also included in the standardization sample. Similar groups were not included in the present study.

In the Swedish version of the WISC-V, the Information subtest has some country-specific questions. For the present study, these questions were changed to refer to Finland instead of Sweden and the total score of the subtest was calculated using the Finnish questions. However, the original questions regarding Sweden were also asked to allow for follow-up comparison. Further, some minor linguistic changes were made to words and questions in order to correspond to the Swedish used in Finland. All changes were made according to the Statement of Work No. 296412–2 to Master License Agreement No. LSR–111089 with NCS Pearson.

Background variables

The parents of the participating children filled out a background information form regarding the child’s language background (e.g., if the child was mono- or bilingual, as well as the child’s home language and the language of and between parents), information about school (starting age, support in school), educational level of the parents, as well as other information needed to assess the inclusion/exclusion to the study.

Procedures

All children were recruited from Swedish-speaking schools and pre-primary units in mainland Finland. The Finland-Swedes primarily live in the coastal areas of Finland, which for recruitment purposes were divided into four regions. A total of 15 municipalities were selected to represent these Finland-Swedish regions. The selection process took into account the proportion of Finland-Swedish school children living in urban, semi-urban, and rural areas within these regions, as well as the educational level of the region. Schools and pre-primary school centers were, for practical reasons, selected within the municipalities by the psychologists conducting the participant selection. Within each educational unit, children were selected to the study through randomized sampling. The parents of the selected children were invited to the study by a written letter.

In Finland, pre-primary education begins in the fall of the year the children turn six, and first grade accordingly in the year they turn seven. Children in pre-primary education and in school grades 2, 4, 6, and 9 were initially recruited to the assessments during the school year 2019–2020. Due to the COVID-19 pandemic, recruitment and some assessments were partly interrupted or postponed in spring 2020 and continued in the following school year. The recruitment was finalized in the school year 2020–2021, when additional children (n = 21) were recruited in order to reach the preset goal regarding the number of participants.

The assessments were conducted in the children’s schools and pre-primary facilities by a total of 19 clinical psychologists and six research assistants. All of them completed training in the tests prior to conducting the assessments. The research assistants, who were psychology graduate students, received additional training and supervision. The assessments were typically divided into three sessions (range = 1–5, data missing for one participant).

The children were assessed between October 2019 and May 2021. For some children, the assessment process was interrupted due to the COVID-19 pandemic, and the data collection continued at a later time point. If the break was two months or longer (n = 6), a new test age was calculated for the scoring of the subtests administered after the interval. The break for these individuals was on average 4.33 months (range 2–10 months). Further, due to the pandemic, the test administrator wore a face mask during the assessments of 27 children (20.1%). To assess possible effects of mask use on cognitive test scores, these 27 children were compared to a control group matched on age, sex, highest parental education, and bilingualism, for which the administrator did not wear a mask during the assessment (n = 27). No significant group differences were found for the subtests (t = −1.09–0.85, p = .279–.863, d = −0.30–0.23, mean differences: −0.89–0.59), indexes (t = −0.95–0.24, p = .346–.977, d = −0.26–0.07, mean differences: −3.63–0.78), or the FSIQ (t = −0.60, p = .555, d = −0.16, mean difference = −2.11, with a higher score for assessments with a mask).

Ethical approval was granted in June 2019 by the Ethical Review Board in the Humanities and Social and Behavioral Sciences at the University of Helsinki. Approval for data collection was granted by the heads of the municipalities and all educational units. The parents as well as all children aged 15 or older gave written consent. The principles stated in the Declaration of Helsinki were followed throughout the study.

Data preparation and statistical analyses

Prior to analyses, missing data for the WISC-V subtests were imputed for the complete WISC-V data (N = 197) using Expectation Maximization estimation with age, sex, and other subtest scores included. Within the present data (N = 134), missing information was imputed on the item level in five subtests for 1–2 participants per subtest. In all, data was imputed for eight participants (6.0%), for one WISC-V subtest per participant. All final values corresponded to the expected ranges.

Maternal and paternal education levels were transformed from six- to three-point scales. This was done in order to harmonize the years of education received regardless of type of school attended, to merge the data due to few observations in some of the education groups, and to construct a variable feasible for analyses. For the statistical analyses, maternal and paternal education levels were further combined into one variable, consisting of the highest educational level of the parents. The final parental education variable consisted of the following three categories: Low education level included a three-year upper secondary or vocational education, or lower; Medium education level included a three-year University of Applied Sciences or a Bachelor’s degree; and High education level included a Master’s degree, or higher.
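The combination of maternal and paternal education into a single "highest level" variable can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the category labels follow the three-point scale described above, while the function name and the handling of a missing value for one parent (falling back on the other parent's level) are hypothetical and not described in the text.

```python
# Ordered three-point scale as described in the text.
EDUCATION_RANK = {"low": 0, "medium": 1, "high": 2}


def highest_parental_education(mother, father):
    """Return the higher of the two parents' education levels.

    ASSUMPTION: a parent with missing data is passed as None and ignored;
    if both are missing, the combined variable is also missing (None).
    """
    known = [level for level in (mother, father) if level is not None]
    if not known:
        return None
    return max(known, key=EDUCATION_RANK.__getitem__)


print(highest_parental_education("low", "high"))     # high
print(highest_parental_education("medium", None))    # medium
```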

Based on graphical inspection, the assumption of normality was met for all variables of interest. In addition, skewness and kurtosis values were calculated. Recommended thresholds for skewness are < 2 and for kurtosis < 7 (for a discussion, see West et al., 1995). To explore the distribution of the present sample, the sample variance was compared to the variance implied by the expected SD of 3 for subtests and 15 for indexes, using one-sample χ2 tests with Bonferroni correction. The analyses were first conducted for the five indexes together with FSIQ and then for the 13 subtest scores using the R EnvStats-package (Millard, 2013).
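The one-sample variance test described above rests on the statistic χ2 = (n − 1)s²/σ0², with n − 1 degrees of freedom, where s is the sample SD and σ0 the expected SD. The Python sketch below only illustrates that formula; the study itself used the R EnvStats package, and the function names here are hypothetical. Treating the five indexes plus the FSIQ as one family of six tests for the Bonferroni correction is an assumption based on the order of analyses described above. P-values are omitted because they require a χ2 distribution function outside the Python standard library.

```python
def variance_test_statistic(sample_sd, n, expected_sd):
    """Chi-square statistic for H0: population SD equals expected_sd."""
    df = n - 1
    statistic = df * sample_sd ** 2 / expected_sd ** 2
    return statistic, df


def bonferroni_alpha(alpha, n_tests):
    """Per-test significance level after Bonferroni correction."""
    return alpha / n_tests


# FSIQ values as reported in the Results section: SD = 12.45, N = 134.
fsiq_stat, fsiq_df = variance_test_statistic(12.45, 134, 15)
print(f"chi2({fsiq_df}) = {fsiq_stat:.2f}")

# ASSUMPTION: six index-level scores form one Bonferroni family.
print(f"adjusted alpha = {bonferroni_alpha(0.05, 6):.4f}")
```

With the rounded SD, the statistic is about 91.6, close to the reported χ2 (133) = 91.55; the small gap presumably reflects rounding of the SD.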

A confirmatory factor analysis of invariance was undertaken to evaluate if the constructs measured by the Swedish WISC-V were equivalent (invariant) between the standardization data and this FinSwed Study sample (N = 134; for a discussion regarding measurement invariance, see e.g., Pauls et al., 2020; Rodríguez-Cancino & Concha-Salgado, 2023). The analyses were undertaken with the five primary index scores and 13 subtest scaled scores using the lavaan R-package (Rosseel, 2012). For the standardization sample (N = 660), intercorrelations between subtest scaled scores and index scores presented in the Swedish manual were used (Table 2.8 in Wechsler, 2016). First, in order to assess if the structure of the Swedish WISC-V was stable between groups, no restrictions were applied to the parameters of the models (configural invariance). Then, the factor loadings were restricted, and the fit was compared with that of the unconstrained model (metric invariance). Lastly, restrictions were also applied to the intercepts, and the fit was compared with the model with restricted loadings (scalar invariance). To establish invariance in the model comparisons, the χ2-difference test, the comparative fit index (CFI), the root mean square error of approximation (RMSEA), and the standardized root mean square residual (SRMR) were considered. The following cutoffs were used, as suggested for small sample sizes: a decrease of ≥ 0.005 in CFI and an increase of ≥ 0.010 in RMSEA indicated non-invariance. For SRMR, an increase of ≥ 0.025 for metric invariance and ≥ 0.010 for scalar invariance indicated non-invariance (Chen, 2007).
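The cutoff rules above can be written out as a small decision helper. The Python sketch below is illustrative only: the models themselves were fitted in lavaan, and the fit values used here are invented placeholders, not results from the study. The class and function names are hypothetical. Because the text does not say how the three indices were weighed against each other, the sketch reports one flag per index rather than a single verdict.

```python
from dataclasses import dataclass


@dataclass
class FitIndices:
    cfi: float
    rmsea: float
    srmr: float


def non_invariance_flags(less_constrained, more_constrained, step):
    """Flag each index whose change meets the cutoff for non-invariance.

    step is "metric" (loadings constrained) or "scalar" (intercepts too).
    """
    srmr_cutoff = {"metric": 0.025, "scalar": 0.010}[step]
    delta_cfi = more_constrained.cfi - less_constrained.cfi
    delta_rmsea = more_constrained.rmsea - less_constrained.rmsea
    delta_srmr = more_constrained.srmr - less_constrained.srmr
    return {
        "cfi": delta_cfi <= -0.005,      # CFI dropped by 0.005 or more
        "rmsea": delta_rmsea >= 0.010,   # RMSEA rose by 0.010 or more
        "srmr": delta_srmr >= srmr_cutoff,
    }


# HYPOTHETICAL fit values, for illustration only.
configural = FitIndices(cfi=0.960, rmsea=0.050, srmr=0.040)
metric = FitIndices(cfi=0.958, rmsea=0.052, srmr=0.050)
scalar = FitIndices(cfi=0.940, rmsea=0.065, srmr=0.055)

print(non_invariance_flags(configural, metric, "metric"))
print(non_invariance_flags(metric, scalar, "scalar"))
```

In this made-up example no index is flagged at the metric step, while the CFI and RMSEA changes are flagged at the scalar step.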

For the main analyses, the performance of the Finland-Swedish sample on the Swedish WISC-V was analyzed using one-sample t-tests with Bonferroni corrections. The scores of the sample were compared to the mean standard score of 10 or 100 for subtests and indexes, respectively, using the R EnvStats package (Millard, 2013). Associations between the FSIQ/index scores and background factors were analyzed with linear multiple regression analyses. The background factors included as independent variables were age at assessment, sex, bilingualism, and parental education level.

In the Information subtest, two questions were asked for each of items 15 and 29, which have culture-specific factual content: one modified question regarding Finland, as well as the original question regarding Sweden. For copyright reasons, the specific questions cannot be published here. The items were scored with 1 (correct) or 0 (not correct). In the main analyses, the total score of the subtest was based on the question regarding Finland. However, as a follow-up analysis, the children’s performance on the Finnish vs. the Swedish questions was compared using exact χ2-tests.

Statistical analyses were completed using IBM SPSS Statistics version 29 and the R software version 4.0.5 (R Core Team, 2021). All tests of significance were two-sided (p < .05). R2 and Cohen’s d served as indicators of effect size. The cutoff values for Cohen’s d were 0.20 (small effect), 0.50 (medium effect), and 0.80 (large effect; Cohen, 1988).
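As an illustration of how the one-sample t statistic and Cohen’s d relate to the cutoffs above, the sketch below uses the Similarities values reported later in the article (M = 11.37, SD = 2.25, N = 134) against the normative mean of 10. It assumes that d is computed with the sample SD as the standardizer, which the text does not specify; the function names are hypothetical.

```python
import math


def one_sample_t(mean, sd, n, mu0):
    """t statistic for a one-sample t-test of the sample mean against mu0."""
    return (mean - mu0) / (sd / math.sqrt(n))


def cohens_d(mean, sd, mu0):
    """Cohen's d for one sample; ASSUMES the sample SD is the standardizer."""
    return (mean - mu0) / sd


def label_effect(d):
    """Label an effect size using the Cohen (1988) cutoffs quoted in the text."""
    size = abs(d)
    if size >= 0.80:
        return "large"
    if size >= 0.50:
        return "medium"
    if size >= 0.20:
        return "small"
    return "below small"


# Similarities subtest: M = 11.37, SD = 2.25, N = 134; normative mean = 10.
t_sim = one_sample_t(11.37, 2.25, 134, 10)
d_sim = cohens_d(11.37, 2.25, 10)
print(f"t({134 - 1}) = {t_sim:.2f}, d = {d_sim:.2f} ({label_effect(d_sim)})")
```

Under this assumption d is about 0.61, which falls in the medium range and agrees with the effect size the article reports for Similarities.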

Results

Distribution of the WISC-V scores and confirmatory factor analysis

Regarding skewness and kurtosis, all values were well within the accepted range. Skewness ranged from −0.15 to 0.58 (subtests) and from −0.17 to 0.39 (indexes). Kurtosis ranged from −0.88 to 1.17 (subtests) and from −0.46 to 0.78 (indexes). For FSIQ, skewness was −0.05 and kurtosis −0.26. The standard deviation of The FinSwed Study sample was also as expected, except for the FSIQ (SD = 12.45, χ2 (133) = 91.55, p = .028) and the subtest Similarities (SD = 2.25, χ2 (133) = 74.82, p < .001), where it was smaller than the expected 15 and 3, respectively.
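The χ2 values above can be checked from the reported standard deviations, since the one-sample variance test statistic is (n − 1) · s² / σ₀² with n − 1 degrees of freedom. The sketch below is a plain recomputation under that standard formula, not a reproduction of the EnvStats output; the function name is hypothetical and no p-values are computed.

```python
def variance_chi2(sample_sd, n, expected_sd):
    """One-sample chi-square statistic for a variance, with its df.

    statistic = (n - 1) * s^2 / sigma0^2, df = n - 1
    """
    df = n - 1
    return df * sample_sd ** 2 / expected_sd ** 2, df


# Reported SDs (rounded to two decimals in the text), N = 134.
chi2_fsiq, df_fsiq = variance_chi2(12.45, 134, 15)  # expected index SD = 15
chi2_sim, df_sim = variance_chi2(2.25, 134, 3)      # expected subtest SD = 3

print(f"FSIQ:         chi2({df_fsiq}) = {chi2_fsiq:.2f}")
print(f"Similarities: chi2({df_sim}) = {chi2_sim:.2f}")
```

The recomputed statistics (about 91.62 and 74.81) differ from the reported 91.55 and 74.82 only in the second decimal, which is what would be expected when the SDs entering the formula are rounded.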

The confirmatory factor analysis found evidence of equivalence for the unconstrained model, that is, configural invariance was reached. Further, both the model with restricted loadings and the unconstrained model fit the data (Table 3). However, in the model that also restricted the intercepts, the scalar invariance level was not reached, indicating that The FinSwed Study sample did not have the same mean values as the Scandinavian norms on the Swedish WISC-V. In sum, metric invariance was reached, but scalar invariance was not.

Table 3. WISC-V confirmatory analysis of invariance between the Scandinavian standardization data and The FinSwed Study sample (N = 134).

Performance on the WISC-V in relation to the population normative mean

The performance of the 134 Finland-Swedish children on the Swedish WISC-V, when scaled using the Scandinavian norms, are presented in . In the one-sample t-tests with Bonferroni corrections, the Finland-Swedish children performed significantly higher than the normative mean on the subtests Similarities, Block Design, Visual Puzzles, Figure Weights, Arithmetic, and Picture Span, but lower than expected on the Vocabulary subtest (see ). Consequently, the performance was significantly higher than the Scandinavian normative mean on the indexes Visual Spatial Index (VSI), Fluid Reasoning Index (FRI), and Working Memory Index (WMI) as well as on the FSIQ. Effect sizes were small for all significant measures, except for the subtest Similarities, which had a medium effect size.

Relationship with background variables

The relationship between performance on the Swedish WISC-V indexes and background variables was investigated using multiple linear regression analyses. The regression analyses were significant for the VCI (F (5, 133) = 4.29, p = .001), Processing Speed Index (PSI; F (5, 133) = 3.88, p = .003), and FSIQ (F (5, 133) = 3.60, p = .004), but not for the VSI, FRI, or WMI (p = .198–.394). The significant regression analyses are presented in Table 4. Even when controlling for other background variables, low parental education was the most important explanatory factor for lower test scores, significantly so for the VCI and the FSIQ (see ). Older children received significantly higher scores than younger children on the VCI and the FSIQ (VCI being 95.77 vs. 103.04 and FSIQ being 96.96 vs. 104.31 for the youngest and oldest groups, respectively). The theoretical mean score increase (B-value) was 0.77 (VCI) and 0.71 (FSIQ) points per year. Boys had significantly lower scores on the PSI compared to girls. Bilingualism was not significantly associated with the indexes. In total, 8.9–11.0% of the variance in VCI, PSI, and FSIQ was explained by the independent variables chosen for the regression analyses.
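To make the per-year B-values concrete, the sketch below multiplies them by the age span of the sample and compares the result with the reported difference between the youngest and oldest groups. It assumes a 10-year span (ages 6 to 16); the exact composition of the youngest and oldest groups is not stated here, so the comparison is only a rough plausibility check.

```python
AGE_SPAN_YEARS = 16 - 6  # ASSUMPTION: full 6-16 age range of the sample

# B-values and group means as reported in the paragraph above.
reported = {
    "VCI": {"b": 0.77, "youngest": 95.77, "oldest": 103.04},
    "FSIQ": {"b": 0.71, "youngest": 96.96, "oldest": 104.31},
}

summary = {}
for index, r in reported.items():
    predicted = r["b"] * AGE_SPAN_YEARS   # gain implied by the regression slope
    observed = r["oldest"] - r["youngest"]  # gap between reported group means
    summary[index] = (round(predicted, 2), round(observed, 2))
    print(f"{index}: slope implies {predicted:.2f} points, "
          f"group means differ by {observed:.2f} points")
```

Both comparisons land at roughly seven points, so the slopes and the group means describe an age effect of similar size.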

Table 4. Significant relationships between background variables and performance of The FinSwed Study sample (N = 134).

Country-specific questions

For the two items with country-specific content in the Information subtest, the proportions of correct responses to the questions regarding Finland and the questions regarding Sweden were compared. Finland-Swedish children were significantly better at answering the questions regarding Finland, both for item 15 (28.6% vs. 10.7%; χ2 (1, n = 112) = 19.75, p < .001) and for item 29 (55.6% vs. 33.3%; χ2 (1, n = 36) = 5.63, p = .032).

Discussion

In the present study we investigated the generalizability of the Swedish version of WISC-V to the minority group of Swedish-speakers in Finland. The typically developing 6–16-year-old Finland-Swedes in this sample were Swedish-speakers or simultaneous Swedish-Finnish bilinguals, and attended school or pre-primary education in Swedish, with the Finnish curriculum. The Swedish spoken in Finland differs to some extent from the Swedish spoken in Sweden (e.g., Sjöholm, 2004), but the Wechsler scales have not been standardized for the specific cultural group of Finland-Swedes. Therefore, Swedish test versions with Scandinavian normative data are often used with Finland-Swedes (Rosenqvist et al., 2022), but information on the validity of the test in this cultural minority group has been lacking.

The results showed that the Finland-Swedish children had significantly higher performance on the Swedish WISC-V as compared to the Scandinavian normative mean. However, there was variation in the performance, particularly between verbal subtests. The VCI and the FSIQ showed most significant relationships with background variables, as higher scores were associated with higher parental education level as well as with increasing age. The findings of this study have implications for clinical assessments with Finland-Swedish children, but also more generally for the cultural neurocognitive field. First, we discuss the primary findings of the study. Then, we provide some possible explanations for the results.

Overall differences in the Finland-Swedish sample

The Finland-Swedish sample in this study had a mean FSIQ three points higher than the Scandinavian normative mean, a difference which was statistically significant with a small effect size. This difference was somewhat in line with WISC FSIQ mean scores of ca. 105 reported in similar previous cultural comparison studies (Gudmundsson et al., 2005–2006; Reverte et al., 2015), although smaller significant differences have also been reported (FSIQ 101 in Babcock et al., 2018). The finding also corresponds with results found among younger children in The FinSwed Study: The FSIQ of 5–6-year-old Finland-Swedes was 105 on the WPPSI-IV, when parental education level was matched to census (Salonen et al., 2023b). Small effect sizes for significant measures have also been reported in other cultural comparison studies using different versions of the WISC (Babcock et al., 2018; Grob et al., 2008). In fact, small effect sizes might even be expected when studying the relationships between psychological measures (for a discussion, see Bialystok, 2021).

The test scores of the present sample were normally distributed and had the expected variation, with two exceptions. The standard deviation was significantly smaller for the FSIQ and for the subtest Similarities compared to the normative sample. The difference in standard deviations may relate to the fact that the Scandinavian standardization included a small group of gifted children and children with mild intellectual disability (Wechsler, 2016), which increases variation. Similar samples were not assessed within the scope of the present study.

Further, the factor structure of the sample was analyzed in a confirmatory factor analysis. The results confirmed that the subtests have the same factor loadings, i.e., load on the same indexes in The FinSwed Study sample as in the Scandinavian normative data. However, the results also indicated that the groups did not have the same mean values on the WISC-V, which implied that the norms may not be directly applied to the Finland-Swedish minority. As such, this further warranted the present investigation of the specific performance differences between the Finland-Swedish sample and the Scandinavian norms.

Verbal functions

The Finland-Swedish children’s performance on the verbal subtest Similarities was significantly higher than the norms, whereas no significant differences emerged for the verbal subtests Information or Comprehension. Vocabulary was, on the other hand, significantly lower than the normative mean, and it was the subtest generating the lowest scaled score in the present study, landing on average slightly below the scaled score 9. The relative difficulty of this subtest has also been a clinical observation among Finland-Swedish psychologists (Rosenqvist et al., 2022). It is possible that the choice and order of words in the Vocabulary subtest do not reflect the word frequency or the order of difficulty for Finland-Swedish children. Further, the VCI was not significantly different from the normative mean – with a mean of 101 it was in fact the index closest to the normative mean. However, the VCI is composed of the subtests Similarities (M = 11.37) and Vocabulary (M = 8.86), and the high and low scores on these two subtests average out, which explains the nonsignificant finding.

The present results differ somewhat from those of a previous study conducted with 10-year-old Finland-Swedish children, where a translation of the Finnish version of WISC-IV seemed to produce lower scores than expected based on the norms in all verbal subtests and the VCI (Delatte, 2015). Language tasks are often viewed as being high on linguistic demand and cultural loading (see Flanagan et al., 2013) and it is possible that the lower scores in the previous study also reflected the difficulty of translating verbal tests. Further, in the present study, of all indexes, the VCI had the most significant relationships with the studied background variables: There were significant associations with both parental education level and age. In all, these findings indicate that different cultural factors might influence linguistic cognitive performance when tasks are used across countries and cultures, even when the children are assessed in the language the test is constructed in. However, this may not necessarily apply to all cultural groups and minorities. For instance, when German and Swiss children were compared on the WISC-IV, the only significant differences were in a visual task and index (Grob et al., 2008). In the study by Grob et al. (2008), the groups that were compared were matched on age, sex, and parental education, which was not the case in the present study, thus perhaps partly accounting for the differences between the results.

In the Information subtest, some country-specific questions are included. We changed these questions to concern Finland, as was also done regarding Norway, Denmark, and Sweden in the Scandinavian standardization (Wechsler, 2016), as well as in other standardizations, such as for the Canadian test version (Cormier et al., 2016). However, the Sweden-specific questions were also asked for follow-up comparison. The Finland-Swedish children were significantly more knowledgeable when asked questions regarding Finland than regarding Sweden. Only the questions about Finland were included when calculating the subtest scaled score, and as the Information subtest in our study was close to the Scandinavian mean of 10, the results support using questions concerning the country at hand. Thus, on a general level, the findings imply that adaptations to the specific cultural context are called for.

Non-verbal functions

The indexes VSI and FRI both differed significantly from the test norms. The Finland-Swedish children performed on average five and four index points, respectively, above the Scandinavian mean. Additionally, the Finland-Swedish children performed significantly higher than the normative mean on the subtests Block Design, Visual Puzzles, Figure Weights, and Arithmetic. Only the Matrix Reasoning subtest did not significantly differ from the normative mean. Previously, differences have also been found in visual tasks between other countries where children speaking the same language were compared. For instance, Canadian children performed significantly higher than U.S. children on the subtests Block Design and Visual Puzzles, as well as on the corresponding WISC-V index VSI (Babcock et al., 2018). However, while there was a significant difference for the subtest Arithmetic in their study, the groups did not significantly differ on the FRI or the fluid subtests Matrix Reasoning and Figure Weights (Babcock et al., 2018). Further, Swiss children outperformed German children on the WISC-IV Block Design subtest and the Perceptual Reasoning Index (Grob et al., 2008). Combined, the present and previous findings thus indicate that performance differences on visual tasks can emerge between cultures. Still, previous findings vary, as cultural differences have not been found on all visual tasks (e.g., Babcock et al., 2018; Grob et al., 2008).

Working memory and processing speed

The WMI of the Finland-Swedish children was significantly higher than the Scandinavian norms, by around five points. When looking at the subtests included in the WMI, the Finland-Swedish children’s performance on Picture Span was significantly higher than the normative mean, while the performance difference on Digit Span was nonsignificant. In previous studies, no significant differences were reported for the WMI or its included subtests between Canadian and U.S. children on the WISC-V (Babcock et al., 2018) or between Swiss and German children on the WISC-IV (Grob et al., 2008).

On the PSI, the Finland-Swedish children’s performance did not significantly differ from the Scandinavian normative mean. Previously, culture-related differences in processing speed have been shown, often with North Americans outperforming individuals from other countries on timed tasks (e.g., Agranovich et al., 2011; Eizaguirre et al., 2020). Still, the performance on the PSI in Wechsler Adult Intelligence Scale-IV has been shown to be fairly similar between Finnish and Scandinavian adults (Roivainen, 2019). Perhaps such differences relating to speed are not as evident between the Finnish and Scandinavian cultures.

Relationships with background factors

The difference between the Finland-Swedish children’s performance and the Scandinavian norms significantly increased with age for the VCI and the FSIQ, despite using scores scaled on age and while controlling for other background variables. The youngest children performed below the normative mean, and the scores increased by 0.77 and 0.71 points per year, respectively, in the regression analyses. Previous cross-cultural studies with different WISC versions have generally not investigated possible age effects. However, when comparing Australian children to the U.S. norms on several tests, Kamieniecki and Lynd-Stevenson (2002) reported scores separately for different age groups. They found a significant correlation between age and age-standardized test scores, with differences in scores between the Australian and U.S. children decreasing with age. Therefore, age effects may differ depending on the groups assessed. As such, the present surprising finding indicates that cultural differences may be larger or smaller depending on the age group assessed and that age effects should be investigated in future studies. From a clinical perspective, this age effect for the Finland-Swedes on the Swedish WISC-V is important to consider in clinical assessments, both with the youngest age groups that received scores below the mean and the oldest age groups that scored above the mean.

High parental education level was an important predictor of the Finland-Swedish children’s performance, as it related to significantly higher scores on the VCI and the FSIQ. The association between parental education and the FSIQ and indexes of the WISC-V has been established (Hernández et al., 2017; Weiss, Locke, et al., 2019). The present findings also corresponded with previous European studies with the WISC-III, where a strong relationship between parental education and especially verbal reasoning has been found (Cianci et al., 2013; Eilertsen et al., 2016).

Sex differences were found in only one index, the PSI, where girls outperformed boys. This was in line with a recent meta-analysis by Giofrè et al. (2022), which compared sex differences on different versions of the WISC in typically developing children. In general, they found fewer sex differences for newer test versions as compared to older versions, and the largest differences were found in processing speed, favoring girls (Giofrè et al., 2022).

Of the participants in the present study, 44% were simultaneous bilinguals, speaking both Swedish and Finnish at home. While no official statistics regarding the number of bilinguals in Finland exist, this corresponds to the estimated percentage in the population (Saarela, 2021) as well as to another recently collected randomized cohort of Finland-Swedes (Vataja et al., 2022). In the present study, learning two languages simultaneously was related to lower scores on the VCI, but not significantly. This was in line with a previous study within the Finland-Swedish context, in which no significant association between language status and verbal scores on the WISC-IV was found in 7- and 10–11-year-old children (Karlsson et al., 2015). However, in other Finland-Swedish studies, bilingualism has been shown to negatively relate to some of the assessed tasks of vocabulary and expressive language in younger children and in adults (Korkman et al., 2012; Korpinen et al., 2023; Leinonen & Tandefelt, 2007; Vataja et al., 2022; Westman et al., 2008). The relationship between bilingualism and performance on the Swedish WISC-V will be further investigated within The FinSwed Study.

Possible explanations for the performance differences

Sample selection

In order to represent the minority of Finland-Swedes specifically, only children who were Swedish speakers or simultaneous bilinguals (Swedish-Finnish speakers) were included in this study. This differed from the Scandinavian WISC-V standardization, which included children from families with a home language other than the majority language, in a proportion corresponding to the Scandinavian census (13%; Wechsler, 2016). The present exclusion criteria were constructed to match the criteria used in the Scandinavian standardization as closely as possible. However, due to some differences between the countries, the criteria were not completely identical. In the Scandinavian standardization, children with formal diagnoses were excluded. There are differences in diagnostic procedures between the Scandinavian countries and Finland: In Finland, only medical doctors can give diagnoses, which is not the case in the Scandinavian countries. This was believed to lead to a higher number of children with formal diagnoses in Scandinavia than in Finland. Thus, as an attempt to match the exclusion criteria used in the Scandinavian standardization, all children who were enrolled in formal support in school were excluded in the present study. The numbers of children excluded in this study versus the standardization cannot be compared since this data is not available for the standardization sample. However, compared to the number of students enrolled in formal support in Finland-Swedish schools (17.1%; Official Statistics of Finland, 2019), fewer students than expected were excluded due to the need of support in the present sample (11.7%). While it is possible that the present exclusion criteria contributed to a larger number of children with highly educated parents participating in the study, it is also possible that highly educated parents were more inclined to enroll their children in the study than parents with a lower education. However, in order to reduce the effects of the overrepresentation of parents with a higher university education, cases within the initial sample were excluded until the parental education level of the data matched the population census for Finland-Swedes.

Cultural and educational factors

The performance difference between the Finland-Swedes and the Scandinavian norms may also reflect social and cultural factors in the population of Finland-Swedes. The Finland-Swedes have a generally high educational level, somewhat higher than the Finnish-speaking population and the Scandinavian countries (Eurostat, 2022; Official Statistics of Finland, 2017; Saarela, 2021). The group is fairly homogeneous and the social and cultural capital may lead to proportionally more opportunities for higher education, a relatively low unemployment rate (Saarela, 2021), and opportunity for participation in cultural events. Such factors relating to society, wealth, and level of and attitudes toward education in the society may support the development of cognitive skills (see also discussion in Rindermann, 2007).

Additionally, there are some performance differences between Finland and the Scandinavian countries in the PISA assessments (Leino et al., 2019). As we hypothesized, such educational factors may also at least to some extent account for the present findings, as student assessment results relate to performance on intelligence tests (Rindermann, 2007). In the PISA assessments, Finland-Swedish children performed higher than their Scandinavian peers in mathematics and science (Leino et al., 2019, see also ). The PISA science tasks measure the ability to reflect and engage in scientific discourse (OECD, 2019), and abstract reasoning skills in addition to mathematical reasoning are also needed in the indexes FRI and VSI, and in the subtests included in these indexes, as well as in Similarities and Arithmetic (Wechsler et al., 2014). As the Finland-Swedish children performed significantly higher than the Scandinavian norms in these indexes and subtests, it is possible that differences in school systems and educational factors between the countries explain some of the performance differences found – perhaps abstract reasoning and other skills needed for these tasks are emphasized in the Finnish curriculum.

Also, it should be noted that the age effect observed in the VCI and the FSIQ, with older children performing significantly higher than younger children, was significant even when other background factors were controlled for. In fact, one possible explanation for the increase in test scores with age could be that it is influenced by the high-quality educational system, the positive effect of which possibly increases with years spent in the educational system.

The Flynn effect

The higher performance of the Finland-Swedes may also partly be accounted for by the Flynn effect, which is defined as the annual rise of IQ scores in the population (see e.g., Pietschnig & Voracek, 2015; Trahan et al., 2014). The Scandinavian norms were collected 4–6 years prior to collection of the present data (2015–2016 vs. 2019–2021). In a meta-analysis of English-speaking samples, performance on modern tests (normed since 1972), including the Wechsler scales, has been shown to increase by close to three IQ points per decade (Trahan et al., 2014). A similar increase was reported in another meta-analysis, spanning over a century and consisting of samples from several countries and continents. The increase was stronger for visual reasoning tasks than for tasks measuring factual and verbal knowledge (Pietschnig & Voracek, 2015). Further, when comparing the performance of two matched samples collected 12 years apart on the same test – the WISC-IV – the gain for FSIQ was 0.31 points per year, but the Flynn effect varied for the different composite scores, with the largest difference being in the Perceptual Reasoning Index, consisting of non-verbal subtests (Weiss et al., 2016). Additionally, the Flynn effect has been shown to be strong for the Similarities subtest (Flynn & Weiss, 2007; Weiss et al., 2016). While Trahan et al. (2014) reported no age effects, Pietschnig and Voracek (2015) reported the Flynn effect to be stronger for adults than for children. In fact, the effect has been suggested to vary, for instance, for different countries or areas, time span, age groups, domains of intelligence, as well as ability level or sex of participants, sometimes plateauing or decreasing, and to be affected by the difficulty level of task items changing over time (e.g., Gonthier & Grégoire, 2022; Lazaridis et al., 2022; Pietschnig & Voracek, 2015; Platt et al., 2019; Sundet et al., 2004; Weber et al., 2017). The Flynn effect has, to our knowledge, not been investigated with Finnish or Scandinavian child samples, but it is possible that the effect partly explains the higher scores of the present sample as compared to the normative mean, particularly in the Similarities subtest, the VSI, and the FRI. In fact, in the present study, the highest effect size (medium) was seen in the Similarities subtest.
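A rough back-of-the-envelope calculation shows how much of the observed FSIQ difference a Flynn effect of the cited size could account for. The sketch below is an illustration only: it assumes that the WISC-IV rate of 0.31 points per year applies unchanged to this population and test, which the text itself flags as uncertain, and it uses the 4–6-year gap and the three-point FSIQ difference stated in the article.

```python
def expected_flynn_gain(points_per_year, years_since_norming):
    """Linear projection of a Flynn-type gain; a simplification."""
    return points_per_year * years_since_norming


RATE_PER_YEAR = 0.31     # WISC-IV FSIQ gain per year cited in the text
OBSERVED_FSIQ_DIFF = 3   # FinSwed mean FSIQ of 103 vs. normative mean of 100

gain_low = expected_flynn_gain(RATE_PER_YEAR, 4)   # shortest gap between samples
gain_high = expected_flynn_gain(RATE_PER_YEAR, 6)  # longest gap between samples

print(f"Projected gain: {gain_low:.2f} to {gain_high:.2f} FSIQ points")
print(f"Observed difference: {OBSERVED_FSIQ_DIFF} FSIQ points")
```

Under these assumptions the projected gain is between one and two points, less than the observed three, which is consistent with the Flynn effect being a partial rather than a complete explanation.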

Conclusions

The present study investigated the generalizability of the Swedish WISC-V with Scandinavian norms to the Finland-Swedish minority group (Swedish-speakers in Finland). Since this minority generally is highly educated and has shown good educational outcomes in international comparisons, favorable cognitive outcomes were expected. While a similar factor structure in the Finland-Swedish population as in the Scandinavian test norms was confirmed, the factor analysis also suggested that the standard scores are not directly generalizable to the minority group. Indeed, the Swedish WISC-V generated higher scores for 6–16-year-old Swedish-speaking Finnish children overall, with the FSIQ and most indexes landing on 102–105. The performance difference was especially evident in the nonverbal and working memory indexes but varied in the verbal subtests. Surprisingly, the difference from the Scandinavian norms increased with age in the VCI and the FSIQ. The present findings indicated that cognitive difficulties may be overlooked due to the generally high test performance among the Finland-Swedes, particularly among older children.

The specific information as to how the performance differs between the norms and the minority group presented in this study can directly be used in clinical practice. Based on these findings, national recommendations have been constructed and published for Finland-Swedish clinical assessments (Haavisto et al., 2023). Scaling test results using the Scandinavian norms found in the test manual and then comparing them to the described research findings allows for more accurate and fair clinical assessments for Finland-Swedish children.

In all, the results show that performance on cognitive tests may differ even when the compared groups have the same language and come from neighboring countries with partly shared cultures. For many smaller cultural groups, developing and adapting test versions is often not feasible. Therefore, it is important to conduct comparative studies and for clinicians to apply such results in their practice. In such studies, making cultural and linguistic adaptations to the test materials used is of importance. Further, the significant age differences in VCI and FSIQ call for possible age differences to be considered in future cultural studies. In conclusion, cultural groups lacking test versions and normative data are advised to assess the generalizability of the available measures and to construct clinical recommendations in order to establish test fairness for the specific cultural group.

Acknowledgments

We thank all children and families for participating in the assessments as well as all clinical psychologists and research assistants for conducting the assessments. We are also grateful to the municipalities, schools, and pre-primary facilities for opening their doors and thereby enabling the data collection for The FinSwed Study.

We thank Lecturers of Psychometrics Jari Lipsanen and Markku Kilpeläinen for helping with statistical analyses.

Disclosure statement

Minor cultural and language-related modifications to some WISC-V questions were made in the study. These were made in accordance with Statement of Work No. 296412-2 to Master License Agreement No. LSR–111089 with NCS Pearson. The authors have no conflicts of interest to declare.

Additional information

Funding

The FinSwed Study was funded by The Swedish Cultural Foundation in Finland, the association Svenska folkskolans vänner, and the foundations Stiftelsen Brita Maria Renlunds minne sr, and Oskar Öflunds stiftelse.

References

  • Agranovich, A. V., Panter, A. T., Puente, A. E., & Touradji, P. (2011). The culture of time in neuropsychological assessment: Exploring the effects of culture-specific time attitudes on timed test performance in Russian and American samples. Journal of the International Neuropsychological Society, 17(4), 692–701. https://doi.org/10.1017/S1355617711000592
  • American Educational Research Association, American Psychological Association, & National Council on Measurement in Education. (2014). The standards for educational and psychological testing. www.apa.org/science/programs/testing/standards
  • American Psychological Association. (2017). Multicultural guidelines: An ecological approach to context, identity, and intersectionality. https://www.apa.org/about/policy/multicultural-guidelines
  • Babcock, S. E., Miller, J. L., Saklofske, D. H., & Zhu, J. (2018). WISC-V Canadian norms: Relevance and use in the assessment of Canadian children. Canadian Journal of Behavioural Science/Revue Canadienne des Sciences du Comportement, 50(2), 97. https://doi.org/10.1037/cbs0000096
  • Benson, N. F., Floyd, R. G., Kranzler, J. H., Eckert, T. L., Fefer, S. A., & Morgan, G. B. (2019). Test use and assessment practices of school psychologists in the United States: Findings from the 2017 national survey. Journal of School Psychology, 72, 29–48. https://doi.org/10.1016/j.jsp.2018.12.004
  • Bialystok, E. (2021). Bilingualism as a slice of Swiss cheese. Frontiers in Psychology, 12, 5219. https://doi.org/10.3389/fpsyg.2021.769323
  • Byrne, B. M. (2016). Adaptation of assessment scales in cross-national research: Issues, guidelines, and caveats. International Perspectives in Psychology: Research, Practice, Consultation, 5(1), 51. https://doi.org/10.1037/ipp0000042
  • Chen, F. F. (2007). Sensitivity of goodness of fit indexes to lack of measurement invariance. Structural Equation Modeling: A Multidisciplinary Journal, 14(3), 464–504. https://doi.org/10.1080/10705510701301834
  • Cianci, L., Orsini, A., Hulbert, S., & Pezzuti, L. (2013). The influence of parents’ education in the Italian standardization sample of the WISC-III. Learning and Individual Differences, 28, 47–53. https://doi.org/10.1016/j.lindif.2013.09.009
  • Cohen, J. (1988). Statistical power analysis for the behavioral sciences (2nd ed.). Lawrence Erlbaum Associates, Inc.
  • Cormier, D. C., Kennedy, K. E., & Aquilina, A. M. (2016). Test review: Wechsler, D. (2014), "Wechsler intelligence scale for children: Canadian (WISC-V CDN)." Toronto, Ontario: Pearson Canada Assessment. Canadian Journal of School Psychology, 31(4), 322–334. https://doi.org/10.1177/0829573516648941
  • Delatte, S. (2015). An evaluation of an adapted version of WISC-IV for 10- and 11-year-old Swedish-speaking children in Finland [Master’s thesis, Åbo Akademi University]. ALMA - Search Portal for Åbo Akademi University Library, E-Resources and Archive Collections.
  • Eilertsen, T., Thorsen, A. L., Holm, S. E. H., Bøe, T., Sørensen, L., & Lundervold, A. J. (2016). Parental socioeconomic status and child intellectual functioning in a Norwegian sample. Scandinavian Journal of Psychology, 57(5), 399–405. https://doi.org/10.1111/sjop.12324
  • Eizaguirre, M. B., Vanotti, S. I., Aguayo Arelis, A., Rabago Barajas, B., Cores, E. V., Macías, M. A., Benedict, R. H., & Cáceres, F. (2020). Symbol digit modalities test-oral version: An analysis of culture influence on a processing speed test in Argentina, Mexico, and the USA. Developmental Neuropsychology, 45(3), 129–138. https://doi.org/10.1080/87565641.2020.1737699
  • Eurostat. (2022). Population by educational attainment level, sex and age (%) - main indicators. https://ec.europa.eu/eurostat/databrowser/view/EDAT_LFSE_03__custom_3991061/default/table
  • Evers, A., Muñiz, J., Bartram, D., Boben, D., Egeland, J., Fernández-Hermida, J. R., Frans, Ö., Gintiliené, G., Hagemeister, C., Halama, P., Iliescu, D., Jaworowska, A., Jiménez, P., Manthouli, M., Matesic, K., Schittekatte, M., Sümer, H. C., & Urbánek, T. (2012). Testing practices in the 21st century: Developments and European psychologists’ opinions. European Psychologist, 17(4), 300. https://doi.org/10.1027/1016-9040/a000102
  • Flanagan, D., Ortiz, S., & Alfonso, V. (2013). Cross-battery assessment of individuals from culturally and linguistically diverse backgrounds. In D. Flanagan, S. O. Ortiz, & V. C. Alfonso (Eds.), Essentials of cross-battery assessment (pp. 287–350). John Wiley & Sons.
  • Flynn, J. R., & Weiss, L. G. (2007). American IQ gains from 1932 to 2002: The WISC subtests and educational progress. International Journal of Testing, 7(2), 209–224. https://doi.org/10.1080/15305050701193587
  • Georgas, J., Van de Vijver, F., Weiss, L. G., & Saklofske, D. (Eds.). (2003). A cross-cultural analysis of the WISC-III. In Culture and children’s intelligence (pp. 277–313). Elsevier.
  • Gienger, C., Petermann, F., & Petermann, U. (2008). Wie stark hängen die HAWIK-IV-Befunde vom Bildungsstand der Eltern ab? [To what extent do the HAWIK-IV findings depend on the level of education of the parents?]. Kindheit und Entwicklung, 17(2), 90–98. https://doi.org/10.1026/0942-5403.17.2.90
  • Giofrè, D., Allen, K., Toffalini, E., & Caviola, S. (2022). The impasse on gender differences in intelligence: A meta-analysis on WISC batteries. Educational Psychology Review, 34(4), 1–26. https://doi.org/10.1007/s10648-022-09705-1
  • Gonthier, C., & Grégoire, J. (2022). Flynn effects are biased by differential item functioning over time: A test using overlapping items in Wechsler scales. Intelligence, 95, 101688. https://doi.org/10.1016/j.intell.2022.101688
  • Grob, A., Petermann, F., Lipsius, M., Costan-Dorigon, J., Petermann, U., & Daseking, M. (2008). Differences in Swiss and German children’s intelligence as measured by the HAWIK-IV. Swiss Journal of Psychology/Schweizerische Zeitschrift für Psychologie/Revue Suisse de Psychologie, 67(2), 113. https://doi.org/10.1024/1421-0185.67.2.113
  • Gudmundsson, E. (2009). Guidelines for translating and adapting psychological instruments. Nordic Psychology, 61(2), 29–45. https://doi.org/10.1027/1901-2276.61.2.29
  • Gudmundsson, E., Claessen, Á., Ásgeirsdóttir, B., & Þor Gudmundsson, B. (2005–2006). Notagildi erlendra stadla vid túlkun nidurstadna úr WISC-III á Íslandi [Applicability of foreign standards when interpreting WISC-III results in Iceland]. Sálfraediritid - Tímarit Sálfreadingafélags Íslands, 10–11, 41–49.
  • Haavisto, A., Slama, S., Rosenqvist, J., & Collaboration with the Test Committee at the Finnish Psychological Association. (2023). Riktlinjer för kognitiva utredningar av barn i Svenskfinland [Guidelines for cognitive assessments of children in Swedish-speaking Finland].
  • Hernández, A., Aguilar, C., Paradell, È., Muñoz, M. R., Vannier, L.-C., & Vallar, F. (2017). The effect of demographic variables on the assessment of cognitive ability. Psicothema, 29(4), 469–474. https://doi.org/10.7334/psicothema2017.33
  • Hiltunen, J., Ahonen, A., Hienonen, N., Kauppinen, H., Kotila, J., Lehtola, P., Leino, K., Lintuvuori, M., Nissinen, K., Puhakka, E., Sirén, M., Vainikainen, M.-P., & Vettenranta, J. (2023). PISA 22: ensituloksia [PISA 22: First results]. Opetus-ja kulttuuriministeriön julkaisuja, 49, 1–156. https://urn.fi/URN:ISBN:978-952-263-949-3
  • International Test Commission. (2001). International guidelines for test use. International Journal of Testing, 1(2), 93–114. https://doi.org/10.1207/S15327574IJT0102_1
  • International Test Commission. (2013). ITC guidelines on test use. Version 1.2. https://www.intestcom.org/files/guideline_test_use.pdf
  • Kamieniecki, G. W., & Lynd-Stevenson, R. M. (2002). Is it appropriate to use United States norms to assess the “intelligence” of Australian children? Australian Journal of Psychology, 54(2), 67–78. https://doi.org/10.1080/00049530210001706523
  • Karlsson, L. C., Soveri, A., Räsänen, P., Kärnä, A., Delatte, S., Lagerström, E., Mård, L., Steffansson, M., Lehtonen, M., Laine, M., & Morton, J. B. (2015). Bilingualism and performance on two widely used developmental neuropsychological test batteries. PLOS ONE, 10(4), e0125867. https://doi.org/10.1371/journal.pone.0125867
  • Korkman, M., Kirk, U., & Kemp, S. L. (2011). NEPSY-II. Svensk version. Manual [NEPSY-II. Swedish version. Manual]. Pearson.
  • Korkman, M., Stenroos, M., Mickos, A., Westman, M., Ekholm, P., & Byring, R. (2012). Does simultaneous bilingualism aggravate children’s specific language problems? Acta Paediatrica, 101(9), 946–952. https://doi.org/10.1111/j.1651-2227.2012.02733.x
  • Korpinen, E., Slama, S., Rosenqvist, J., & Haavisto, A. (2023). WPPSI-IV and NEPSY-II performance in mono- and bilingual 5–6-year-old children: Findings from the FinSwed study. Scandinavian Journal of Psychology, 64(4), 409–420. https://doi.org/10.1111/sjop.12895
  • Lazaridis, A., Vetter, M., & Pietschnig, J. (2022). Domain-specificity of Flynn effects in the CHC-model: Stratum II test score changes in Germanophone samples (1996–2018). Intelligence, 95, 101707. https://doi.org/10.1016/j.intell.2022.101707
  • Leino, K., Ahonen, A. K., Hienonen, N., Hiltunen, J., Lintuvuori, M., Lähteinen, S., Lämsä, J., Nissinen, K., Nissinen, V., & Puhakka, E. (2019). PISA 18: ensituloksia. Suomi parhaiden joukossa [PISA 18: First results. Finland among the best]. Opetus-ja kulttuuriministeriön julkaisuja, 40. https://jyx.jyu.fi/handle/123456789/75294
  • Leinonen, T., & Tandefelt, M. (2007). Evidence of language loss in progress? Mother-tongue proficiency among students in Finland and Sweden. International Journal of the Sociology of Language, 2007(187–188), 185–203. https://doi.org/10.1515/IJSL.2007.055
  • McGill, R. J., Ward, T. J., & Canivez, G. L. (2020). Use of translated and adapted versions of the WISC-V: Caveat emptor. School Psychology International, 41(3), 276–294. https://doi.org/10.1177/0143034320903790
  • Millard, S. P. (2013). EnvStats: An R package for environmental statistics. Springer. https://www.springer.com
  • Oakland, T., & Hu, S. (1991). Professionals who administer tests with children and youth: An international survey. Journal of Psychoeducational Assessment, 9(2), 108–120. https://doi.org/10.1177/073428299100900201
  • OECD. (2019). PISA 2018 results (Volume I): What students know and can do. Paris: PISA, OECD Publishing. https://doi.org/10.1787/5f07c754-en
  • Official Statistics of Finland. (2017). Educational structure of population [e-publication]. Retrieved October 28, 2022, from http://www.stat.fi/til/vkour/index_en.html
  • Official Statistics of Finland. (2019). Support for learning [e-publication]. Retrieved March 30, 2023, from https://statfin.stat.fi/PxWeb/pxweb/en/StatFin_Passiivi/StatFin_Passiivi__pop/statfinpas_pop_pxt_002_201900.px/table/tableViewLayout1/
  • Official Statistics of Finland. (2021). Population structure. Key figures on population by region, 1990-2021 [e-publication]. Retrieved November 21, 2022, from https://pxdata.stat.fi/PxWeb/pxweb/en/StatFin/StatFin__vaerak/statfin_vaerak_pxt_11ra.px/table/tableViewLayout1/
  • Pauls, F., Daseking, M., & Petermann, F. (2020). Measurement invariance across gender on the second-order five-factor model of the German Wechsler intelligence scale for children–Fifth Edition. Assessment, 27(8), 1836–1852. https://doi.org/10.1177/1073191119847762
  • Pietschnig, J., & Voracek, M. (2015). One century of global IQ gains: A formal meta-analysis of the Flynn effect (1909–2013). Perspectives on Psychological Science, 10(3), 282–306. https://doi.org/10.1177/1745691615577701
  • Platt, J. M., Keyes, K. M., McLaughlin, K. A., & Kaufman, A. S. (2019). The Flynn effect for fluid IQ may not generalize to all ages or ability levels: A population-based study of 10,000 US adolescents. Intelligence, 77, 101385. https://doi.org/10.1016/j.intell.2019.101385
  • Rabin, L. A., Paolillo, E., & Barr, W. B. (2016). Stability in test-usage practices of clinical neuropsychologists in the United States and Canada over a 10-year period: A follow-up survey of INS and NAN members. Archives of Clinical Neuropsychology, 31(3), 206–230. https://doi.org/10.1093/arclin/acw007
  • R Core Team. (2021). R: A language and environment for statistical computing. R Foundation for Statistical Computing. https://www.R-project.org/
  • Reverte, I., Golay, P., Favez, N., Rossier, J., & Lecerf, T. (2015). Testing for multigroup invariance of the WISC-IV structure across France and Switzerland: Standard and CHC models. Learning and Individual Differences, 40, 127–133. https://doi.org/10.1016/j.lindif.2015.03.015
  • Rindermann, H. (2007). The g‐factor of international cognitive ability comparisons: The homogeneity of results in PISA, TIMSS, PIRLS and IQ‐tests across nations. European Journal of Personality, 21(5), 667–706. https://doi.org/10.1002/per.634
  • Rodríguez-Cancino, M., & Concha-Salgado, A. (2023). WISC-V measurement invariance according to sex and age: Advancing the understanding of intergroup differences in cognitive performance. Journal of Intelligence, 11(9), 180. https://doi.org/10.3390/jintelligence11090180
  • Rodriguez, C. M., Treacy, L. A., Sowerby, P. J., & Murphy, L. E. (1998). Applicability of Australian adaptations of intelligence tests in New Zealand. New Zealand Journal of Psychology, 27(1), 5.
  • Roivainen, E. (2019). European and American WAIS IV norms: Cross‐national differences in perceptual reasoning, processing speed and working memory subtest scores. Scandinavian Journal of Psychology, 60(6), 513–519. https://doi.org/10.1111/sjop.12581
  • Rosenqvist, J., Slama, S., & Haavisto, A. (2022). Användningen av kognitiva test vid psykologiska utredningar av barn och unga på svenska i Finland – en översikt [Using cognitive tests for psychological evaluations of children in Swedish in Finland - an overview]. NMI Bulletin, 32, 74–93. https://bulletin.nmi.fi/wp-content/uploads/2022/09/NMI-Bulletin_sve_2022-E-74-93-rosenqvist.pdf
  • Rosseel, Y. (2012). lavaan: An R package for structural equation modeling. Journal of Statistical Software, 48(2), 1–36. https://doi.org/10.18637/jss.v048.i02
  • Saarela, J. (2021). Finlandssvenskarna 2021: En statistisk rapport [The Finland-Swedes 2021: A statistical report] (ISBN 978-952-9700-67-7). https://folktinget.fi/Site/Data/1597/Files/Finlandssvenskarna%202021_statistisk%20rapport_Folktinget_KLAR.pdf
  • Salonen, J., Slama, S., Haavisto, A., & Rosenqvist, J. (2023a). Comparison of WPPSI-IV and WISC-V cognitive profiles in 6–7-year-old Finland-Swedish children – findings from the FinSwed study. Child Neuropsychology, 29(5), 1–23. https://doi.org/10.1080/09297049.2022.2112163
  • Salonen, J., Slama, S., Haavisto, A., & Rosenqvist, J. (2023b). A comparison of WPPSI-IV performance between Finland-Swedish minority children and the Scandinavian test norms – findings from the FinSwed study. Journal of the International Neuropsychological Society, 29(10), 943–952. https://doi.org/10.1017/S1355617723000395
  • Sjöholm, K. (2004). Swedish, Finnish, English? Finland’s Swedes in a changing world. Journal of Curriculum Studies, 36(6), 637–644. https://doi.org/10.1080/0022027042000186600
  • Sundet, J. M., Barlaug, D. G., & Torjussen, T. M. (2004). The end of the Flynn effect?: A study of secular trends in mean intelligence test scores of Norwegian conscripts during half a century. Intelligence, 32(4), 349–362. https://doi.org/10.1016/S0160-2896(04)00052-2
  • Swedish National Agency for Education. (2023). PISA 2022: 15-åringars kunskaper i matematik, läsförståelse och naturvetenskap [PISA 2022: 15-year-olds’ knowledge in mathematics, reading comprehension, and science]. www.skolverket.se/publikationer
  • Trahan, L. H., Stuebing, K. K., Fletcher, J. M., & Hiscock, M. (2014). The Flynn effect: A meta-analysis. Psychological Bulletin, 140(5), 1332. https://doi.org/10.1037/a0037173
  • van de Vijver, F. J., Weiss, L. G., Saklofske, D. H., Batty, A., & Prifitera, A. (2019). A cross-cultural analysis of the WISC-V. In L. G. Weiss, D. H. Saklofske, J. A. Holdnack, & A. Prifitera (Eds.), WISC-V. Clinical use and interpretation (pp. 223–244). Academic Press.
  • Vataja, P., Lerkkanen, M.-K., Aro, M., Westerholm, J., Risberg, A.-K., & Salmi, P. (2022). The predictors of literacy skills among monolingual and bilingual Finnish–Swedish children during first grade. Scandinavian Journal of Educational Research, 66(6), 960–976. https://doi.org/10.1080/00313831.2021.1942191
  • Weber, D., Dekhtyar, S., & Herlitz, A. (2017). The Flynn effect in Europe–effects of sex and region. Intelligence, 60, 39–45. https://doi.org/10.1016/j.intell.2016.11.003
  • Wechsler, D. (2014a). Wechsler intelligence scale for children – Fifth Edition. NCS Pearson, Inc.
  • Wechsler, D. (2014b). WPPSI-IV. Svensk version [WPPSI-IV. Swedish version]. NCS Pearson, Inc.
  • Wechsler, D. (2016). WISC-V. Svensk version [WISC-V. Swedish version]. NCS Pearson Inc.
  • Wechsler, D., Raiford, S. E., & Holdnack, J. A. (2014). WISC-V. Technical and interpretive manual. NCS Pearson.
  • Weiss, L. G., Gregoire, J., & Zhu, J. (2016). Flaws in Flynn effect research with the Wechsler scales. Journal of Psychoeducational Assessment, 34(5), 411–420. https://doi.org/10.1177/0734282915621222
  • Weiss, L. G., Harris, J. G., Prifitera, A., Courville, T., Rolfhus, E., Saklofske, D. H., & Holdnack, J. A. (2006). WISC-IV interpretation in societal context. In L. G. Weiss, D. H. Saklofske, A. Prifitera, & J. A. Holdnack (Eds.), WISC-IV advanced clinical interpretation (pp. 1–57). Academic Press/Elsevier.
  • Weiss, L. G., Locke, V., Pan, T., Harris, J. G., Saklofske, D. H., & Prifitera, A. (2019). Wechsler intelligence scale for children – Fifth Edition: Use in societal context. In WISC-V: Clinical use and interpretation. Elsevier Science & Technology.
  • Weiss, L. G., Saklofske, D. H., Holdnack, J. A., & Prifitera, A. (2019). WISC-V: Clinical use and interpretation. Academic Press.
  • West, S. G., Finch, J. F., & Curran, P. J. (1995). Structural equation models with nonnormal variables. In R. H. Hoyle (Ed.), Structural equation modeling: Concepts, issues, and applications (pp. 56–75). SAGE Publications Inc.
  • Westman, M., Korkman, M., Mickos, A., & Byring, R. (2008). Language profiles of monolingual and bilingual Finnish preschool children at risk for language impairment. International Journal of Language & Communication Disorders, 43(6), 699–711. https://doi.org/10.1080/13682820701839200
  • Yeates, K. O., & Donders, J. (2005). The WISC-IV and neuropsychological assessment. In A. Prifitera, D. H. Saklofske, & L. G. Weiss (Eds.), WISC-IV clinical use and interpretation: Scientist-practitioner perspectives (pp. 415–434). Elsevier Academic Press.