Child Neuropsychology
A Journal on Normal and Abnormal Development in Childhood and Adolescence
Volume 26, 2020 - Issue 2
Review Article

Measuring visual matching and short-term recognition memory with the CANTAB® Delayed Matching to Sample task in schoolchildren: Effects of demographic influences, multiple outcome measures and regression-based normative data

Pages 189-218 | Received 18 Mar 2018, Accepted 05 Jul 2019, Published online: 22 Jul 2019

ABSTRACT

The study aims to establish demographically corrected pediatric norms for the computerized Delayed Matching to Sample (DMS) test, a measure of “visual matching ability and short-term visual recognition memory, for non-verbalisable problems”. The DMS was administered to n = 184 children aged 5.10 to 14.5 years. The DMS is a 4-choice recognition task of non-verbal, abstract patterns. The child has “to select, among four different choice patterns, the one that matches a complex visual pattern presented” (i.e., the target stimulus). The DMS consists of two conditions: a) the overt condition, in which the target stimulus and four choice patterns are shown simultaneously, and b) the covert condition, in which the choice patterns are shown after the target pattern is covered. The DMS test provides three outcome measures: the accuracy score (i.e., the number of correct patterns selected), latency (i.e., the response speed), and the probability of making an error after an incorrect response. These outcome measures were calculated for each condition and for both conditions combined. Results showed that the demographic variables age, sex, and/or level of parental education (LPE) affected scores on these outcome measures. Based on these data, demographically corrected norms were established for all outcome measures, per condition and for both conditions combined.

The Delayed Matching to Sample (DMS) task, which is part of the Cambridge Neuropsychological Test Battery (CANTAB®; Cambridge Cognition, Citation2012), is used worldwide in clinics and research. The task is believed to be a “measure of simultaneous matching ability and short-term visual recognition memory, for non-verbalisable problems” (Cambridge Cognition, Citation2019; Hammers et al., Citation2011; Smith, Need, Cirulli, Chiba-Falek, & Attix, Citation2013). DMS test results, however, are only meaningful when compared to norm data suitable for the sample at hand, such as schoolchildren. In the current study, DMS normative data were prepared for children living in Ukraine. The DMS offers users multiple outcome measures: accuracy scores (i.e., the number of correct patterns selected), response latency (i.e., the response speed), and the statistical probability of making an error after an incorrect response. Normative data were collected here for all DMS outcome measures. The DMS test, norming procedures, and results are discussed next.

The DMS task is a four-choice recognition task of non-verbal abstract patterns. The task consists of 20 trials. In each trial, the child has “to select, among four different choice patterns, the one that matches a complex visual pattern presented” (i.e., the target stimulus; see Figure 1 for an example) (Bersani, Quartini, Zullo, & Iannitelli, Citation2016; see also Lecerf & De Ribaupierre, Citation2005; Cambridge Cognition, Citation2012). Furthermore, inhibition is needed to perform well on four-choice recognition tasks such as the DMS (Davidson, Amso, Anderson, & Diamond, Citation2006). The DMS consists of two conditions: an overt condition (i.e., the target stimulus and the four choices are shown simultaneously) and a covert condition (i.e., the four choices are shown after the target is covered). The DMS overt condition is thought to be more a measure of “simultaneous visual matching abilities” (Mammarella, Pazzaglia, & Cornoldi, Citation2008): the target stimulus needs to be recognized, but the four choices can be checked directly against the target. The DMS covert condition is thought to be associated with spatial-sequential memory processes, i.e., covering the target requires maintaining the abstract pattern in visual memory (Mammarella et al., Citation2006, Citation2008).

Figure 1. DMS on a touch screen: Simultaneous view of target and stimuli (option 3 is identical to target), and covert (delayed) target (this is not a real CANTAB® DMS sample).


As mentioned earlier, DMS test results are only meaningful when compared to fitting norm data (Mitrushina, Boone, Razani, & D’Elia, Citation2005, p. 3). Even though DMS normative data have been published for schoolchildren living in several countries, such as the United States and Finland (e.g., Green et al., Citation2019; Lehto, Juujärvi, Kooistra, & Pulkkinen, Citation2003; Luciana & Nelson, Citation2002; Roque, Teixeira, Zachi, & Ventura, Citation2011; Vinţan, Palade, Cristea, Benga, & Muresanu, Citation2012), this is not the case for a Ukrainian pediatric sample. Culture affects thinking, knowledge, values, and beliefs (Ardila, Citation2006), which are shaped in social situations, and cultural differences are linked to geographical and ethnic differences. The CANTAB® norms have been established mainly in western (European and North American) societies. In line with Norbury and Sparks’ (Citation2013) reasoning, analyzing data and proposing norms for other populations contributes to the refinement of currently culturally influenced test criteria. Analyzing test scores of non-Western samples may also refine the cognitive frameworks currently used and the understanding of how to distinguish typical from atypical development. The aim of the present study was to establish a normative data range for the DMS test, using a sample of schoolchildren in Ukraine.

In addition, the norm data currently available worldwide for the DMS have limitations. For one, these norm data are almost always calculated from means and standard deviations of relevant subgroups of children separately, e.g., per two-year age band. This method of collecting norm data has significant limitations (see, e.g., Van der Elst, Hurks, Wassenberg, Meijs, & Jolles, Citation2011). For instance, multiple studies have shown that performances on numerous cognitive tests are influenced by the demographic variable “age” (Lezak, Howieson, Bigler, & Tranel, Citation2012; Diamond, Citation2002). In establishing normative data, the total sample tends to be subdivided into many different subgroups based on, e.g., age. Disadvantages of this subgrouping are that test scores apply only to a specific (age) subgroup in the sample, and that children close in age (e.g., 1 month apart) may fall into different (age) subgroups. Additionally, the (age) subgroups themselves may show unusual characteristics, yet still yield the norm for that (age) subgroup.

Moreover, the example above includes only one demographic variable, age, whereas other variables such as sex and/or level of parental education are believed to influence performance on cognitive tests, such as the DMS, as well (Strauss, Sherman, & Spreen, Citation2006). We therefore applied a promising method, called continuous norming, which allows multiple demographic variables to be included simultaneously while establishing normative data (in line with the approach suggested by Van Breukelen & Vlaeyen, Citation2005; Van der Elst et al., Citation2011). This approach to norming (Van Breukelen & Vlaeyen, Citation2005) is regression-based, allowing the inclusion of both continuous variables (e.g., age) and categorical variables (e.g., sex and/or level of parental education (LPE)) without creating subgroups for each variable (Van der Elst et al., Citation2011). In doing so, the relation between age and the test score to be normed is estimated from all data points in the sample, allowing more adequate estimates of age-related normative scores (Bechger, Hemker, & Maris, Citation2009). Additionally, the option of incorporating more than one demographic characteristic in the regression models makes this normative method rather accurate (Van der Elst et al., Citation2011).

The demographic variables of age, sex, and LPE are incorporated in this study because these have been associated with differences in cognitive test performance (Strauss et al., Citation2006), and partly with DMS scores more specifically. For one, performance on the DMS test has been shown to improve with age, i.e., from early childhood well into the middle school years (Huang, Klein, & Leung, Citation2016; Perlman, Huppert, & Luna, Citation2015). Studies using the DMS paradigm with typically developing children have shown that this test can be administered to children as young as 5 years, and that the cognitive functions measured with the DMS develop rapidly in young children and mature in adolescence (Green et al., Citation2019; Paule et al., Citation1998; Roque et al., Citation2011; Luciana, Citation2003; Luciana & Nelson, Citation1998, Citation2002; see below for more details on developmental trajectories of DMS scores). Further support for content validity (i.e., the test measures what it is said to measure) is found in, for instance, Green and colleagues (Citation2019), who administered the DMS to a group of 5–15 year old children from Mexico City. In line with the above researchers, Green et al. (Citation2019) found that DMS scores continue to improve (linearly) into adolescence, i.e., up until age 15 years. They found, for instance, that children aged 5 produced on average approximately 60% correct solutions on the DMS test, whereas children aged 7 scored approximately 74% correct solutions. These authors only studied “the percentage of correct solutions” (accuracy) calculated over all items; moreover, to the best of our knowledge, the majority of the studies on age-related effects in DMS scores reported only total accuracy scores, without the separate conditions (overt vs. covert) or latencies.

Data on sex differences in performance on the DMS task, or on tasks measuring constructs similar to those measured with the DMS, such as visual memory, are still inconclusive: several studies (e.g., Green et al., Citation2019; Luciana & Nelson, Citation1998) did not find any sex differences in DMS total accuracy in children. However, León, Cimadevilla, and Tascón (Citation2014) proposed a developmental view on sex differences in visual memory in childhood. The demographic variable “level of parental education” (LPE) has not yet been studied in relation to the DMS in schoolchildren. It is perceived as an estimate of socioeconomic status (SES; White, Citation1982), and higher levels of either have been positively linked to children’s higher academic achievement (Davis-Kean, Citation2005) and cognitive functions such as memory (Kaplan et al., Citation2001; Noble, McCandliss, & Farah, Citation2007).

This study is furthermore unique in reporting norm data on multiple outcome measures for the DMS test, as mentioned above. Traditionally, paper-and-pencil cognitive tasks, such as the Knox Block test, which is believed to be a measure of visual memory (Richardson, Citation2005), report only the total number of correctly remembered responses (i.e., a measure of accuracy). The DMS offers the opportunity to include not only a measure of accuracy but also, e.g., response latency (i.e., the response time from the moment the response items are shown). The few pediatric studies focusing on both accuracy and response latency on the DMS and/or other cognitive tasks showed the relevance of collecting norm data for both. For one, Chien et al. (Citation2015) reported accuracy and response latency measures in a sample of 143 youths with ASD and controls (mean age 13 years, SD 3.5). They found that accuracy scores, but not response latencies, differed between the control and clinical groups on the DMS. Vice versa, Wassenberg et al. (Citation2008) found that, compared to healthy control children, children with ADHD showed longer response latencies (i.e., they reacted more slowly) on a cognitive task other than the DMS, while the groups did not differ in accuracy on the same task. Thus, children with ADHD were able to perform the task accurately, but it took them more time than others, which has implications, for example, for interventions. These findings warrant separate analyses (and separate norm data) for both accuracy and reaction time outcome measures on the DMS.

In sum, the aim of the study was to establish a demographically corrected normative data range on multiple outcome measures of the DMS test, using a sample of n = 184 schoolchildren in rural Ukraine, aged 5.10 to 14.5 years.

Methods

Participants

The sample consisted of n = 184 children enrolled in local primary and middle schools in Ukraine. Per school, all children were invited, and 80.5% participated. Characteristics of this sample are summarized in Table 1 (there were approximately equal numbers at different points across the age range, except for children below 5.9 and above 14 years old). The level of parental education was based on the level of education completed by the parent(s) in a household; low described education up to grade 9 (i.e., primary and middle school), and high applied to grade 10 (high school) and above, in line with the Ukrainian educational system (Ukraine Channel, Citation2017). In the few cases of a discrepancy between the parents’ (or caregivers’) levels of education (6.5%), the highest level was allocated to the household (in line with, e.g., Jiang, Ekono, & Skinner, Citation2016). Parental level of education, and the kind of schooling parents can afford in Ukraine, has been linked to children’s test scores (Ardila, Rosselli, Matute, & Guajardo, Citation2005; Jiang et al., Citation2016). For the purpose of representativeness, we aimed to include a similar number of children per age group. This succeeded for all age groups except the youngest (5.5–5.9 years old) and oldest (14.5–14.9 years old), which were unfortunately smaller; note, however, that age was included in the analyses as a continuous variable ranging from 5.10 to 14.5 years.

Table 1. Basic demographic data.

Procedure and instrument

The researchers approached schools to gauge interest in participating in the study, after which the school management invited parents to information meetings about it. Information and consent letters were provided by the researchers and school management and then taken home. Testing started once parents and children had returned signed informed consent letters to their teachers. The research ethics committee of the Faculty of Psychology and Neuroscience of Maastricht University, the Netherlands, approved this study. All data were obtained in compliance with the ethics regulations of the WMA Declaration of Helsinki. Accordingly, debriefing was provided in a personal report to each student (parent/guardian) at the individual level, and anonymous class and school reports were provided to school management. Testing took place during school hours, and no compensation was needed.

The Delayed Matching to Sample test (DMS; Cambridge Cognition, Citation2012) was administered individually, in a separate room, on an HP Pavilion TS Sleekbook laptop with a 15-in. touch screen, by two certified researchers. Standard administration procedures prescribed by CANTAB® were followed. The test included both practice trials and test trials in which the child had to indicate, from four choices, the one pattern identical to the target. Per trial, the child received direct feedback (i.e., correct/incorrect response). Per trial, the four choices were shown either simultaneously (i.e., the overt condition, n = 5 trials) or after the target was covered (i.e., the covert condition, n = 15 trials); the practice trials were additional to these 20 trials. During the practice trials, the child performed trials from both conditions, starting with an overt trial followed by two covert trials with varying delays. The 20 test trials were presented in a fixed pseudo-random order. Touching the screen anywhere except the response items is not recorded. Administration time is about 8 min (Cambridge Cognition, Citation2012).

The outcome measures of the DMS test can be divided into three groups: accuracy, response latency, and the probability of an error after an incorrect response. The accuracy scores for the DMS (i.e., the total number of correct responses on the first attempt across conditions, reported here as DMS accuracy) were reported for the combined conditions (total) as well as per condition, i.e., for the overt and covert conditions separately. Response latency was measured in milliseconds (ms) and refers to the time from the appearance of the response items to the selection of the pattern identical to the target pattern. Mean latencies are reported over all trials and for the overt and covert conditions separately. The probability of making an error after an incorrect response indicates a participant’s sensitivity to negative feedback and is not reported per condition (given that there are not sufficient trials). This measure is based on signal detection theory (SDT; Goldstein, Citation2002) and ranges from 0 to 1, with higher values indicating an increased chance of repeating errors. In the DMS, the participant receives feedback (an incorrect response results in a red line around the response item) and, immediately after completing the trial, is required to process the next trial (target stimulus). Based on Bayes’ theorem, this probability is calculated by counting the instances of failing on a problem immediately after failing on the preceding problem, divided by the total number of errors made by the participant (Elliott et al., Citation1996). An optimal response is thought to reflect the participant’s sensitivity to the stimulus, i.e., identifying the correct response item after presentation of the target irrespective of noise (among others, a previously made incorrect response; Goldstein, Citation2002).
A high score indicates a high tendency to make an error after an error (Cambridge Cognition, Citation2012) and has been associated with, e.g., depression (Elliott et al., Citation1996), which once more supports the need to propose suitable norms for a sample.
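The outcome can be approximated as follows. This is only a sketch of the computation described in the text, not CANTAB®'s exact implementation (Elliott et al., Citation1996); the function name and the choice of denominator are our assumptions.

```python
def prob_error_after_error(responses):
    """Approximate probability of an error given an error on the
    previous trial: the number of errors immediately following another
    error, divided by the total number of errors. `responses` is a
    sequence of booleans (True = correct response on that trial).
    Assumption: this mirrors the outcome described in the text, not
    the proprietary CANTAB computation."""
    total_errors = sum(1 for r in responses if not r)
    if total_errors == 0:
        return 0.0  # no errors, so no error-after-error tendency
    repeats = sum(1 for prev, cur in zip(responses, responses[1:])
                  if not prev and not cur)
    return repeats / total_errors
```

For example, the response sequence correct–error–error–correct contains two errors, one of which directly follows an error, giving a probability of 0.5.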

Statistical analyses

The exploratory analysis started with an overview of the descriptive data, i.e., the means and standard deviations for the main outcome measures (Table 2). Next, Pearson correlations among all the DMS outcome measures (including the total scores and those calculated for the overt and covert conditions separately) were computed (see Table 3). Furthermore, a summary independent-samples t-test was used to compare the observed means of the DMS accuracy, response latency, and probability of an error after an incorrect response scores of our Ukrainian sample to the western CANTAB® mean child scores (CANTAB® child norms, Cambridge Cognition, Citation2014). This comparison was based on the means of the total scores of children 6–13 years old in the respective samples (i.e., the ages that overlapped between the two samples). This summary independent-samples t-test is based on the overall means of the respective age groups of each sample and the mean of the standard deviations (Field, Citation2009).
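A summary t-test of this kind is computed from group means, standard deviations, and sizes alone, without raw scores. The sketch below uses the Welch form; the study pooled the mean of the standard deviations (Field, Citation2009), so the exact arithmetic may differ slightly.

```python
import math

def summary_t_test(m1, sd1, n1, m2, sd2, n2):
    """Independent-samples t-test computed from summary statistics
    only (group means, SDs, and sizes); Welch variant, a sketch of
    the general idea rather than the study's exact formula."""
    se = math.sqrt(sd1**2 / n1 + sd2**2 / n2)  # SE of the mean difference
    t = (m1 - m2) / se
    # Welch-Satterthwaite approximation of the degrees of freedom
    df = se**4 / ((sd1**2 / n1)**2 / (n1 - 1) + (sd2**2 / n2)**2 / (n2 - 1))
    return t, df
```

With identical group means the statistic is zero, and with equal SDs and group sizes the degrees of freedom reduce to n1 + n2 − 2, as in the pooled test.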

Next, the effects of the demographic variables on these DMS scores were further analyzed with multiple linear regression analyses. The full regression models included age, age², sex, and LPE and all two-way interactions as predictors. Age was centered (Age_C = calendar age in months − mean age of the sample, which equaled 115.36 months) before computing the quadratic age variable, to avoid multi-collinearity (Van der Elst, van Boxtel, van Breukelen, & Jolles, Citation2005). Sex was dummy coded as 0 = female and 1 = male. LPE was dummy coded as 0 = low and 1 = high. In addition to the total DMS scores (i.e., the overt and covert conditions combined) for accuracy and response latencies, we analyzed the DMS scores in the overt and covert conditions separately. DMS scores in the covert condition were analyzed using multiple linear regression analyses (similar to the procedures mentioned for the other DMS scores). Scores on the DMS accuracy overt (simultaneous) condition were analyzed using an ordinal logistic regression with cumulative probabilities, due to the restricted range of possible scores (0–5 items; Laerd Statistics, Citation2015). The same demographic variables as in the other DMS analyses were used.
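The predictor coding described above can be expressed compactly. This is a sketch: the function name is ours, and the mean age of 115.36 months is taken from the text.

```python
MEAN_AGE_MONTHS = 115.36  # sample mean age used for centering (from the text)

def code_predictors(age_months, sex, lpe):
    """Center age before squaring it (to limit multi-collinearity
    between the linear and quadratic terms) and dummy-code the
    categorical predictors as described in the text."""
    age_c = age_months - MEAN_AGE_MONTHS
    return {
        "age_c": age_c,
        "age_c2": age_c ** 2,              # quadratic term of centered age
        "sex": 1 if sex == "male" else 0,  # 0 = female, 1 = male
        "lpe": 1 if lpe == "high" else 0,  # 0 = low, 1 = high
    }
```

Centering before squaring matters because the raw age and raw age² columns would be nearly collinear, inflating the variance of the estimated coefficients.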

All final regression models were obtained in a step-down hierarchical procedure by excluding the non-significant predictors from the model. The assumptions of regression analysis were tested separately per model. Importantly, regression requires a normal distribution of the residuals only (Field, Citation2009), not of the raw test scores; normality of the residuals was therefore tested with Kolmogorov–Smirnov tests on the residual values. The predicted values were grouped into quartiles, and Levene’s test of the standardized residuals across these quartile groups tested for homoscedasticity (Van der Elst et al., Citation2011). Multi-collinearity was assessed using Variance Inflation Factors (VIF, which should be below 10), and influential cases by calculating Cook’s distances (Fisher et al., Citation2014).
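The normality check can be illustrated with a hand-rolled one-sample Kolmogorov–Smirnov statistic. In the study this was computed by SPSS; the sketch below returns only the test statistic (the maximum distance between the empirical and the standard normal CDF), not a p value.

```python
from statistics import NormalDist

def ks_statistic(standardized_residuals):
    """One-sample Kolmogorov-Smirnov statistic comparing standardized
    residuals against the standard normal distribution."""
    nd = NormalDist()  # standard normal reference distribution
    xs = sorted(standardized_residuals)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        cdf = nd.cdf(x)
        # compare the theoretical CDF against the empirical CDF just
        # before and just after each observation (the ECDF is a step)
        d = max(d, abs((i + 1) / n - cdf), abs(cdf - i / n))
    return d
```

Small values of the statistic indicate that the residuals are compatible with normality; the p value then depends on the sample size.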

Next, normative data based on the final regression models (i.e., for accuracy and response latencies, in the overt and covert conditions and combined) were calculated using four steps (Van Breukelen & Vlaeyen, Citation2005; Van der Elst et al., Citation2005). First, the expected test scores were computed by applying the regression model (= B0 + B1X1 + … + BnXn, with B0 the intercept, B1, …, Bn the regression weights for the demographic variables, and X1, …, Xn the values of the demographic variables). Second, the residuals were calculated (= observed score − expected score). Third, the residuals were standardized using the standard deviation (SD) of the residuals in the normative sample (= residual/SD(residual) of the normative sample) (Van der Elst et al., Citation2011). Fourth, the standardized residuals were converted into percentile values following the standard normal distribution, provided the assumption of normality of the standardized residuals was met in the normative sample. The Appendix shows these converted scores. An alpha level of .01 was applied to avoid Type 1 errors due to multiplicity. All calculations were carried out in SPSS version 24. The norm calculations for DMS accuracy overt followed the described procedure but are limited due to the restricted range of 0–5 (results and assumptions reported below). The overt accuracy scores therefore allowed comparison to the covert and total accuracy, but likely show ceiling effects in observed and thus predicted scores, which are summarized in a cumulative norm table (see Appendix).
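The four steps can be condensed to a few lines. This sketch assumes the standardized residuals are normally distributed, as the fourth step requires; step one (the expected score) comes from the fitted regression model.

```python
from statistics import NormalDist

def norm_percentile(observed, expected, sd_residual):
    """Steps 2-4 of the normative procedure: residual, standardized
    residual, then conversion to a percentile under the standard
    normal distribution."""
    residual = observed - expected   # step 2: observed minus expected
    z = residual / sd_residual       # step 3: standardize the residual
    return NormalDist().cdf(z)       # step 4: percentile (assumes normality)

# Values taken from the worked example in the text: transformed score 4,
# expected score 4.49, residual SD 0.75 (standardized residual ~ -0.65)
percentile = norm_percentile(4, 4.49, 0.75)
```

A percentile near the middle of the distribution indicates a score within normal limits for children with that demographic profile.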

Results

The means and standard deviations for all DMS outcome measures were calculated for the Ukrainian sample (Table 2).

Table 2. Mean and standard deviation of all DMS outcome measures.

All children completed the practice trials of the DMS successfully. The summary independent-samples t-test comparing the overall means of the respective age-group means (6–13 years old) of our Ukrainian sample (n = 156; mean accuracy 13.98; mean latency 4241.02 ms; mean probability of error, n = 153, 0.27) to the western CANTAB® age groups (n = 72; mean accuracy 14.84; mean latency 4107.25 ms; mean probability of error, n = 69, 0.21) showed no significant difference between the two samples. The DMS outcome measures were based on the total mean scores (DMS accuracy t(53.5) = −0.31, p = 0.76; DMS response latency t(53.5) = 0.09, p = 0.93; DMS probability of an error after an incorrect response t(53.25) = −0.37, p = 0.71) and overlapping 95% confidence intervals. Figures 2–4 show the age-based group means of the Ukrainian and CANTAB® samples (including the overt and covert conditions separately for accuracy and response latencies).

Figure 2. Observed scores for the mean DMS accuracy (i.e., total, overt and covert conditions) comparing the Ukrainian sample to the CANTAB® traditional mean norms (2-year age groups).


Figure 3. Observed scores for the mean DMS response latency for accuracy scores (i.e., total correct, overt correct and covert correct conditions), comparing the Ukrainian sample to the CANTAB® traditional mean norms (2-year age groups).


Figure 4. Observed scores for the mean DMS probability of error after an incorrect response comparing the Ukrainian sample to the CANTAB® mean norms (2-year age groups).


Table 3 shows the correlations between the DMS outcome measures. In general, the different latency measures (total, overt, and covert) correlated with one another, and the same held for the different accuracy measures. In contrast, response latencies and accuracy showed no meaningful relationship. The probability of an error after an incorrect response correlated only with the accuracy measures.

Table 3. Pearson zero order correlations for DMS accuracy, response latency, and probability of an error after an error variables.

Results of the multiple regressions for the three outcome measures per DMS condition

The final multiple linear regression models that were significant are shown in Table 4. Box–Cox power transformations were applied to the DMS accuracy total and the DMS response latency overt and covert conditions, because preliminary analyses suggested heteroscedasticity in the untransformed scores (Osborne, Citation2010). The other outcome measures, i.e., the DMS accuracy overt and covert conditions, DMS response latency (for accuracy in total), and the probability of an error after an incorrect response, were not transformed. Also, all DMS response latency scores displayed outliers (standardized residuals more than 3 SDs from the mean, thus outside the normal distribution), which were removed. After these transformations and the removal of outliers, the assumptions of multiple linear regression analyses were met for the final models of the DMS accuracy and latency, i.e., Kolmogorov–Smirnov values p ≥ .02. For the probability of an error score, the Kolmogorov–Smirnov value was p ≤ .001 and the normality assumption was therefore violated. Following Van der Elst, Van Boxtel, Van Breukelen, and Jolles (Citation2008), this was accounted for in the construction of the norms for the probability of an error after an incorrect response scores by using the empirical distribution of the standardized residuals rather than the theoretical standard normal distribution. The other assumptions of the multiple regression models were met for the final models of all scores, i.e., all values of Levene’s statistic p ≥ .02; Cook’s distance values < .01; all Variance Inflation Factors ≤ 1.1. None of the interaction terms reached significance. Also, the raw scores showed that 2.72% of the children achieved the maximum score on DMS accuracy and 3.80% of the sample made only one mistake.
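For reference, the Box–Cox family used for these power transformations is shown below (λ = 0 reduces to the natural logarithm). This is a generic sketch; in practice λ is chosen so that the residuals become homoscedastic and close to normal.

```python
import math

def box_cox(x, lam):
    """Box-Cox power transformation for positive x.
    lam (lambda) is the power parameter; lam = 0 is the log case."""
    if x <= 0:
        raise ValueError("Box-Cox requires positive values")
    if lam == 0:
        return math.log(x)
    return (x ** lam - 1) / lam
```

The transformation is monotone, so percentile ranks of transformed scores correspond to percentile ranks of the raw scores.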

Table 4. Final regression models for the DMS accuracy (all conditions), response latency and probability of an error after an incorrect response.

Results of the logistic regression for DMS accuracy in the overt condition

An ordinal logistic regression with the proportional-odds assumption was run to determine the effect of age, age², sex, and LPE on the DMS accuracy overt condition. Two children scored 0 and one child scored 1; these scores were incorporated into the two-correct category, resulting in one group of 10 participants (for the lowest scores) in this analysis. Sex and LPE were not significant in this analysis. The proportional-odds assumption was assessed with a full likelihood ratio test comparing the fitted model (including the demographic variable age) to a model with varying location parameters, χ²(2) = 6.36, p = .04. As this p value was below .05, separate binomial regressions were run for each category (5 correct, 4 correct, 3 correct, and 0–2 correct), showing similar odds ratios for age (range .97 to 1.02), thus meeting the assumption of proportional odds. The deviance goodness-of-fit test showed that the ordinal logistic regression with proportional odds was a good fit to the observed data, χ²(260) = 172.49, p = .66, although 64.8% of cells showed zero frequencies, which may indicate limitations of the model for predicting scores. Nonetheless, the final model fitted the data significantly better than an intercept-only model, χ²(1) = 9.56, p < .01. The high zero cell frequency may be linked to the continuous nature of the independent variable (age), i.e., a characteristic of the analysis, and is thus of limited concern, while the noted significance of the model appears in line with the observed data (Laerd Statistics, Citation2015). The odds of scoring 4 correct in the DMS accuracy overt condition were 12.95, 95% CI [7.44, 22.56], χ²(1) = 81.89, p < .01. An increase in age (centered and expressed in months) was associated with an increase in the odds of DMS accuracy overt, with an odds ratio of 0.98, 95% CI [0.97, 0.99], χ²(1) = 8.84, p < .01 (see also Figure 5).
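Under a proportional-odds model, a single age coefficient shifts all cumulative category boundaries at once. The sketch below shows how category probabilities follow from the thresholds and slope; the thresholds, slope, and centered age used here are hypothetical, not estimates from the study.

```python
import math

def category_probs(thresholds, beta, age_c):
    """Ordered-category probabilities under a proportional-odds model:
    logit P(Y <= k) = theta_k - beta * x, with one slope beta shared
    across all category boundaries (the proportional-odds assumption)."""
    def cum(theta):
        # inverse logit of the cumulative boundary
        return 1.0 / (1.0 + math.exp(-(theta - beta * age_c)))
    cums = [cum(t) for t in thresholds] + [1.0]
    # per-category probabilities are adjacent differences of the
    # cumulative probabilities
    return [cums[0]] + [cums[k] - cums[k - 1] for k in range(1, len(cums))]

# Hypothetical: three thresholds define four ordered score categories
# (e.g. 0-2, 3, 4, and 5 correct), with a small positive age effect
probs = category_probs([-2.0, -0.5, 1.0], beta=0.02, age_c=12.0)
```

Because the slope is shared, testing the assumption amounts to checking that category-wise fits give similar odds ratios, which is what the separate binomial regressions above did.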

Figure 5. Expected test scores based on the regression models for the DMS accuracy (total correct), accuracy overt (simultaneous), accuracy covert (all delay) conditions with maximum scores of 20, 5 and 15, respectively. All scores were stratified by age, but the accuracy (total) and accuracy covert (all delay) conditions are shown for higher level of parental education (LPE).


Normative procedure for all DMS outcome measures

Norms for the DMS outcome measures (accuracy, response latency, and probability of an error after an incorrect response; see Appendix) were established by applying the four-step process described above. An example of this process and the accompanying interpretation follows. Suppose that a 6-year-old child, whose parents have a low LPE, scored 10 points on DMS accuracy. As detailed above, the regression model for the DMS accuracy score uses a power transformation of the test score to achieve better agreement with the distributional assumptions of regression models. Here, the transformed DMS accuracy score equals 4 (Lambda 0.5 applied to 10; Osborne, Citation2010). The first step of the normative procedure is to calculate the expected score for this child with the regression model presented in Table 4: constant + 0.01 * (age of the child in months − average age of the sample) + 0.42 * LPE of the child (= 4.93 + [0.01 * (72 − 115.59)] + (0.42 * 0) = 4.49). Second, the residual is calculated, which is −0.49 (= 4 − 4.49). In the third step, the residual is standardized: −0.65 (= −0.49/0.75). Finally, the standardized residual is converted into a percentile value based on the standard normal cumulative distribution. A standardized residual of −0.65 corresponds with a percentile value of .26. This means that 26% of the population of 6-year-old children whose parents have a low LPE obtain a DMS accuracy score of 10 or lower. The DMS total accuracy test score of this child is therefore within normal limits.

The DMS probability of an error after an incorrect response outcome measure did not meet the normality assumption, as evaluated with the Kolmogorov–Smirnov test. Norms for this outcome were therefore based on the empirical distribution of the standardized residuals (Zhou, Citation1998). Lastly, this outcome measure showed an effect of Age-C2, which indicates curvilinear development with age, as can be seen in .
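When the standardized residuals are non-Gaussian, the percentile can be read from their empirical distribution rather than from the normal CDF. A minimal sketch of this idea follows; the residual values are invented for illustration, not the study data.

```python
def empirical_percentile(z, sample_residuals):
    """Proportion of norm-sample standardized residuals at or below z."""
    at_or_below = sum(1 for r in sample_residuals if r <= z)
    return at_or_below / len(sample_residuals)

# Hypothetical standardized residuals from a norm sample
residuals = [-1.8, -1.1, -0.6, -0.2, 0.0, 0.3, 0.7, 1.2, 1.5, 2.1]

# 3 of the 10 residuals lie at or below -0.6
p = empirical_percentile(-0.6, residuals)
```

Because the percentile is taken directly from the observed residuals, no distributional assumption is needed, at the cost of coarser percentile steps in small samples.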

Figure 6. Expected standardized scores based on the regression models for the DMS accuracy (total) and the DMS probability of an error after an incorrect response.

The observed mean of 4.5 reflected the expected ceiling effect for accuracy in the overt condition (maximum 5). Norms for this outcome measure therefore have limited value; in line with the observed scores, most children are predicted to score 3, 4, or 5 correct. A table with approximate cumulative percentages for the overt scores, stratified by age, is therefore provided in the Appendix.
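Such a cumulative-percentage table can be tabulated directly from the observed score counts. A minimal sketch follows; the scores below are invented for illustration, not the study data.

```python
from collections import Counter

def cumulative_percentages(scores, max_score):
    """Cumulative percentage of children scoring at or below each value."""
    counts = Counter(scores)
    n = len(scores)
    cumulative = 0
    table = {}
    for s in range(max_score + 1):
        cumulative += counts.get(s, 0)
        table[s] = 100.0 * cumulative / n
    return table

# Hypothetical overt-condition scores for one age stratum (max. 5)
scores = [5, 5, 5, 4, 4, 4, 3, 3, 2, 5]
table = cumulative_percentages(scores, 5)
```

With a ceiling effect, the table compresses at the top: most of the cumulative percentage accrues over the highest two or three score values.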

Discussion

The main aim of the present study was to establish the normal range of performance for schoolchildren in Ukraine on three outcome measures of the Delayed Matching to Sample (DMS) test. A relatively novel method, continuous norming, was used, which allowed for the simultaneous inclusion of the demographic variables age, sex, and LPE (Van Breukelen & Vlaeyen, Citation2005; Van der Elst et al., Citation2011). The normal range of performance on the DMS accuracy, latency, and probability of an error after an incorrect response outcome measures was reported based on the final regressions (, and ), and expected scores were presented in the Appendix. The results of the study are discussed next.

First, the final regression models in this method of continuous norming showed that age influenced DMS accuracy (all conditions), response latency in the overt (simultaneous) condition, and the probability of an error after an incorrect response (see ). In general, age was linearly related to the DMS outcome measures, i.e., children became more accurate or faster with increasing age. Older children made fewer mistakes, were less likely to make an error after an incorrect response, and responded faster than younger children in the overt condition. For two DMS outcome measures (i.e., accuracy in the covert condition and the probability of an error after an incorrect response), however, we found a curvilinear age effect, indicating that development was faster or slower at certain ages than at others. Differences between specific age groups were not tested for significance, however, and are therefore not discussed. The observed effects nevertheless appear in line with studies on cross-sectional age-related individual differences in simultaneous visual matching ability, inhibition of responses (i.e., to distractors in shape and color), and short-term visual recognition memory of non-verbalisable patterns, all of which are thought to increase with age (Alloway, Gathercole, Willis, & Adams, Citation2004; Davidson et al., Citation2006; Gathercole, Pickering, Ambridge, & Wearing, Citation2004; Korkman, Kemp, & Kirk, Citation2001; Luciana & Nelson, Citation1998, Citation2002).

Second, the effects of demographic variables on DMS test performance varied by sex. Sex differences were found for DMS response latency in all conditions: boys responded faster without making more (or fewer) mistakes than girls. This is in line with others who did not find sex differences on DMS accuracy (Green et al., Citation2019; Luciana & Nelson, Citation1998). Yet we reported not only on accuracy but also on response latency, which explains the difference in findings, i.e., an effect of sex on response latency. However, sex explained only 9%, 6%, and 5% of the variance in DMS response latency for the total, the overt (simultaneous), and the covert (all delays) conditions, respectively, which is a weak effect. Other explanations for the sex difference in children on this DMS task may need to involve personality characteristics (temperament) or cultural sex differences. Girls may, for example, have a higher tendency to (double-)check their responses before actually responding, in line with studies on sex differences in personality traits (Cloninger, Svrakic, & Przybeck, Citation1993; De Bolle et al., Citation2015).

Third, children of parents with a higher level of education (LPE) scored more accurately on the DMS than children of parents with a lower LPE. LPE contributed to predicting 27% and 30% of the variance in DMS accuracy (total) and in the accuracy covert (delay) condition, respectively. The DMS accuracy overt condition did not show an LPE effect. This may support theories that the overt (simultaneous) condition relies on different processes than the covert (delayed) condition, i.e., simultaneous visual matching versus visual short-term recognition memory (Korkman et al., Citation2001; Mammarella et al., Citation2008). Equally, no LPE-related differences were found for any of the DMS response latencies. This seems to be in line with what Woods, Wyma, Yund, Herron, and Reed (Citation2015) found in a sample of adults: response latency was not affected by the participants’ own level of education. This supports our contention that, when studying inter-individual differences in the cognitive functions needed for DMS performance in individuals (children and adults), both response latency and accuracy need to be assessed when collecting norms. In contrast to traditional norming, the proposed norms for this sample are not based on only one demographic variable such as age, but on the demographic predictors that were significant for the specific DMS outcome measure (i.e., for the DMS accuracy (total) and accuracy covert scores, the probability of an error after an incorrect response, and the DMS response latency in the overt condition, on more than one variable).

In this study, we aimed to reveal the normal range for accuracy and response latency for healthy children, taking demographic variables into account. Delineating these different components of a complex cognitive task, i.e., accuracy versus response latency, and their respective significant predictors may inform approaches to the information processing required for learning in classrooms. Both outcome measures could be incorporated in feedback. There is, for example, increasing attention to formative feedback in classrooms. Instead of a single focus on accuracy (usually summative feedback in the form of the number correct for an end product), teachers are encouraged to formulate and deliver feedback that invites learners to engage (Havnes, Smith, Dysthe, & Ludvigsen, Citation2012). Being able to highlight more than one aspect of both learning processes and end results may motivate students. Devising more refined norms may provide both teachers and students with constructive developmental information.

Finally, we posited that continuous norming provides more accurate estimates of children’s expected scores and therefore included demographic variables (Mitrushina et al., Citation2005). The secondary analyses indicated non-significant differences between the age group means of the Ukrainian schoolchildren and the CANTAB® standardized norms on all three DMS outcome measures (also shown in ). This may imply that cultural differences on these DMS outcome measures are small. However, this comparison was based on the means and standard deviations of the age subgroups in the western norm sample and is thus subject to the limitations discussed above. Regression analyses as carried out in this study (i.e., with the demographic variables age, sex, and LPE applied to each data point in the western norm sample as well) would lend more credence to a comparison between the samples, because the analysis would be more detailed and accurate (Bechger et al., Citation2009). Such a comparison might more accurately evaluate whether cultural influences on these outcome measures are (non-)significant.
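A comparison of group means from published summary statistics can be sketched with Welch's t-test, which requires only means, standard deviations, and group sizes. The values below are hypothetical, not the actual sample or norm figures.

```python
from math import sqrt

def welch_t_from_summary(m1, sd1, n1, m2, sd2, n2):
    """Welch's t statistic and degrees of freedom from group summaries."""
    se1, se2 = sd1 ** 2 / n1, sd2 ** 2 / n2
    t = (m1 - m2) / sqrt(se1 + se2)
    # Welch-Satterthwaite approximation of the degrees of freedom
    df = (se1 + se2) ** 2 / (se1 ** 2 / (n1 - 1) + se2 ** 2 / (n2 - 1))
    return t, df

# Hypothetical accuracy summaries for one age group in two norm samples
t, df = welch_t_from_summary(17.2, 2.1, 40, 17.5, 2.3, 60)
```

Such a summary-level comparison remains cruder than the regression-based approach advocated above, which corrects each individual data point for age, sex, and LPE.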

There are some limitations to this study. The sample consisted primarily of children from rural schools, which may reduce the generalizability of these findings. One third of the Ukrainian population is rural (The World Bank, Citation2016). The economic infrastructure in Ukraine has been modernized only to a limited extent since independence in 1991, and national and regional capitals may receive more funds than rural areas (Mokrushyna, Citation2015). This may support the notion that rural schools in different geographical locations are in similar (socio-economic) circumstances, which would warrant generalizability. Caution might nevertheless be warranted when applying these findings to urban areas. Children of similarly low SES in urban areas may, for instance, have more opportunities to access computers through better-equipped public facilities, which in turn may lead to more familiarity with computerized tasks and hence to differences in outcomes (Ardila, Citation1995; Mokrushyna, Citation2015). Fazeli, Ross, Vance, and Ball (Citation2013), however, did not find differences in cognitive test performance between experienced and non-experienced computer users in an aging study. The latter is supported by the absence in the present study of significant differences between the age group means of this sample and the CANTAB® standardized norms.

Another limitation concerned the sample characteristics. The lower-level LPE group was significantly smaller than the higher-level LPE group. In this method of multiple regression analysis, each data point contributes to the norms for the whole sample (Van Breukelen & Vlaeyen, Citation2005). The argument presented earlier about using continuous and dichotomous variables simultaneously in these regressions, applied to the whole sample, also concerns the size of the low LPE group and/or a slightly smaller age group of, e.g., older children: fewer data points are needed to arrive at statistically valid results (Van Breukelen & Vlaeyen, Citation2005). The advantages of regression-based normative methods come at the cost of requiring thorough checking of the assumptions of these complex models. The assumptions of homoscedasticity and normality of the standardized residuals were thoroughly checked when making these regression-based norms. The standardized residuals of the probability of an error scores were non-Gaussian, and this was accounted for by using the empirical distribution of the standardized residuals.
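A normality check of standardized residuals, as performed here with the Kolmogorov–Smirnov test, can be sketched as follows. This is a stdlib-only illustration of the K–S statistic against the standard normal distribution (in practice one would also need its critical value or p-value); the residuals are invented for illustration.

```python
from math import erf, sqrt

def ks_statistic_normal(residuals):
    """One-sample Kolmogorov-Smirnov statistic of standardized
    residuals against the standard normal distribution."""
    xs = sorted(residuals)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        cdf = 0.5 * (1.0 + erf(x / sqrt(2.0)))  # standard normal CDF
        # Largest gap between the empirical CDF (just below and at x)
        # and the model CDF
        d = max(d, abs((i + 1) / n - cdf), abs(i / n - cdf))
    return d

# Hypothetical standardized residuals; a large D suggests non-normality
residuals = [-1.2, -0.8, -0.3, 0.1, 0.4, 0.9, 1.3]
D = ks_statistic_normal(residuals)
```

When D is significantly large, as for the probability-of-an-error residuals in this study, norms can fall back on the empirical distribution of the residuals.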

Final conclusions

This study aimed to establish a normative range for multiple outcome measures of the DMS visual-spatial memory task. We found support for deriving norms based on age, sex, and/or LPE, depending on the DMS outcome measure. Especially (but not solely) for DMS accuracy, more than one demographic variable influenced each outcome measure, in contrast to more traditionally established norms based on a single demographic variable (Cambridge Cognition, Citation2012). New norms for children in rural Ukrainian schools are proposed in the Appendix.

Disclosure statement

No potential conflict of interest was reported by the authors.

References

  • Alloway, T. P., Gathercole, S. E., Willis, C., & Adams, A. M. (2004). A structural analysis of working memory and related cognitive skills in young children. Journal of Experimental Child Psychology, 87(2), 85–106.
  • Ardila, A. (1995). Directions of research in cross-cultural neuropsychology. Journal of Clinical and Experimental Neuropsychology, 17(1), 143–150.
  • Ardila, A. (2006). Cultural values underlying psychometric cognitive testing. Neuropsychology Review, 15(4), 185.
  • Ardila, A., Rosselli, M., Matute, E., & Guajardo, S. (2005). The influence of the parents’ educational level on the development of executive functions. Developmental Neuropsychology, 28(1), 539–560.
  • Bechger, T., Hemker, B., & Maris, G. (2009). Retrieved from
  • Bersani, G., Quartini, A., Zullo, D., & Iannitelli, A. (2016). Potential neuroprotective effect of lithium in bipolar patients evaluated by neuropsychological assessment: Preliminary results. Human Psychopharmacology: Clinical and Experimental, 31(1), 19–28.
  • Cambridge Cognition. (2012). Cambridge neuropsychological test automated battery (CANTABeclipse®) manual. Cambridge: Author.
  • Cambridge Cognition. (2014). Child norms Cambridge neuropsychological test automated battery (manual). Cambridge: Author.
  • Cambridge Cognition. (2019). Delayed matching to sample (DMS). Retrieved from https://www.cambridgecognition.com/
  • Chien, Y. L., Gau, S. S., Shang, C. Y., Chiu, Y. N., Tsai, W. C., & Wu, Y. Y. (2015). Visual memory and sustained attention impairment in youths with autism spectrum disorders. Psychological Medicine, 45(11), 2263–2273.
  • Cloninger, C., Svrakic, D. M., & Przybeck, T. R. (1993). A psychobiological model of temperament and character. Archives of General Psychiatry, 50(12), 975–990.
  • Davidson, M. C., Amso, D., Anderson, L. C., & Diamond, A. (2006). Development of cognitive control and executive functions from 4 to 13 years: Evidence from manipulations of memory, inhibition, and task switching. Neuropsychologia, 44(11), 2037–2078.
  • Davis-Kean, P. E. (2005). The influence of parent education and family income on child achievement: The indirect role of parental expectations and the home environment. Journal of Family Psychology, 19(2), 294.
  • De Bolle, M., De Fruyt, F., McCrae, R. R., Löckenhoff, C. E., Costa, P. T., Aguilar-Vafaie, M. E., … Terracciano, A. (2015). The emergence of sex differences in personality traits in early adolescence: A cross-sectional, cross-cultural study. Journal of Personality and Social Psychology, 108(1), 171–185.
  • Diamond, A. (2002). Normal development of prefrontal cortex from birth to young adulthood: Cognitive functions, anatomy, and biochemistry. In D. Stuss & R. Knight (Eds.), Principles of frontal lobe function (pp. 466–503). New York, NY: Oxford University Press.
  • Elliott, R., Sahakian, B. J., McKay, A. P., Herrod, J. J., Robbins, T. W., & Paykel, E. S. (1996). Neuropsychological impairments in unipolar depression: The influence of perceived failure on subsequent performance. Psychological Medicine, 26(5), 975–989.
  • Fazeli, P. L., Ross, L. A., Vance, D. E., & Ball, K. (2013). The relationship between computer experience and computerized cognitive test performance among older adults. The Journals of Gerontology Series B: Psychological Sciences and Social Sciences, 68(3), 337–346.
  • Field, A. (2009). Discovering statistics using SPSS. Thousand Oaks, CA: Sage publications.
  • Fisher, S. D., Gray, J. P., Black, M. J., Davies, J. R., Bednark, J. G., Redgrave, P., … Reynolds, J. N. (2014). A behavioral task for investigating action discovery, selection and switching: Comparison between types of reinforcer. Frontiers in Behavioral Neuroscience, 8, 398.
  • Gathercole, S. E., Pickering, S. J., Ambridge, B., & Wearing, H. (2004). The structure of working memory from 4 to 15 years of age. Developmental Psychology, 40(2), 177.
  • Goldstein, E. B. (2002). Appendix A: Signal detection: Procedure and theory. In Sensation and perception (6th ed., pp. 583–590). New York, NY: Cengage Learning.
  • Green, R., Till, C., Al-Hakeem, H., Cribbie, R., Téllez-Rojo, M. M., Osorio, E., … Schnaas, L. (2019). Assessment of neuropsychological performance in Mexico City youth using the Cambridge Neuropsychological Test Automated Battery (CANTAB). Journal of Clinical and Experimental Neuropsychology, 41(3), 246–256.
  • Hammers, D., Michalski, L., Reese, E., Persad, C., Wilson, S., Powells, D., … Giordani, B. (2011). Using the CANTAB computerized battery to discriminate mild cognitive impairment and dementias. Alzheimer’s & Dementia, 7(4), S537.
  • Havnes, A., Smith, K., Dysthe, O., & Ludvigsen, K. (2012). Formative assessment and feedback: Making learning visible. Studies in Educational Evaluation, 38(1), 21–27.
  • Huang, A. S., Klein, D. N., & Leung, H. C. (2016). Load-related brain activation predicts spatial working memory performance in youth aged 9–12 and is associated with executive function at earlier ages. Developmental Cognitive Neuroscience, 17, 1–9.
  • Jiang, Y., Ekono, M., & Skinner, C. (2016). Basic facts about low-income children: Children aged 6 through 11 years, 2014. National Center for Children in Poverty, Columbia University Mailman School of Public Health. Retrieved from http://www.nccp.org/publications/pub_1146.html
  • Kaplan, G. A., Turrell, G., Lynch, J. W., Everson, S. A., Helkala, E.-L., & Salonen, J. T. (2001). Childhood socioeconomic position and cognitive function in adulthood. International Journal of Epidemiology, 30(2), 256–263.
  • Korkman, M., Kemp, S. L., & Kirk, U. (2001). Effects of age on neurocognitive measures of children ages 5 to 12: A cross-sectional study on 800 children from the United States. Developmental Neuropsychology, 20(1), 331–354.
  • Lecerf, T., & De Ribaupierre, A. (2005). Recognition in a visuospatial memory task: The effect of presentation. European Journal of Cognitive Psychology, 17(1), 47–75.
  • Lehto, J. E., Juujärvi, P., Kooistra, L., & Pulkkinen, L. (2003). Dimensions of executive functioning: Evidence from children. British Journal of Developmental Psychology, 21(1), 59–80.
  • León, I., Cimadevilla, J. M., & Tascón, L. (2014). Developmental gender differences in children in a virtual spatial memory task. Neuropsychology, 28(4), 485–495.
  • Lezak, M. D., Howieson, D. B., Bigler, E. D., & Tranel, D. (2012). Neuropsychological assessment. New York, USA: Oxford University Press.
  • Luciana, M. (2003). Practitioner review: Computerized assessment of neuropsychological function in children: Clinical and research applications of the Cambridge Neuropsychological Testing Automated Battery (CANTAB). Journal of Child Psychology and Psychiatry, 44(5), 649–663.
  • Luciana, M., & Nelson, C. (2002). Assessment of neuropsychological function through use of the Cambridge Neuropsychological Testing Automated Battery: Performance in 4-to 12-year-old children. Developmental Neuropsychology, 22(3), 595–624.
  • Luciana, M., & Nelson, C. A. (1998). The functional emergence of prefrontally-guided working memory systems in four-to eight-year-old children. Neuropsychologia, 36(3), 273–293.
  • Mammarella, I. C., Cornoldi, C., Pazzaglia, F., Toso, C., Grimoldi, M., & Vio, C. (2006). Evidence for a double dissociation between spatial-simultaneous and spatial-sequential working memory in visuospatial (nonverbal) learning disabled children. Brain and Cognition, 62(1), 58–67.
  • Mammarella, I. C., Pazzaglia, F., & Cornoldi, C. (2008). Evidence for different components in children’s visuospatial working memory. British Journal of Developmental Psychology, 26(3), 337–355.
  • Mitrushina, M., Boone, K. B., Razani, J., & D’Elia, L. F. (2005). Handbook of normative data for neuropsychological assessment. New York, NY: Oxford University Press.
  • Mokrushyna, H. (2015). Decentralization reform in Ukraine. Retrieved from http://www.counterpunch.org/2015/08/28/decentralization-reform-in-ukraine/
  • Noble, K. G., McCandliss, B. D., & Farah, M. J. (2007). Socioeconomic gradients predict individual differences in neurocognitive abilities. Developmental Science, 10(4), 464–480.
  • Norbury, C. F., & Sparks, A. (2013). Difference or disorder? Cultural issues in understanding neurodevelopmental disorders. Developmental Psychology, 49(1), 45.
  • Osborne, J. W. (2010). Improving your data transformations: Applying the Box-Cox transformation. Practical Assessment, Research & Evaluation, 15(12), 1–9.
  • Paule, M. G., Bushnell, P. J., Maurissen, J. P., Wenger, G. R., Buccafusco, J. J., Chelonis, J. J., & Elliott, R. (1998). Symposium overview: The use of delayed matching-to-sample procedures in studies of short-term memory in animals and humans. Neurotoxicology and Teratology, 20(5), 493–502.
  • Perlman, S. B., Huppert, T. J., & Luna, B. (2015). Functional near-infrared spectroscopy evidence for development of prefrontal engagement in working memory in early through middle childhood. Cerebral Cortex, 26(6), 2790–2799.
  • Richardson, J. T. E. (2005). Knox’s cube imitation test: A historical review and an experimental analysis. Brain and Cognition, 59(2), 183–213.
  • Roque, D. T., Teixeira, R. A. A., Zachi, E. C., & Ventura, D. F. (2011). The use of the Cambridge Neuropsychological Test Automated Battery (CANTAB) in neuropsychological assessment: Application in Brazilian research with control children and adults with neurological disorders. Psychology & Neuroscience, 4(2), 255–265.
  • Smith, P. J., Need, A. C., Cirulli, E. T., Chiba-Falek, O., & Attix, D. K. (2013). A comparison of the Cambridge Automated Neuropsychological Test Battery (CANTAB) with traditional neuropsychological testing instruments. Journal of Clinical and Experimental Neuropsychology, 35(3), 319–328.
  • Laerd Statistics. (2015). Statistical tutorials and software guides. Retrieved from https://statistics.laerd.com/
  • Strauss, E., Sherman, E. M., & Spreen, O. (2006). A compendium of neuropsychological tests: Administration, norms, and commentary. New York, NY: Oxford University Press.
  • The World Bank. (2016). Rural population (% of total population) by country. Retrieved from https://data.worldbank.org/indicator/SP.RUR.TOTL.ZS?locations=UA
  • Ukraine Channel. (2017). Retrieved from http://www.ukraine.com/education/
  • Van Breukelen, G. J. P., & Vlaeyen, J. W. S. (2005). Norming clinical questionnaires with multiple regression: The Pain Cognition List. Psychological Assessment, 17(3), 336–344.
  • Van der Elst, W., Hurks, P., Wassenberg, R., Meijs, C., & Jolles, J. (2011). Animal Verbal Fluency and Design Fluency in school-aged children: Effects of age, sex, and mean level of parental education, and regression-based normative data. Journal of Clinical and Experimental Neuropsychology, 33(9), 1005–1015.
  • Van der Elst, W., van Boxtel, M. P., van Breukelen, G. J., & Jolles, J. (2005). Rey’s verbal learning test: Normative data for 1855 healthy participants aged 24-81 years and the influence of age, sex, education, and mode of presentation. Journal of the International Neuropsychological Society : JINS, 11(3), 290–302.
  • Van der Elst, W., Van Boxtel, M. P., Van Breukelen, G. J., & Jolles, J. (2008). Detecting the significance of changes in performance on the Stroop Color-Word Test, Rey's Verbal Learning Test, and the Letter Digit Substitution Test: the regression-based change approach. Journal Of The International Neuropsychological Society, 14(1), 71–80.
  • Vinţan, M. A., Palade, S., Cristea, A., Benga, I., & Muresanu, D. F. (2012). A neuropsychological assessment, using computerized battery tests (CANTAB), in children with benign rolandic epilepsy before AED therapy. Journal of Medicine and Life, 5(1), 114.
  • Wassenberg, R., Hendriksen, J. G., Hurks, P. P., Feron, F. J., Vles, J. S., & Jolles, J. (2008). Speed of language comprehension is impaired in ADHD. Journal of Attention Disorders, 13(4), 374–385.
  • White, K. R. (1982). The relation between socioeconomic status and academic achievement. Psychological Bulletin, 91(3), 461.
  • Woods, D. L., Wyma, J. M., Yund, E. W., Herron, T. J., & Reed, B. (2015). Factors influencing the latency of simple reaction time. Frontiers in Human Neuroscience, 9, 131.
  • Zhou, M. (1998). Empirical distributions. Retrieved from http://www.ms.uky.edu/%7Emai/java/stat/EmpDis.html

Appendix

Table A1. Normative data for the DMS Accuracy (Total Correct) score, stratified by age and level of parental education (LPE).

Table A2. Normative data for the DMS Accuracy Covert (All Delays) score, stratified by age and level of parental education (LPE).

Table A3. Cumulative percentages for the DMS Accuracy Overt (Simultaneous) score, stratified by age, for scores of 4 correct and lower (the remaining children achieve the maximum score of 5 correct, i.e., 100%).

Table A4. Normative data for the DMS Response Latency (accuracy and accuracy covert condition) stratified by sex.

Table A5. Normative data for the DMS Response Latency overt (simultaneous) condition stratified by age and sex.

Table A6. Normative data for the DMS probability of an error after an incorrect response score, stratified by age and level of parental education (LPE).