Relations between Executive Functions, Theory of Mind, and Functional Outcomes in Middle Childhood

Pages 518-536 | Received 07 Feb 2021, Accepted 27 Sep 2021, Published online: 12 Oct 2021

ABSTRACT

This study examined whether hot and cool executive functions (EFs) differentially predicted functional outcomes and the independent and mediating roles of theory of mind (ToM). 126 children completed tests of hot and cool EF, ToM, intelligence, and academic achievement. Parents completed questionnaires of peer problems and prosocial behavior. Hot and cool EFs differentially predicted intelligence and academic achievement, supporting a hot-cool distinction. ToM predicted word reading and prosocial behavior but did not mediate any associations between EF and functional outcomes. Findings contribute to current understandings of EF and its relationship with functional outcomes in middle childhood.

Executive function (EF) broadly refers to a range of cognitive processes that support complex, goal-directed behavior, especially in situations that are novel or that require conscious effort or control (Zelazo, Muller, Frye, & Marcovitch, Citation2003). EF has been linked with many functional outcomes including general intelligence (Ackerman, Beier, & Boyle, Citation2005; Friedman et al., Citation2006), academic and work performance (Gathercole, Pickering, Knight, & Stegmann, Citation2004; St Clair-Thompson & Gathercole, Citation2006), and social competence (Ciairano, Visu-Petra, & Settanni, Citation2007; Clark, Prior, & Kinsella, Citation2002; Nigg, Quamma, Greenberg, & Kusche, Citation1999). However, previous studies focused mainly on cool EF, and few have examined the relation between hot EF and these functional outcomes. Moreover, most of these studies have examined these relations in early childhood (Best, Miller, & Jones, Citation2009; Zelazo et al., Citation2003), adolescence (Seguin, Arseneault, & Tremblay, Citation2007), and adulthood (Zelazo, Craik, & Booth, Citation2004), leaving a gap in the evidence for middle childhood. Thus, the current study aimed to investigate how cool and hot EF relate to these outcomes in 5- to 12-year-old children.

The development of EF processes follows a protracted course, improving throughout childhood into adolescence and young adulthood (Anderson, Citation2002; Wilson, Andrews, Hogan, Wang, & Shum, Citation2018) and then declining in older age (Zelazo et al., Citation2004). EF can be divided into two forms based on motivational significance: cool and hot EF (Hongwanishkul, Happaney, Lee, & Zelazo, Citation2005). Cool EFs are involved when problem solving is relatively abstract and decontextualized, whereas hot EF processes are used when decision-making involves events that have emotionally significant consequences (i.e., meaningful losses or rewards; Zelazo et al., Citation2003).

Cool EF is considered to have three components, which are working memory, inhibition, and shifting (Miyake et al., Citation2000). Working memory is the active manipulation of information and the ability to monitor incoming information and update representations to maintain task-relevant information, rather than passive temporary storage of information. Inhibition is the ability to suppress a dominant response when necessary. Shifting is the ability to switch between tasks, strategies, or mental sets, sometimes referred to as cognitive flexibility or switching (Crone, Ridderinkhof, Worm, Somsen, & van der Molen, Citation2004). On the other hand, hot EF involves decision-making within a motivational context. It can be measured by gambling tasks in which rewards are won and lost (Crone & van der Molen, Citation2004) or tasks testing the extent to which individuals can delay gratification (Wilson, Andrews, & Shum, Citation2017).

EFs and important functional outcomes

The main purpose of the current research was to examine the relations between cool and hot EF on the one hand and the functional outcomes of intelligence (fluid, crystallized), academic achievement (word reading, numerical operations), and social competence (peer problems, prosocial behavior) on the other hand. The research is based on the framework proposed by Weimer et al. (Citation2021), which is based on their extensive review of existing research. The model integrates the constructs of self-regulation (including cool EF and hot EF), theory of mind (ToM) and functional outcomes, such as those examined in the current research. It also recognizes the influence of sociocultural, linguistic, and contextual factors and neurodevelopmental cascades. A key aspect of the framework for our current purposes is that the associations between self-regulation and functional outcomes are mediated by ToM. ToM is the understanding that individuals have internal mental states and that these mental states form the basis of their own and others’ behaviors, feelings and reactions (Woodruff & Premack, Citation1978). Aspects of ToM are present in infancy and development proceeds through early childhood (Andrews et al., Citation2003; Wellman & Liu, Citation2004) and continues through middle childhood and into adolescence (Devine & Hughes, Citation2013), although these latter periods are less well researched than the preschool years.

EF and intelligence

Despite ongoing debate surrounding the precise nature and strength of the relation between EF and intelligence, it is generally understood that the two constructs are related (Ackerman et al., Citation2005). Fluid intelligence (Gf) allows individuals to reason and problem-solve without having to rely on other knowledge and is often measured using tests of inductive relational reasoning (Todd, Andrews, & Conlon, Citation2019). The prefrontal cortex is one brain region that supports performance on tests of Gf (e.g., Raven’s Progressive Matrices; Todd et al., Citation2019), which is also believed to support cool EF (Gray, Chabris, & Braver, Citation2003). Therefore, the two abilities are interconnected but also separable. Crystallized intelligence (Gc), on the other hand, is associated with stored knowledge (Cattell & Horn, Citation1978) and may be impacted by additional factors such as access to formal education or socioeconomic status. Therefore, it is important to consider and assess these intelligence types separately to determine if EF is a stronger predictor for one over the other or if EF has equal predictive strength. While it is expected that a strong relation between cool EF and intelligence exists, the relation between hot EF and intelligence is less clear.

Previous research has outlined that working memory (Ackerman et al., Citation2005) and inhibition (Dempster, Citation1991), both cool EF components, are associated with intelligence. However, other research in adults with brain injury (Bar-on, Tranel, Denburg, & Bechara, Citation2003) and 3- to 5-year-old children (Hongwanishkul et al., Citation2005) found no relation between hot EF and intelligence. Most existing studies assessing the relation between hot EF and intelligence in adults have found that Gf is correlated with performance on gambling tasks (Demaree, Burns, & DeDonno, Citation2010). However, when assessed in children, fluid intelligence did not contribute to performance on a gambling task (Crone & van der Molen, Citation2004).

It was therefore anticipated that cool EF would be a stronger predictor of intelligence than hot EF. The Cattell-Horn-Carroll (CHC) theory (Alfonso, Flanagan, & Radwan, Citation2005) proposes that the relation between intelligence and EF should be stronger for Gf than for g (a unitary conception of intelligence) or Gc (Ackerman et al., Citation2005). Friedman et al. (Citation2006) reported that EF was related similarly to Gf, Gc, and g in an adult sample; however, when discussed in developmental terms, Gf is believed to support knowledge acquisition (Gc) over time. Therefore, it was anticipated that a stronger relation would exist between EF and Gf than between EF and Gc in a sample of children aged 5 to 12 years. It is unclear whether any associations between EF and intelligence are mediated by ToM and whether mediation effects differ for cool versus hot EF and for Gf versus Gc. Mediation effects would be consistent with Weimer et al.’s (Citation2021) model.

EF and academic achievement

The importance of examining hot and cool EF in children and how they relate to important outcomes like academic skills and behavior problems is increasingly being acknowledged (Kim, Nordling, Yoon, Boldt, & Kochanska, Citation2013). The identification of specific components of EF that are imperative for academic success may lead to better-targeted interventions. This knowledge could be used to shape programs and develop resources, as well as direct educational funding to support EF development in order to improve academic achievement. When examining overall EF in preschool children aged 5 to 8 years, EF accounted for substantial variability in mathematical, reading, and spelling achievement two years later, with low EF being specifically associated with significant academic disadvantage in early school years (Röthlisberger, Neuenschwander, Cimeli, & Roebers, Citation2013). Therefore, EF may be considered a marker of risk for academic disabilities. The majority of the previous research conducted on EF and academic achievement has focused on cool EF, with little research focusing on hot EF due to the relatively short history of this construct. Previous research examining the relation between cool EF and a number of academic domains including mathematics, science, and reading has found that cool EF significantly predicts academic performance (Brock, Rimm-Kaufman, Nathanson, & Grimm, Citation2009; Gathercole & Pickering, Citation2000; Gathercole et al., Citation2004; Sasser, Bierman, & Heinrichs, Citation2015; St Clair-Thompson & Gathercole, Citation2006; van der Sluis, de Jong, & Van Der Leij, Citation2007). One study assessed 173 kindergartners (approximate ages 5–6 years), reporting that cool EF predicted mathematical achievement, learning-related classroom behaviors, and observed engagement; however, hot EF did not predict any achievement or behavior outcomes when examined concurrently with cool EF (Brock et al., Citation2009).

Due to the small number of studies and mixed findings, the relation between academic achievement and hot EF is less clear, especially in middle childhood. Some previous research outlines a relation between academic achievement and the ability to delay gratification when examined using a sample of preschoolers (M age = 53.2 months, SD = 5.3 months at initial assessment; M age = 189.1 months, SD = 21.0 months at follow-up) longitudinally (Mischel, Shoda, & Rodriguez, Citation1989), and concurrently during adolescence (Study 1 M age = 13.4, SD = 0.37; Study 2 M age = 13.8, SD = 0.51; Duckworth & Seligman, Citation2005). As noted, hot EF tasks such as delay of gratification involve a motivational component and this might underpin the associations with academic achievement observed in these studies. However, several other studies did not find any significant relation between academic achievement and hot EF (Brock et al., Citation2009; Kim et al., Citation2013).

Therefore, it was predicted that, in a sample of 5- to 12-year-olds, cool EF would be a stronger predictor of academic achievement, as measured by word reading and mathematics, than hot EF. However, because previous findings with younger children are mixed, hot EF may still contribute to academic achievement: as children become older, environmental supports are gradually removed and they are required to manage their own academic workload (Brock et al., Citation2009). The current study will contribute to this important topic and provide some clarification on the influence of cool and hot EF on academic achievement.

ToM has been shown to significantly predict reading skills, specifically letter knowledge, during preschool (ages 3 years 9 months to 5 years 8 months; Blair & Razza, Citation2007), but there was no relation to mathematical skill. A study involving older children (7- to 10-year-olds) reported that ToM was associated with reading comprehension, but not with mathematics achievement (Cantin et al., Citation2016). This issue will be explored in the current study and the possible role of ToM as a mediator of any associations between EF and academic achievements will be assessed.

EF and social competence

In the model proposed by Weimer et al. (Citation2021), EF is a component of self-regulation. Self-regulation within social settings is important for the development of social competence (Green & Rechis, Citation2006). The ability to select and engage in socially competent behaviors is necessary for forming healthy relationships with peers and cooperating with others, and is supported by EF (McClelland et al., Citation2007). Social competence is assessed using various methods, including ratings from significant others (i.e., parents, teachers, and peers; Nigg et al., Citation1999), behavioral or observational measures (e.g., Ciairano et al., Citation2007), and measures of sociometric status (e.g., Bosacki & Astington, Citation1999).

Previous research suggests that there is an association between EF and social outcomes. A longitudinal study (N = 164) followed children from preschool (M age = 4.49, SD = 0.31) through to the third grade (Sasser et al., Citation2015). EF in preschool was found to significantly predict later math skills, academic functioning, and social competence. Better EF performance is associated with higher ratings of social competence (Clark et al., Citation2002). Equally, deficits in EF are often associated with socially inappropriate behavior (e.g., aggression; Raaijmakers et al., Citation2008). Inhibition, a cool EF component, has often been associated with multiple indices of social competence, such as cooperative problem solving (Ciairano et al., Citation2007; McClelland et al., Citation2007) and teacher-rated social competence (Nigg et al., Citation1999) during both preschool and middle childhood. Riggs, Blair, and Greenberg (Citation2004) found that when studied longitudinally, deficits in cool EF preceded the onset of problem behaviors reported by parents and teachers.

Limited research has been conducted that directly compares hot and cool EF and their relation to social outcomes. Theoretically, social problem solving should be strongly predicted by hot EF due to the inherent motivational salience of social situations (Zelazo, Qu, & Muller, Citation2005). This notion is supported by previous research findings that delayed gratification significantly predicted behavior problems when reported by mothers, fathers, and teachers, whereas inhibition and shifting, both cool EF measures, did not (Kim et al., Citation2013). Olson (Citation1989) reported that both hot (delay of gratification) and cool (inhibition) measures significantly predicted negative peer reports for preschool children (M age = 4 years 8 months, SD = 10 months); however, the inability to delay gratification was a stronger predictor than motor and cognitive inhibition deficits.

The previous evidence suggests that both hot and cool EF are important in the development of social competence. In particular, the cool EF component of inhibition (Ciairano et al., Citation2007; McClelland et al., Citation2007; Nigg et al., Citation1999) and the hot EF component of delayed gratification (Olson, Citation1989) are important predictors. ToM also predicts social competence (Bosacki & Astington, Citation1999; Razza & Blair, Citation2009), and its role as a mediator of other associations seems plausible. When considering academic achievement and social outcomes jointly, it is expected that both hot and cool EFs would contribute to both domains during middle childhood. However, it would be expected that cool EF would show a stronger relation with academic performance (Brock et al., Citation2009; St Clair-Thompson & Gathercole, Citation2006), while hot EF would show stronger relations with social functioning measures due to the inherent motivational salience of social relationships and in line with previous research that shows a relation between social competence and emotional intelligence (Song et al., Citation2010). Therefore, based on the previous research and theory-based predictions, it was anticipated that hot EF would be a stronger predictor of social competence than cool EF in children aged 5 to 12 years. It was predicted that hot EF would be positively associated with prosocial behavior and negatively associated with peer problems.

Aims and hypotheses

Little research has been conducted to examine the associations between hot and cool EF and functional outcomes in middle childhood. The current research will link hot and cool EF with functional outcomes (i.e., intelligence, academic performance, and social functioning), thereby contributing to the understanding of EF function and development during middle childhood. Inclusion of ToM as a potential mediator of the associations between EF and functional outcomes has the potential to clarify the mixed findings of previous research. Fluid and crystallized intelligence will be examined separately to determine if EF is a stronger predictor of one over the other. The current study aimed to determine whether hot and cool EFs differentially predict functional outcomes of intelligence, academic achievement, and social competence in children aged 5 to 12 years, and the extent to which ToM mediates the relations. It was hypothesized that hot and cool EF would show different association patterns with important functional outcomes. Specifically, cool EF was hypothesized to be a stronger predictor of intelligence and academic achievement, whereas hot EF was expected to explain more variance in social competence.

Method

Participants

One hundred and twenty-six children (67 females) aged 5–12 years (M = 8.43, SD = 2.12) participated in the current study. All participants attended a suburban Brisbane primary school in Australia and were recruited via an information package sent home with students by the school. A parent or guardian was recruited alongside each child and provided additional information. Prior to the assessment, parents reported that their child had no history of brain injury, or diagnosis of learning or behavioral disorder.

The mean parental socioeconomic status (SES; based on the primary household wage earner) for the entire sample was 3.53 out of 7 (SD = 0.92), indicating a predominantly middle-class sample (Scale of Occupational Prestige; lower numbers reflect higher SES; Daniel, Citation1983). The school’s Index of Community Socio-Educational Advantage (ICSEA; available from the My School website; www.myschool.edu.au) was 1118, somewhat higher than the national average of 1000. Despite this score, the school cohort was made up of a broad distribution of families from all income groups.

Materials

Executive function

Cool EF measures

The Spatial Working Memory Task (SWM) from the Cambridge Neuropsychological Test Automated Battery (CANTAB; Cambridge Cognition, Citation2006) was used to measure working memory. The task required individuals to search through boxes displayed on a computer screen to find squares hidden under the boxes. The task consisted of 12 trials, with varying difficulty (i.e., four, six, or eight boxes). Each square was hidden under a predetermined, but seemingly random box. Once a square had been found under the box it would not be found under the same box in the same trial. Therefore, participants needed to remember where they had previously searched and avoid searching under the same boxes. If a participant returned to a previously searched empty box a within-search error was recorded, while a between-search error was recorded when a participant returned to a box where they had already found a square. An overall measure of working memory efficiency was calculated as the total error score, with higher scores indicating poorer performance.
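The error definitions above can be illustrated with a short Python sketch. It is only an illustration under stated assumptions, not the CANTAB scoring code: the function name `swm_errors`, the data layout (one trial is a list of searches, and each search is the list of box indices touched, ending with the box where the square was found), and the example touches are all hypothetical.

```python
def swm_errors(searches):
    """Count within- and between-search errors for one trial.

    Assumed layout (hypothetical): `searches` is a list of searches; each
    search is a list of touched box indices whose last entry is the box
    where the square was found.
    """
    found_boxes = set()   # boxes that have already yielded a square in this trial
    within = 0
    between = 0
    for touches in searches:
        opened = set()    # boxes already opened during the current search
        for box in touches:
            if box in found_boxes:
                between += 1      # returned to a box that already held a square
            elif box in opened:
                within += 1       # returned to an empty box within the same search
            opened.add(box)
        found_boxes.add(touches[-1])
    return within, between


# Hypothetical four-box trial with three searches.
trial = [[0, 1, 1, 2], [2, 3], [0, 0, 1]]
within, between = swm_errors(trial)
total_errors = within + between   # higher totals indicate poorer performance
print(within, between, total_errors)
```

In this sketch the total is simply the sum of the two error types; the software may handle touches that qualify as both kinds of error differently.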

The Stop Signal Test (SST) from the CANTAB was used to measure inhibition. Participants were instructed to respond to stimuli (arrows) presented on a computer screen as quickly as possible, but to withhold a response when the arrow was paired with an auditory beep tone. Participants were to press the left button when a left-pointing arrow was displayed and the right button when a right-pointing arrow was displayed. If the arrow was accompanied by an auditory stop tone, participants were to refrain from pressing any button and wait for the next trial to begin. In total the task was made up of five blocks, each containing 64 trials, with 25% of the trials accompanied by an auditory stop signal. When an incorrect left-right judgment was made, the word ‘wrong’ was displayed on the screen. No feedback was provided for correct responses or stop trial commission errors. The length of the delay prior to the stop signal was dynamically adjusted by the program based on previous performance of the participant. Specifically, the stop signal delay (SSD) decreased after unsuccessful inhibition and increased after successful inhibition based on a staircase method. As a result, the probability of successful inhibition over trials was around 0.5 for each participant. The stop signal response time (SSRT) was calculated by subtracting the arithmetic mean of the measured SSD at which the participant was able to stop 50% of the time (SSD 50%) from the median response time for the go trials. Therefore, the SSRT quantified the covert stopping process and provided an index of the efficiency of inhibitory control, with higher scores indicating poorer performance.
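The staircase adjustment and the SSRT calculation described above can be sketched as follows. This is a minimal illustration rather than the CANTAB algorithm: the 50 ms step size, the function names, and the example response times are assumptions chosen for the sketch.

```python
from statistics import mean, median


def update_ssd(ssd_ms, stopped, step_ms=50):
    """One staircase step (step size is an assumed value).

    The delay gets longer (harder to stop) after a successful stop and
    shorter (easier to stop) after a failed one, never dropping below zero.
    """
    if stopped:
        return ssd_ms + step_ms
    return max(0, ssd_ms - step_ms)


def ssrt(go_rts_ms, ssds_ms):
    """SSRT = median go response time minus the mean stop signal delay."""
    return median(go_rts_ms) - mean(ssds_ms)


# Hypothetical data: go-trial response times and the SSDs tracked by the staircase.
go_rts = [520, 480, 610, 550, 500]
ssds = [250, 300, 250, 300]
print(ssrt(go_rts, ssds))  # 520 - 275 = 245 ms
```

Because the staircase holds the stopping rate near 50%, the mean SSD approximates the delay at which the stop process and the go process finish at about the same time, which is why subtracting it from the typical go response time gives an estimate of how long stopping takes.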

The Intra-Extra Dimensional Shifting Task (IED) from the CANTAB was used to measure attentional set shifting. Four boxes and two colored shapes that changed location on each trial were displayed on the computer screen. Participants were required to select one shape at random for their initial choice. After each choice, feedback was provided and the participants had to use the feedback to discern the sorting rule. When the sorting rule had been learned to criterion (i.e., six consecutive correct responses), the rule was changed. Feedback was provided to the participants indicating that their previously correct choice was now incorrect (i.e., the other colored shape was now the correct choice). Intradimensional shifts were reflected by the rule changes and occurred in stages 1 through 7. White lines were added to the display in stage 3 and remained through to stage 7. These lines remained irrelevant to the task; therefore, the participants were required to filter out the irrelevant and distracting information to successfully complete the remaining stages. The rule changed in the final two stages (8 and 9) so that the white lines were now the relevant sorting dimension, reflecting extra-dimensional shift. Therefore, the participant needed to recognize that the white lines were now relevant to the task and stop using the color of the shapes as the relevant sorting dimension. The test was discontinued at any stage if the sorting rule was not discerned within 50 trials. Shifting ability was assessed as total errors adjusted for the number of stages completed, with possible scores ranging from 0 to 225.

Hot EF measures

The Gambling Task from the CANTAB (Cambridge Cognition, Citation2006) was used to assess decision-making in a motivational context. Starting with 100 points, participants were instructed to accumulate as many points as possible. Red and blue boxes were presented on the computer screen in each trial. Participants were required to guess which box (red or blue) contained a yellow token. After participants made their selection, they wagered a percentage of their points based on how confident they were in their decision. The wagering stakes were presented one at a time, either in ascending (5%, 25%, 50%, 75%, 95%) or descending order (95%, 75%, 50%, 25%, 5%), and participants selected one value. The task consisted of 4 blocks, each with 9 trials in each of the ascending and descending conditions (total of 72 trials). Blocks were discontinued if points reached 1 or 0. A decision quality score was derived from the proportion of trials in which the child chose the more likely option, with scores ranging from 0 to 1. A risk adjustment score provided a measure of the tendency to bet a higher proportion of points when a large majority of the boxes were of the chosen color, ranging from −4.6 to 4.6 with higher scores indicating better performance. The delay aversion score provided a measure of capacity to delay responses by subtracting the mean proportion of gambled points on ascending trials from that on descending trials, with scores ranging from −.90 to .90.
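Two of the scores described above follow directly from the trial records and can be sketched in Python. The record layout (dictionaries with `order`, `bet`, and `chose_likely` keys), the function names, and the example trials are hypothetical; the risk adjustment score is omitted because its formula is not given in the text.

```python
from statistics import mean


def decision_quality(trials):
    """Proportion of trials on which the more likely colour was chosen (0 to 1)."""
    return mean(1.0 if t["chose_likely"] else 0.0 for t in trials)


def delay_aversion(trials):
    """Mean bet on descending trials minus mean bet on ascending trials."""
    ascending = [t["bet"] for t in trials if t["order"] == "ascending"]
    descending = [t["bet"] for t in trials if t["order"] == "descending"]
    return mean(descending) - mean(ascending)


# Hypothetical records; bets are proportions of current points (.05 to .95).
trials = [
    {"order": "ascending", "bet": 0.25, "chose_likely": True},
    {"order": "ascending", "bet": 0.50, "chose_likely": True},
    {"order": "descending", "bet": 0.75, "chose_likely": False},
    {"order": "descending", "bet": 0.95, "chose_likely": True},
]
print(decision_quality(trials))
print(round(delay_aversion(trials), 3))
```

A child who always takes the first stake offered would bet .95 on descending trials and .05 on ascending trials, which gives the maximum delay aversion score of .90 reported above.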

The Gift Delay Task (Wilson et al., Citation2017) was used in the current research to assess delayed gratification. Participants were shown a display box that was filled with gifts including novelty stationery items and toys (i.e., stickers, pens, erasers, balls etc.) and was kept in full view of the participants throughout the assessment session. First, the children were presented with a small gift box containing some gift items. Participants were not able to see the content of the box because the lid was closed; however, they were told that the box contained items similar to those in the display box. The participants were told that they could receive the small gift box now, a medium-sized gift box (containing more items) halfway through the assessment, or a large gift box (containing even more items) if they waited until the end of the assessment. If the child chose to delay, the small gift box was placed on the table, remaining visible whilst other measures were administered. Medium and large gift boxes were presented to the participants at pre-set times within the assessment session. Participants’ responses were scored 0 if they chose the small gift box, 1 if they chose the medium gift box, and 2 if they waited until the end for the large gift box; higher scores therefore indicated the ability to delay gratification for longer. If children changed their mind during the assessment and asked to open the box they had been previously offered, they were given the box during a break between tasks and scored according to the box they received.

Theory of mind

The Strange Stories Task was used to measure ToM as it has previously been used to measure the development of advanced ToM in normally developing children (O’Hare, Bremner, Nash, Happé, & Pettigrew, Citation2009). The task consists of 12 stories and includes the following scenario types: lie, white lie, joking, pretending, misunderstanding, persuasion, appearance reality, figure of speech, sarcasm, forget, double bluff, and contrary emotions. Each story contained a character (X) who said something that was untrue. Children were initially asked ‘Is it true, what X said?’ to check their understanding of the story. Secondly, ToM was assessed by asking the question, ‘Why did X say this?’ A score of 0 was given if the participant indicated incorrect or physical state responses, 1 for partial psychological state responses, and 2 for full and accurate psychological state responses. For the current study, the stories were independently classified according to their level of affective tone (low versus high) by two authors, with 100% agreement. Only stories with high affective tone (i.e., requiring explanations that referred to feelings and emotional states of the protagonist, for example, offense, guilt, hurt) were used in the current study to measure ToM. The ToM score was the average score for five stories (i.e., lie, white lie, sarcasm, persuasion, and contrary emotions), with higher scores indicating more advanced theory of mind.

Functional outcomes

Intellectual functioning

The Wechsler Abbreviated Scale of Intelligence (WASI; The Psychological Corporation, Citation1999) was used to obtain an estimate of intelligence. The two subtests of this instrument provide measures of Gf (Matrix Reasoning subtest) and Gc (Vocabulary subtest), and the instrument is appropriate for use with ages 6 to 89 years. Administration of the WASI is substantially shorter than that of the WISC-IV, which makes it useful for research. Acceptable correlations (.81) between the two-subtest version of the WASI and the full scale WISC-III IQ indicate that the WASI is a reliable test for estimating IQ. Additionally, inter-rater reliability generally exceeds .90, test-retest reliability for participants aged six to eleven years old is .83, and age-adjusted T scores are available. The raw scores were computed for use in all analyses.

Academic achievement

The Word Reading and Numerical Operations subtests from the Wechsler Individual Achievement Test, second edition, Australian Abbreviated version (WIAT-II; Harcourt Assessment, Citation2007), were used to measure academic ability in reading and numeracy. The Word Reading subtest required individuals to read a list of words that increased in difficulty. The initial items also assessed pre-reading skills such as letter knowledge and letter-sound associations. The second subtest, Numerical Operations, required individuals to solve paper and pencil mathematical problems that increased in difficulty. Initial items also assessed preliminary numeracy knowledge such as number identification and counting ability. The raw scores were computed for use in all analyses. Administration is appropriate for those aged 5 to 85 years and normative data are available based on a sample of 1163 Australian children/adolescents. Moderate correlations exist between the WASI and WIAT (.61 for Word Reading and .60 for Numerical Operations).

Social competence

Social competence was measured using the peer problems and prosocial behavior subscales of the Parent-reported Strengths and Difficulties Questionnaire (SDQ; Goodman, Citation2000). The questionnaire is used widely in clinical practice and research and is made up of five subscales: (a) Conduct Problems; (b) Emotional Symptoms; (c) Hyperactivity; (d) Peer Problems; and (e) Prosocial Behavior. Responses are recorded on a 3-point scale (not true, somewhat true, certainly true) and include items like ‘Rather solitary, tends to play alone’ (peer problems) and ‘Considerate of other people’s feelings’ (prosocial behavior). Normative data is available for an Australian sample, made up of 910 parents, teachers and children (aged 7 to 17 years) who were randomly sampled through Victorian government schools (Mellor, Citation2005). The parent version of the SDQ has an internal consistency of .57 for the peer problems subscale and .65 for the prosocial behavior subscale (Goodman, Citation2000).

Procedure

Participants were treated in accordance with the National Statement of Ethical Conduct in Research involving Humans, and ethical clearance was obtained from the Griffith University Ethics Committee. Written consent was obtained from both the school Principal and from each child’s parent or guardian prior to starting the study. Additionally, the study’s aims and methods were verbally described to each child and each child gave verbal assent prior to study commencement. Each child was assessed individually over two sessions (1 hr each) a week apart, to minimize fatigue. Assessments were completed in a quiet, well-lit room, free from distractions at the child’s school. The assessment battery featured tasks administered in a fixed order designed to vary task demands and modalities across successive tests, as well as maintain interest and engagement. Session 1 included the following tests in this set order: gift delay, spatial working memory, stop signal, intra-extra dimensional shifting, and Cambridge gambling task. Session 2 consisted of the following tests in this set order: WASI Matrix Reasoning and Vocabulary subtests, the Strange Stories task, and WIAT-II Word Reading and Numerical Operations subtests. After the first session, a parent questionnaire package (including the SDQ) was sent home with the children and returned via Reply Paid post.

Statistical analysis

Mediation analyses were performed using PROCESS Procedure for SPSS version 3.5. The indirect effects of the potential mediator (ToM) were tested with 5000 bootstrap samples and 95% bias-corrected confidence intervals. A mediated association is significant when the confidence interval does not contain zero. Analyses were conducted examining whether ToM mediated the associations between cool EF and each functional outcome (while controlling for age and hot EF) and between hot EF and each functional outcome (controlling for age and cool EF). The analyses yielded no evidence of significant mediation by ToM. In each analysis, the relevant confidence interval spanned zero.
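The bootstrap test of an indirect effect described above can be illustrated with a minimal sketch. This is not the PROCESS macro: it assumes a single mediator, ordinary least squares for both paths, and a simple percentile interval rather than the bias-corrected interval reported in the study. The function names (`indirect_effect`, `bootstrap_indirect`) and argument names are hypothetical and chosen only for illustration.

```python
import numpy as np


def _slopes(X, y):
    """OLS slopes for y regressed on the columns of X (intercept added, then dropped)."""
    design = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef[1:]


def indirect_effect(x, m, y, covariates):
    """Product a*b: path x -> m times path m -> y, both adjusted for covariates."""
    a = _slopes(np.column_stack([x, covariates]), m)[0]     # predictor -> mediator
    b = _slopes(np.column_stack([m, x, covariates]), y)[0]  # mediator -> outcome, given predictor
    return a * b


def bootstrap_indirect(x, m, y, covariates, n_boot=5000, seed=0):
    """Point estimate and 95% percentile bootstrap interval for the indirect effect.

    Assumption: percentile interval, which is simpler than (and may differ
    slightly from) the bias-corrected interval used in the study.
    """
    rng = np.random.default_rng(seed)
    n = len(y)
    estimates = np.empty(n_boot)
    for i in range(n_boot):
        idx = rng.integers(0, n, size=n)  # resample cases with replacement
        estimates[i] = indirect_effect(x[idx], m[idx], y[idx], covariates[idx])
    lower, upper = np.percentile(estimates, [2.5, 97.5])
    return indirect_effect(x, m, y, covariates), (lower, upper)
```

Here `x` would stand for an EF composite, `m` for ToM, `y` for one functional outcome, and `covariates` for a two-dimensional array holding age and the other EF composite. Mediation would be inferred only if the returned interval excluded zero.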

Given that there was no evidence of mediation, we instead report a series of multiple regression analyses (MRA) assessing the relations between hot and cool EF, ToM, and the functional outcomes of intelligence, academic achievement, and social competence. Initially, composite measures of hot and cool EF were used as predictors. These were then followed up with more in-depth analyses of individual EF measures which had age-partialled correlations of ≥.2 with the criterion. For the intelligence analyses, data from the 5-year-olds were excluded because a different measure of intelligence was used and raw scores could not be combined (n = 107). The net regression analysis was devised by Cohen et al. (Citation1990) and assesses whether a set of predictors has, collectively and individually, a different relation to two different criterion variables within the same sample. A net regression analysis was performed to assess whether the relation between EF and intelligence was different for Gf (measured by the Matrix Reasoning subtest) and Gc (measured by the Vocabulary subtest). All predictor and criterion variables were standardized prior to the net regression analysis.

Questionnaire responses were missing for 21 children because parents did not return the questionnaire; however, children with missing parent data did not differ from those with parent data on EF, intelligence, or academic achievement measures. Data from these participants were excluded listwise from the relevant analyses. If age and/or gender was significantly correlated with the criterion, they were included as predictors, where appropriate.

Results

Intercorrelations amongst measures

Descriptive statistics and correlation analyses for the hot and cool EF composite scores, ToM, age, gender, and functional outcome measures can be found in . To assess hot and cool EF, composite measures of hot and cool EF were constructed from the averaged standardized scores. All variables were significantly correlated with age. Gender did not significantly correlate with any variable. All other EF, ToM, intelligence, and academic achievement measures were significantly correlated.
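As an illustration of how a composite can be built from averaged standardized scores, the following is a minimal sketch. The function name `composite` and the `reverse` argument are hypothetical. Reverse-scoring error-type measures so that higher values always indicate better performance is an assumption made here for clarity and is not stated in the text; missing data are not handled.

```python
import numpy as np


def composite(scores, reverse=()):
    """Mean of z-scored task columns for each child.

    scores  : array of shape (n_children, n_tasks)
    reverse : column indices to sign-flip after standardizing
              (assumption: used for error or latency measures)
    """
    scores = np.asarray(scores, dtype=float)
    z = (scores - scores.mean(axis=0)) / scores.std(axis=0, ddof=1)
    for col in reverse:
        z[:, col] = -z[:, col]
    return z.mean(axis=1)
```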

Table 1. Correlations, means and standard deviations for age, gender, cool EF, hot EF, ToM, intelligence and academic achievement measures

EF, ToM and intelligence

Predicting crystallized intelligence

A standard MRA was used to assess whether cool EF was a stronger predictor of Gc than hot EF. The criterion was vocabulary. Age, composite measures of hot and cool EF and ToM scores were included as predictors. In combination the predictors accounted for 65.6% of the variance in Vocabulary scores, R2 = .66, adjusted R2 = .64, F(4, 101) = 48.12, p < .001, f2 = 1.91 (large effect size). Age and cool EF were significant predictors (see ), but hot EF and ToM were not.
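The reported effect sizes appear to follow Cohen's f2, the ratio of explained to unexplained variance. As a worked check under that assumption, using the unrounded proportion of variance explained in Vocabulary (65.6%):

```latex
f^2 = \frac{R^2}{1 - R^2} = \frac{.656}{1 - .656} = \frac{.656}{.344} \approx 1.91
```

The same calculation reproduces the other values reported in this section (for example, .671 / .329 ≈ 2.04 for Matrix Reasoning).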

Table 2. Multiple regression of vocabulary on age, cool EF, hot EF, and ToM, and individual EF tasks

To further analyze the source of this pattern, a subsequent MRA was conducted with age and individual EF tasks that had age-partialled correlations of .2 or greater with the criterion (Vocabulary) as predictors. Therefore, age and SSRT (r = .216, p = .027) were entered into the regression as predictors. Overall, the predictors accounted for a significant 64% of the variance in Vocabulary scores, R2 = .64, adjusted R2 = .63, F(2, 104) = 92.29, p < .001, f2 = 1.78 (large effect size). Age and inhibition (cool EF) were significant predictors of Vocabulary (see ).
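The screening step above relies on age-partialled correlations. One way to compute such a correlation is to regress both variables on age and correlate the residuals, as in the following minimal sketch. The function name `partial_corr` is hypothetical, and the sketch returns only the coefficient, not its p value.

```python
import numpy as np


def partial_corr(x, y, control):
    """Correlation between x and y after removing the linear effect of control."""
    design = np.column_stack([np.ones(len(x)), control])
    # Residualize each variable on the control variable(s).
    resid_x = x - design @ np.linalg.lstsq(design, x, rcond=None)[0]
    resid_y = y - design @ np.linalg.lstsq(design, y, rcond=None)[0]
    return np.corrcoef(resid_x, resid_y)[0, 1]
```

Under the criterion described in the text, a task would be retained when the absolute value of its partial correlation with the criterion is at least .2; treating the threshold as applying to the absolute value is an assumption, based on negative correlations also being entered.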

Predicting fluid intelligence

A standard MRA was performed to assess whether cool EF was a stronger predictor of Gf than hot EF. The criterion was Matrix Reasoning. Age, composite measures of hot and cool EF, and ToM were entered as predictors. Overall, the MRA was significant, R2 = .67, adjusted R2 = .66, F(4, 101) = 51.56, p < .001, f2 = 2.04 (large effect size), with predictors accounting for 67.1% of the variance in Matrix Reasoning. Age, cool EF, and hot EF significantly predicted Matrix Reasoning, but ToM did not (see ).

Table 3. Multiple regression of matrix reasoning on age, hot and cool EF, ToM, and individual EF tasks

To explore the source of this result pattern further, a subsequent MRA was performed with age and individual EF tasks that had age-partialled correlations of .2 or greater with the criterion (Matrix Reasoning) as predictors. Therefore, age, SWM Total Errors (r = −.349, p < .001), SSRT (r = −.232, p = .017), IED Total Errors Adjusted (r = −.311, p = .001), CGT Decision Quality (r = −.337, p < .001), and CGT Risk Adjustment (r = −.322, p = .001) were entered into the regression as predictors. Overall, the predictors accounted for 67.6% of the variability in Matrix Reasoning, R2 = .68, Adjusted R2 = .66, F(6,99) = 34.43, p < .001, f2 = 2.09 (large effect size). Age, two cool EF measures (inhibition and shifting), and two hot EF measures (gambling decision quality and risk adjustment) were significant predictors of Matrix Reasoning (see ), but ToM was not.

Comparing the effect of EF on Gf and Gc

A net regression analysis was performed to test whether the relation between EF, ToM, and intelligence was different for Gf and Gc. Firstly, vocabulary scores were regressed on age, cool EF, hot EF, and ToM. The predicted vocabulary scores were saved (Ẑ_Vocabulary), then subtracted from the standardized matrix reasoning scores (Z_Matrix Reasoning) to create the difference scores (Z_Matrix Reasoning − Ẑ_Vocabulary). The difference scores were then regressed on the original set of predictors (age, cool EF, hot EF, and ToM). The net regression analysis (overall test R2) indicates whether the two criterion variables have a different relation to the set of independent variables, and the tests of the individual beta weights indicate which predictors differ significantly, and in which direction. The MRA was significant, R2 = .21, adjusted R2 = .18, F(4, 101) = 6.78, p < .001, f2 = 0.27 (medium effect size), with the predictors accounting for 21.2% of the variability in the difference scores. Only age (B = −.20, 95% CI = [−.295, −.100], β = −.56, p < .001) had significantly different associations with Gf and Gc. The negative coefficient for age indicates that the association with age was significantly smaller for Gf than for Gc. The associations of cool EF (B = .20, 95% CI = [−.02, .43], β = .21, p = .070), hot EF (B = .18, 95% CI = [−.07, .43], β = .17, p = .15), and ToM (B = −.09, 95% CI = [−.25, .07], β = −.12, p = .29) with Gf and Gc did not differ significantly in strength.
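The net regression steps described above can be sketched as follows. This is a minimal illustration under stated assumptions: ordinary least squares throughout, sample standard deviations for standardizing, and hypothetical function names (`zscore`, `ols`, `net_regression`). It returns the coefficients and R2 only, not the significance tests.

```python
import numpy as np


def zscore(a):
    """Standardize a vector, or each column of a matrix."""
    return (a - a.mean(axis=0)) / a.std(axis=0, ddof=1)


def ols(X, y):
    """Return (coefficients including intercept, fitted values) for y on X."""
    design = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef, design @ coef


def net_regression(predictors, criterion_a, criterion_b):
    """Regress (z_a minus predicted z_b) on the standardized predictors.

    In the study's terms, criterion_a would be Matrix Reasoning and
    criterion_b Vocabulary. Returns (slopes, R squared).
    """
    Z = zscore(predictors)
    z_a, z_b = zscore(criterion_a), zscore(criterion_b)
    _, predicted_b = ols(Z, z_b)          # step 1: predicted scores for criterion b
    difference = z_a - predicted_b        # step 2: difference scores
    coef, fitted = ols(Z, difference)     # step 3: regress differences on predictors
    ss_res = np.sum((difference - fitted) ** 2)
    ss_tot = np.sum((difference - difference.mean()) ** 2)
    return coef[1:], 1.0 - ss_res / ss_tot
```

Because least squares is linear in the criterion, each slope returned here equals the predictor's slope for criterion a minus its slope for criterion b. This is why a negative coefficient indicates a weaker association with the first criterion than with the second.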

EF, ToM, and academic achievement

Predicting word reading

A standard MRA was used to assess whether cool EF was a stronger predictor of academic achievement (viz., word reading) than hot EF. The WIAT word reading score was entered as the criterion, and age, composite hot and cool EF, and ToM scores were entered as predictors. In combination, the predictors accounted for a significant 77% of the variability in word reading, R2 = .77, adjusted R2 = .76, F(4, 119) = 99.55, p < .001, f2 = 3.35 (large effect size). Age, cool EF, and ToM significantly predicted word reading, but hot EF did not (see ).

Table 4. Multiple regression of word reading scores on age, cool EF, hot EF, and ToM

A subsequent MRA with age and individual EF tasks that had an age-partialled correlation of .2 or greater with the criterion (word reading) as predictors was conducted to further explore the source of this result. Age, SWM total errors (r = −.29, p = .002), SSRT (r = −.26, p = .008), IED total error adjusted (r = −.22, p = .024), gift delay score (r = −.339, p = .001), and CGT decision quality (r = .234, p = .016) were entered into the regression as predictors along with ToM (r = .245, p = .006). In combination, the predictors accounted for a significant 78.3% of the variance in word reading, R2 = .783, adjusted R2 = .77, F(7, 118) = 60.95, p < .001, f2 = 3.61 (large effect size). Age, one hot EF task (Gift Delay), and ToM significantly predicted word reading (see ).

Predicting mathematical ability

A standard MRA was used to determine whether cool EF was a stronger predictor of another measure of academic achievement (viz., mathematical ability) than hot EF. The WIAT numerical operations score was entered as the criterion variable and age, composite measures of hot and cool EF, and ToM were entered as predictors. The overall model was significant, R2 = .88, F(4, 119) = 213.385, p < .001, f2 = 7.20 (large effect size), with the predictors accounting for 87.8% of the variance in numerical operations. Only age significantly predicted mathematical ability; however, there was a trend toward significance (p = .058) for cool EF (see ).

Table 5. Multiple regressions of numerical operations scores on age, cool and hot EF, and ToM

A subsequent MRA with age and individual EF tasks that had an age-partialled correlation of .2 or greater with the criterion (numerical operations) as predictors was conducted to further explore the source of this result. Age, SWM total errors (r = −.32, p = .001), and CGT decision quality (r = .21, p = .029) were entered into the regression as predictors. In combination, the predictors accounted for a significant 88.5% of the variability in numerical operations, R2 = .89, adjusted R2 = .88, F(3, 122) = 313.31, p < .001, f2 = 7.70 (large effect size). Age and one cool EF measure (working memory) significantly predicted numerical operations (see ).

EF, ToM, and social competence

displays the descriptive statistics and correlation matrix for hot and cool EF composite scores, ToM, and parent-reported SDQ social competence (peer problems and prosocial behavior). The prosocial behavior subscale of the SDQ was significantly correlated with gender. Parents reported lower levels of prosocial behavior for boys than girls. Peer Problems on the SDQ was significantly negatively correlated with age, indicating higher levels of peer problems for younger children. No significant correlations were found between the hot or cool EF and either prosocial behavior or peer problems. ToM (sr2 = .07, p = .005) and gender (sr2 = .06, p = .011) each contributed significant independent variance to the prediction of parent-reported prosocial behavior. No significant correlations were found between the individual tasks and peer problems after controlling for age.

Table 6. Correlations, means and standard deviations for age, gender, EF, ToM, and measures of social competence

Discussion

The current study aimed to examine whether hot and cool EF predicted functional outcomes of intelligence, academic achievement, and social competence. The independent and mediating roles of ToM were also studied. Evidence showing that hot and cool EF differentially predict functional outcomes would support a hot-cool EF distinction in middle childhood and provide further understanding of EF function and its development during the middle years. The findings showed that differential prediction by hot versus cool EF depended on the functional outcome being investigated and whether composite or individual EF measures were employed. ToM did not mediate the relationships between EF and functional outcomes in the current study so in this sense our findings do not support the framework proposed by Weimer et al. (Citation2021). However, ToM did make an independent contribution to the prediction of two functional outcomes, as discussed further below.

EF, ToM, and intelligence

It was hypothesized that cool EF would be a stronger predictor of intelligence than hot EF. This hypothesis was supported when examining Gc measured via vocabulary scores. Cool EF was a stronger predictor than hot EF, contributing significant unique variance. When the individual tasks were analyzed, only age and inhibition (cool EF) significantly predicted Gc. The relation between hot and cool EF and Gf, measured by matrix reasoning, was inconsistent with the hypothesis. Both hot and cool EF significantly predicted matrix reasoning. When the individual tasks were examined further, results revealed that two cool EF components (inhibition and shifting) and two hot EF components (CGT decision quality and risk adjustment) significantly predicted Gf. Overall, the findings are in line with previous research that outlines relations between cool EF and intelligence (Ackerman et al., Citation2005; Dempster, Citation1991) but are inconsistent with research suggesting no relation between hot EF and intelligence (Bar-on et al., Citation2003; Hongwanishkul et al., Citation2005).

Consistent with previous research, inhibition for children may be an essential building block for the later development of more complex problem solving and reasoning, providing the ability to stop and think (Best et al., Citation2009). This ability to pause, reflect, and shift perspectives when necessary may also support children’s development as they learn verbal labels for the entities, objects, and concepts that they come across. It is understood that inhibition supports problem solving and fluid reasoning by suppressing irrelevant thought processes or content, thereby aiding focused attention (Dempster, Citation1991) and attention shifting. Our findings that inhibition and shifting (IED) predicted Gf are consistent with this research. Working memory and intelligence are generally considered to be highly related constructs (Ackerman et al., Citation2005), and it has previously been argued that working memory and Gf are the same construct (Kyllonen, Citation2002). In addition, working memory has been found to significantly predict intelligence in previous research (Friedman et al., Citation2006). Therefore, the finding that working memory did not predict matrix reasoning in the current study is somewhat surprising. As expected, the zero-order correlation was strong (r = .63), but the unique contribution of working memory in the multiple regression did not reach significance (p = .07) perhaps because variance was shared with other predictors.

Decision quality on the Cambridge Gambling Task (measure of the number of times the individual chooses the most likely option) significantly predicted Gf. It may be possible that decision quality reflects a cooler function than the delay aversion index which might reflect hotter aspects of the task. One study (Hongwanishkul et al., Citation2005) has found a positive association between gambling performance and the Dimensional Change Card Sorting Task (a cool EF task that requires flexible rule use). The authors noted that successful performance on the children’s gambling task might require cool EF resources (i.e., working memory) to track both losses and gains over time.

Contrary to the prediction made in the current study, the relations of EF with Gc and Gf did not differ significantly: the associations of cool and hot EF with Gf and Gc did not differ significantly in strength. This finding is consistent with previous research that did not find any differences in the relations of EF with g, Gf, and Gc in an adult sample (Friedman et al., Citation2006). Therefore, the current findings provide some preliminary evidence that the relation between EF and intelligence is consistent at different points in the age span. Age contributed significantly more to Gc than Gf, which is consistent with a pattern of increasing knowledge with age, independent of EF and possibly reflecting the effects of acculturation and participation in a uniform system of formal education, and consistent with the CHC theory (Cattell & Horn, Citation1978).

Although ToM was significantly associated with Gc and Gf () it did not account for variance in either aspect of intelligence independently of age, cool EF, and hot EF () and there was no support for ToM as a mediator of the association between EF and intelligence.

EF and academic achievement

It was hypothesized that cool EF would be a stronger predictor than hot EF for both word reading and mathematics. This hypothesis was supported in relation to word reading when composite measures of cool EF and hot EF were employed (). Cool EF and ToM each significantly predicted word reading scores, but hot EF did not. This current finding is consistent with a growing body of literature (Gathercole & Pickering, Citation2000; St Clair-Thompson & Gathercole, Citation2006; van der Sluis et al., Citation2007) that outlines associations between cool EF (working memory and shifting) and word reading. However, a different pattern emerged when individual tasks were employed as predictors. While no individual measure of cool EF (working memory, inhibition, shifting) made a significant unique contribution, gift delay, a measure of hot EF, did make a significant unique contribution to prediction of word reading, consistent with Duckworth and Seligman (Citation2005). The current study highlights the importance of including both hot and cool EF.

ToM made a significant unique contribution to word reading in both analyses reported in . This is consistent with previous research that demonstrated links between ToM and aspects of language (Astington & Jenkins, Citation1999; Carlson, Moses, & Breton, Citation2002) and literacy in young children (Blair & Razza, Citation2007) and older children (Cantin et al., Citation2016).

The hypothesis was not supported for mathematics in that neither hot nor cool EF composite measures significantly predicted mathematics scores when age and ToM were controlled. The unique contribution of cool EF approached, but did not reach, significance (p = .058). When the individual tasks were entered, age and working memory (cool EF) were found to significantly predict mathematics, providing partial support for the hot-cool distinction. This is consistent with previous research (van der Sluis et al., Citation2007) that included multiple cool EF measures but found that only working memory was significantly related to arithmetic. It is likely that working memory is important to mathematical problem solving because facts and strategies that are stored in long-term memory need to be manipulated, thereby placing considerable demand on information processing and storage (Gathercole et al., Citation2004). For example, problems with multiple steps (e.g., serial addition) require information to be updated in working memory until the ultimate solution is reached.

ToM did not account for significant unique variance in mathematics performance beyond age, hot EF, and cool EF. This contrasts with the significant contribution of ToM to word reading, the other aspect of academic achievement assessed in the current study. This difference is reminiscent of other research in which ToM was associated with letter knowledge (Blair & Razza, Citation2007) and reading comprehension (Cantin et al., Citation2016) but not with mathematics skills. The capacity to infer mental states of others (ToM) appears to facilitate language comprehension generally and word reading more specifically perhaps due to overlap in the brain regions or networks that underpin mentalizing and reading (Weimer et al., Citation2021).

EF, ToM, and social competence

It was hypothesized that hot EF would be a stronger predictor of social competence (both prosocial behavior and peer problems) than cool EF. These hypotheses were largely unsupported as neither hot nor cool EF were significant predictors of either outcome measure. However, ToM significantly predicted prosocial behavior after gender was controlled. Children who demonstrated more advanced understanding of the internal mental states of others in the Strange Stories task demonstrated more prosocial behaviors. This is consistent with the view that ToM plays a central role in social interactions (Weimer et al., Citation2021). ToM incorporates the ability to take account of other people’s perspectives, a skill that would likely enhance prosocial behavior.

No individual tasks significantly correlated with peer problems after controlling for age. The previous literature has generally found strong relations between EF and social competence measures (including peer- and teacher-ratings and observational measures of cooperative problem solving; Bosacki & Astington, Citation1999; Ciairano et al., Citation2007; McClelland et al., Citation2007; Nigg et al., Citation1999; Olson, Citation1989; Razza & Blair, Citation2009). One explanation for the discrepancy in findings is that the current study used a parent-report measure. Parents of school-aged children are likely to have fewer opportunities to directly observe their children in social situations with peers than teachers or the peers themselves. Indeed, the frequency of parent-reported peer problems in the current study was quite low, in that 63.8% of parents reported that their child had zero problems or a single peer problem. Parents’ lack of awareness of their children’s problems with peers might contribute to the low reliability of the peer problems subscale (Goodman, Citation2000). Alternative social competence measures (e.g., teacher reports or measures of socioeconomic status) were beyond the scope of the current study for practical reasons (to minimize impact on the participating school). Therefore, no clear conclusions can be drawn from the current study about the relation between EF and social competence.

Implications

Overall, the results of the current study suggest that hot EF, cool EF, and ToM relate differently to the functional outcomes of intelligence, academic achievement, and social competence. Cool EF was strongly associated with both crystallized and fluid intelligence, and with the academic ability of word reading, while hot EF significantly predicted fluid intelligence, and ToM predicted word reading and prosocial behaviors. The predictive power of neither hot nor cool EF was stronger for fluid intelligence than for crystallized intelligence. However, age predicted crystallized intelligence more strongly than it predicted fluid intelligence, consistent with CHC theory (Cattell & Horn, Citation1978).

The findings are relevant to the framework proposed by Weimer et al. (Citation2021), which links self-regulation with functional outcomes. The observed associations between EF and intelligence and academic achievement are consistent with that model because hot EF and cool EF are components of self-regulation. Other components of self-regulation, effortful control and emotional self-regulation, were not included in the current research. The observed associations between ToM and the functional outcomes of word reading and prosocial behaviors are also consistent with the model. However, we did not find support for the claim that ToM mediates the associations between self-regulation and functional outcomes. Evidence for mediation might emerge if all components of self-regulation (rather than EF only) were assessed or if ToM was assessed using a broader range of measures. These are avenues for future research.

The different associations of hot and cool EF with important functional outcomes provide evidence of a hot-cool distinction in middle childhood, further highlighting the importance of these constructs and indicating that assessment should include tests of both hot and cool EF as well as fluid and crystallized intelligence. Understanding the specific cognitive abilities that predict intelligence and academic achievement is important in the development of interventions to assist children struggling at school because low EF is related to academic disadvantage in early school years and is therefore considered a marker of risk for academic disability (Röthlisberger et al., Citation2013). While the current study is cross-sectional in nature, the findings are in line with previous longitudinal research (Brock et al., Citation2009) that suggests EFs and ToM are related to academic achievement. In particular, poor working memory may be a risk factor for numerical processing difficulties, whereas poor inhibition may be a risk factor for vocabulary problems, and difficulty delaying gratification might be a risk factor for word reading problems. It would be expected that interventions that target both cool and hot EFs would be more appropriate than those that solely target cool EF.

Limitations and future directions

The current sample was recruited from within a single urban school with a predominantly middle-class community. This might limit external validity, so future research should seek to evaluate the generalizability of the current findings with a larger range of socioeconomic groups and residential settings (e.g., rural communities). As mentioned earlier, it is possible that parents lack awareness of their child’s social competence due to limited observational opportunities. Future research should therefore consider including multiple informant measures (e.g., parents, teachers, peers, and the child themselves) to obtain a clearer picture of functioning (Achenbach, Citation2006). Low reliability of the peer problems and prosocial behavior subscales might explain the absence of significant relationships between EF and social competence in our research.

Lastly, the current study has contributed to the developmental EF literature by administering hot and cool EF tasks to a middle childhood sample. Many other EF tasks exist, but including them was beyond the scope of the current study. Future research should therefore include other cool EF measures such as planning, strategy generation, and complex attention (e.g., divided attention, selective attention; Anderson, Citation2002) and other hot EF measures such as delay discounting tasks to ensure that the broad construct of EF is being examined thoroughly. The inclusion of multiple measures would allow for the examination of latent variable structures of EF, resulting in purer measures of each component and the capacity to conduct causal modeling of EF with important functional relations.

Disclosure statement

No potential conflict of interest was reported by the author(s).

Additional information

Funding

This work was supported by the Griffith University Infrastructure Research Program; Australian Postgraduate Award.

References