Developmetrics

Assessing student–teacher relationship quality in cross-cultural contexts: Psychometric properties of student–teacher relationship drawings

Pages 770-784 | Received 15 Jul 2020, Accepted 27 Jun 2021, Published online: 22 Jul 2021

ABSTRACT

The present study examined the psychometric properties of Student–Teacher Relationship Drawings (STRDs) to assess student–teacher relationship quality in a cross-cultural context. A sample of upper elementary school students from both the Netherlands (N = 752) and China (N = 574) was included. Results showed sufficient inter-rater reliabilities of all drawing dimensions for both the Dutch (.68 < ICC < .84) and Chinese samples (.72 < ICC < .85). Multiple group analyses supported partially strong invariance of STRD-dimensions across the Dutch and Chinese samples. Relationship drawing dimensions also showed moderate associations with student-reported relationship quality and engagement in both the Dutch and Chinese samples. Future cross-cultural research may therefore employ STRDs to assess students’ relationship experiences.

Ample evidence has supported the importance of affective student–teacher relationships (STRs) for students’ school adjustment (e.g. academic achievement, school engagement; Roorda et al., 2011, 2017). To assess students’ perceptions of STR quality, previous research often employed questionnaires (e.g. Chen et al., 2019; Jellesma et al., 2015). In cross-cultural studies, however, questionnaires need to be translated into different languages and students from different countries might interpret the same question (slightly) differently, which might affect their item responses (Pinto & Bombi, 2008). A promising alternative method to measure students’ relationship perceptions is Student–Teacher Relationship Drawings (STRDs; Harrison et al., 2007; McGrath et al., 2017; Zee et al., 2020). This method requires students to draw a picture of themselves and their teacher, and these drawings are then coded by independent raters. As STRDs hardly involve verbal statements, this task might facilitate cross-cultural comparisons of students’ relationship perceptions (Pinto & Bombi, 2008). Supporting this idea, children’s drawings of human figures were found to follow universal structures across cultures (Gernhardt et al., 2015).

Moreover, STRDs may capture different aspects of STRs than questionnaires do. According to attachment theorists (Pianta et al., 2003), positive STRs provide students with a secure base to comfortably explore the school environment, whereas negative STRs hamper students’ eagerness to explore. Students develop their perceptions of positive and negative STRs based on their mental representations of the relationship (i.e. their feelings, behaviours, and emotions about themselves, their teachers, and their mutual relationships; Pianta et al., 2003). In line with attachment theory, relationship questionnaires assess students’ conscious relationship perceptions through several items that usually form two relationship quality dimensions: closeness (a positive STR reflecting the warmth and open communication between student and teacher) and conflict (a negative STR reflecting ambivalence and discordance in teacher–student dyads). STRDs, however, reflect students’ unconscious feelings about STRs that underlie their conscious relationship perceptions, and can therefore assess their mental representations on a deeper level (Zee et al., 2020). Additionally, STRDs go beyond the closeness and conflict dimensions by assessing the quality of STRs across seven different dimensions, which may provide nuanced information about STRs beyond what relationship questionnaires offer (Pinto & Bombi, 2008; Zee et al., 2020).

One dimension of STRDs reflects positive STRs and parallels the closeness dimension in relationship questionnaires: pride/happiness (the presence of positive facial expressions and positive interactions between students and teachers). Five dimensions represent negative STRs, two of which are theoretically linked with the conflict dimension in relationship questionnaires: anger/tension (angry facial expressions and irritable symbols in the drawings) and bizarreness/dissociation (unusual symbols such as devils in the drawings). The other three dimensions describe aspects of negative STRs that are different from the conflict dimension. Vulnerability reflects students’ fear of their teachers, as identified by disproportionate sizes (e.g. very large teacher and very small student) and unnatural placement of figures (e.g. overlapping figures). Role reversal assesses whether students regard teachers as unreliable and lacking authority, as indicated by students drawing themselves larger than their teachers. Emotional distance/isolation reflects students’ feelings of being estranged from their teachers, indicated by physical barriers and distance between students and teachers in the drawings. Finally, vitality/creativity (the degree of creativity, colourfulness, and detail in the drawings) usually indicates positive STRs, although it can sometimes also reflect high discordance in the relationship (e.g. detailed presentations of disturbing elements like devils). As such, STRDs tap different aspects of STRs than questionnaires. In line with this argument, Harrison et al. (2007) found that relational negativity in STRDs was moderately correlated with teacher-reported closeness (r = −.28) and conflict (r = .28).

As far as we know, STRDs have only been applied in Western samples (Harrison et al., 2007; McGrath et al., 2017; Zee et al., 2020), but not in Eastern samples. The present study therefore aimed to investigate the cross-cultural application of STRDs (reliability, measurement invariance, and validity) in a sample of upper elementary school students from the Netherlands (a Western country) and China (an Eastern country). Although relationship drawings have previously been used in samples of kindergarteners (Harrison et al., 2007; McGrath et al., 2017), we focused on children in middle childhood because the quality of STRs becomes more important for older children to overcome difficulties at school (Roorda et al., 2011).

We first examined the inter-rater reliability of the seven STRD-dimensions. Second, we examined the measurement invariance of STRD-dimensions across Dutch and Chinese samples. Third, previous studies have supported the validity of STRDs (Harrison et al., 2007; McGrath et al., 2017) and students’ drawings of their classrooms (Longobardi et al., 2017) in Western samples. For instance, Harrison et al. (2007) found that relational negativity in Australian children’s STRDs was associated with less child-reported teacher acceptance, less teacher-reported closeness, and more teacher-reported conflict (as measured by questionnaires); relational negativity was also associated with more learning and behavioural problems, and lower school competence of the children. Therefore, we lastly examined the convergent validity of STRDs by investigating associations between STRD-dimensions and students’ reports on a relationship questionnaire. We also investigated the predictive validity of STRDs by exploring associations between STRD-dimensions and students’ school engagement in both countries. We expected to find sufficient inter-rater reliability and measurement invariance of STRD-dimensions across Dutch and Chinese samples. We also expected STRD-dimensions to be moderately related to student-reported relationship quality and engagement (Harrison et al., 2007).

Methods

Participants

The Dutch sample included 752 students (51.5% girls; M_age = 9.96 years; SD = 1.21) from third to sixth grade in 43 classrooms across eight elementary schools (N_urban school = 5) in the Netherlands. On average, there were 23 students (range = 8–29) in each class. Most students (71.5%) identified themselves as ethnically Dutch. Students’ teachers had a mean age of 40.26 years (SD = 11.27) and an average teaching experience of 14.01 years (SD = 11.68).

The Chinese sample consisted of 574 third-to-sixth graders (53.7% girls; M_age = 11.49 years; SD = 1.29) in 14 classrooms from three elementary schools (N_urban school = 2) in Zhejiang, China. There were on average 42 students in each class (range = 34–52). Most of the students (94.6%) belonged to the Han group. Their teachers had a mean age of 33.2 years (SD = 5.86) and an average teaching experience of 9.65 years (SD = 6.60). Information about students’ socioeconomic status was not available for either the Dutch or the Chinese sample.

Procedure

Data collection was approved by the ethics committee of the University of Amsterdam (file number: 2016-CDE-7243). At the time of data collection (February and March 2017 in the Netherlands and March 2018 in China), the data collection procedures complied with the national laws in both countries. Research assistants and the first author contacted schools within their networks through email/telephone. After schools agreed to participate, students’ parents were sent an information letter, so that they could decline their children’s participation in this study. Students then completed a questionnaire about their relationship with their teacher and their school engagement, and made a drawing of themselves and their teacher. Data collection took place during school hours and teachers were absent to encourage students’ honest responses. The drawings were double-coded afterwards by trained (independent) raters.

Instruments

Student-teacher relationship drawings

Students were invited to ‘draw a picture of yourself and your teacher’ on a white A4 paper and could use all available drawing materials in the classroom (Harrison et al., 2007). Most of the students spent around 10 to 15 minutes finishing the drawing. Students’ drawings were double-coded on seven dimensions (see Introduction) by trained raters using the coding system of Zee and Roorda (2017), on a 7-point Likert scale ranging from 1 (little evidence for the construct) to 7 (ample evidence for the construct).

The seven dimensions were originally proposed in the Family Drawing Global Rating Scale by Fury et al. (1997). This coding system was later adapted by Harrison et al. (2007) and Zee and Roorda (2017) to school contexts. The coding system of Fury et al. (1997) has been validated both in Western and Eastern samples (Behrens & Kaplan, 2011; Jin et al., 2018), indicating that the manual can be used across cultures. The adapted coding manual has also been successfully used in a Dutch sample (Zee et al., 2020). Two native Dutch raters coded the Dutch drawings. The first author (native Chinese) and a third native Dutch rater coded the Chinese drawings. During training, the raters independently coded six example drawings. Their coding scores were then extensively discussed with the trainers and were adjusted to be as accurate as possible. To prevent the influence of children’s artistic ability on the coding, raters were instructed to ignore the artistic quality of the drawings while coding.

Relationship questionnaire

Students’ conscious relationship perceptions were measured with the Closeness (eight items; e.g. ‘I feel at ease with my teacher’) and Conflict (10 items; e.g. ‘I easily have quarrels with my teacher’) subscales from the Student Perception of Affective Relationship with Teacher Scale (SPARTS; Koomen & Jellesma, 2015). Students answered each item on a 5-point Likert scale, varying from 1 (No, that is not true) to 5 (Yes, that is true). Previous studies have supported the reliability and validity of the SPARTS, as well as its measurement invariance across Dutch and Chinese samples (Chen et al., 2019). In the present study, the Closeness (α_Dutch = .84; α_Chinese = .84) and Conflict (α_Dutch = .83; α_Chinese = .71) subscales showed satisfactory reliability. Confirmatory factor analysis was used to examine the two-factor structure of the SPARTS across Dutch and Chinese samples. Model fit can be considered satisfactory when CFI values are > .90 (Bentler, 1992), and RMSEA and SRMR values are < .08 (Hu & Bentler, 1999). The two-factor model of Closeness and Conflict had a satisfactory model fit in the present study, χ² (292) = 629.87, p < .001; RMSEA = .04, 90% CI [.037, .046], CFI = .915, SRMR = .056.
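The fit criteria above amount to a simple three-part check. The following is a minimal sketch in Python, not the authors' Mplus workflow; the class and function names are hypothetical, and only the cutoffs and the reported SPARTS fit values are taken from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FitIndices:
    """Container for the three fit indices reported in the text (hypothetical helper)."""
    cfi: float
    rmsea: float
    srmr: float


def is_satisfactory(fit: FitIndices) -> bool:
    """Apply the cutoffs stated in the text: CFI > .90, RMSEA < .08, SRMR < .08."""
    return fit.cfi > 0.90 and fit.rmsea < 0.08 and fit.srmr < 0.08


# Values reported for the two-factor SPARTS model in the present study.
sparts_two_factor = FitIndices(cfi=0.915, rmsea=0.04, srmr=0.056)
print(is_satisfactory(sparts_two_factor))  # prints True
```

All three reported indices fall on the acceptable side of their cutoffs, which is why the model is described as having satisfactory fit.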

Students’ engagement

Students reported their engagement with 11 items from the short version of the Engagement versus Disaffection with Learning Questionnaire (Zee & Koomen, 2020). The items describe students’ active participation and emotional involvement in learning processes at school (e.g. ‘I try hard to do well in school’). Students rated the items on the same 5-point scale as the SPARTS. The psychometric properties of the engagement questionnaire were supported in previous research (Zee & Koomen, 2020). In this study, the scale also showed satisfactory reliability both in the Dutch (α = .76) and Chinese sample (α = .87). Partially strong measurement invariance of the Engagement scale across the two samples was found, χ² (95) = 309.85, p < .001, RMSEA = .058, 90% CI [.051, .066], CFI = .902, SRMR = .063 for the final model, indicating that the scale can be used for cross-cultural comparisons.

Analyses

Mplus version 7.31 was used for data analysis (Muthén & Muthén, 1998–2012). Models were estimated with Robust Maximum Likelihood Estimation. We used the ‘Type = Complex’ option (a sandwich estimator used for standard error computations) to account for nesting (i.e. students nested within classrooms). Missing values (<1.1% per variable) were treated with Full Information Maximum Likelihood.

Data were analysed in three steps. First, as the 7-point scale was assumed to be continuous, we calculated intra-class correlation coefficients (ICCs) for both Dutch and Chinese samples to determine whether the interrater reliability of STRD-dimensions can be considered good (.60 ≤ ICCs ≤ .74) or excellent (.75 ≤ ICCs ≤ 1; Cicchetti et al., 2006). Second, multiple group modelling was used to test the measurement invariance of STRDs across countries. We tested the fit of a one-factor model in which all the drawing dimensions loaded on a single factor representing STR-quality. We tested this unidimensional model instead of a two-factor model of positive and negative STRs, as there is only one indicator for positive STRs (Pride/Happiness). To test for measurement invariance, changes in fit indices were calculated by subtracting the fit indices of the less constrained model from those of the more constrained model. Given the large total sample (N = 1326), model equivalence could be indicated by ΔCFI ≥ −.010, ΔRMSEA ≤ .015, and ΔSRMR ≤ .030 for weak invariance or ≤ .010 for strong invariance (Chen, 2007). If full measurement invariance was not supported, modification indices would be used to free equality constraints. Third, Pearson’s correlations (r) among the factor scores of STRD-dimensions, student-reported Closeness and Conflict on the SPARTS, and Engagement were calculated for the Dutch and Chinese samples, to examine the convergent and predictive validity of STRDs. Correlations were then transformed to Z-scores using Fisher’s r-to-Z transformation (Fisher, 1921), to test whether these correlations were equal across Dutch and Chinese samples.
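The first step can be illustrated with a small sketch of an ICC computation for double-coded drawings. The text does not state which ICC form was used, so the two-way random-effects, absolute-agreement, single-rater form (often written ICC(2,1)) is an assumption here, as are the function names and the toy ratings. Only the "good" and "excellent" bands come from the text.

```python
from typing import Sequence


def icc_2_1(ratings: Sequence[Sequence[float]]) -> float:
    """Two-way random-effects, absolute-agreement, single-rater ICC.

    `ratings` holds one row per drawing and one column per rater.
    The choice of this ICC form is an assumption, not taken from the study.
    """
    n = len(ratings)      # number of drawings
    k = len(ratings[0])   # number of raters
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Sums of squares from a two-way ANOVA without replication.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


def cicchetti_band(icc: float) -> str:
    """Label an ICC using the bands quoted in the text (good: .60-.74, excellent: .75-1)."""
    if icc >= 0.75:
        return "excellent"
    if icc >= 0.60:
        return "good"
    return "below good"


# Hypothetical 1-7 scores from two raters on six drawings (not study data).
toy_ratings = [[6, 5], [2, 2], [4, 5], [7, 6], [3, 3], [5, 5]]
value = icc_2_1(toy_ratings)
print(round(value, 2), cicchetti_band(value))
```

In practice a dedicated statistics package would normally be used; the hand-rolled version only makes the mean-square arithmetic behind the coefficient visible.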

Results

Table 1 presents the ICCs, means, standard deviations, and correlations between study variables. ICCs for all dimensions in the Chinese sample were good to excellent (.72 < ICC < .85) and comparable to those in the Dutch sample (.68 < ICC < .84; Cicchetti et al., 2006). Correlations of STRD-dimensions with Closeness, Conflict, and Engagement were mostly significant and in the expected directions for both the Dutch (−.26 < r < .30) and Chinese sample (−.23 < r < .26).

Table 1. Interrater-reliabilities (ICCs), means, standard deviations (SDs), and correlations between study variables.

Second, model fit and model comparison statistics of the models testing the measurement invariance of STRDs are presented in Table 2. Fit indices supported weak invariance, Δχ² (6) = 26.52, p < .001; ΔRMSEA = −.002, ΔCFI = −.005, ΔSRMR = .027, but strong invariance did not hold. Partially strong invariance was established by freeing equality constraints on the intercepts of Vulnerability, Anger/Tension, and Role Reversal, Δχ² (3) = 14.91, p = .002; ΔRMSEA = −.001, ΔCFI = −.003, ΔSRMR = .006. Vulnerability had a higher intercept in the Chinese sample, but Anger/Tension and Role Reversal had a higher intercept in the Dutch sample. A visual presentation can be found in Figure 1.
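The decision rule behind these comparisons can be sketched as follows. This is an illustrative Python check, not the Mplus analysis itself; the names are hypothetical, the cutoffs are those quoted from Chen (2007) in the Analyses section, and each Δ is assumed to be the more constrained model's index minus the less constrained model's index, which is consistent with the signs of the reported values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FitChange:
    """Change in fit indices between two nested models (hypothetical helper).

    Assumed convention: more constrained model minus less constrained model.
    """
    d_cfi: float
    d_rmsea: float
    d_srmr: float


# SRMR cutoff depends on the invariance level being tested.
SRMR_CUTOFF = {"weak": 0.030, "strong": 0.010}


def invariance_holds(change: FitChange, level: str) -> bool:
    """Return True if all three change-in-fit criteria are met for the given level."""
    return (
        change.d_cfi >= -0.010
        and change.d_rmsea <= 0.015
        and change.d_srmr <= SRMR_CUTOFF[level]
    )


# Values reported in the Results for the two supported steps.
weak = FitChange(d_cfi=-0.005, d_rmsea=-0.002, d_srmr=0.027)
partial_strong = FitChange(d_cfi=-0.003, d_rmsea=-0.001, d_srmr=0.006)

print(invariance_holds(weak, "weak"))              # prints True
print(invariance_holds(partial_strong, "strong"))  # prints True
```

Note that the significant Δχ² values do not enter this rule; with a total sample of 1326, the text relies on the change-in-fit indices instead.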

Table 2. Measurement invariance of student–teacher relationship drawings of elementary school students across the Netherlands and China.

Figure 1. Multiple-group model of partially strong invariance of student–teacher relationship (STR) drawings across the Dutch and Chinese samples.

Third, correlations of the latent factor score of STRDs with Closeness, Conflict, and Engagement are provided in Table 3. In both samples, the STRD-score had a positive association with Closeness (r_Dutch = .29; r_Chinese = .26), and these associations were equal across samples (Z = 0.58, p = .280). The STRD-score was negatively associated with Conflict in both samples (r_Dutch = −.32; r_Chinese = −.27), and these associations were equal across samples as well (Z = −0.99, p = .162). Furthermore, the STRD-score was positively associated with Engagement (r_Dutch = .23; r_Chinese = .21), with associations being equally strong across the Dutch and Chinese samples (Z = 0.38, p = .353).
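The comparison of correlations across samples follows the standard Fisher r-to-Z test for two independent correlations, which can be sketched as below. The function name is hypothetical, and two details are assumptions rather than statements from the text: that the full sample sizes (752 and 574) entered the test, and that the reported p-values are one-tailed.

```python
import math


def fisher_z_difference(r1: float, n1: int, r2: float, n2: int) -> tuple[float, float]:
    """Test whether two independent Pearson correlations differ.

    Returns the Z statistic and a one-tailed p-value (the one-tailed
    choice is an assumption made for this illustration).
    """
    z1 = math.atanh(r1)  # Fisher's r-to-Z transformation
    z2 = math.atanh(r2)
    se = math.sqrt(1.0 / (n1 - 3) + 1.0 / (n2 - 3))
    z = (z1 - z2) / se
    # Upper-tail probability of the standard normal at |z|.
    p_one_tailed = 0.5 * math.erfc(abs(z) / math.sqrt(2.0))
    return z, p_one_tailed


N_DUTCH, N_CHINESE = 752, 574  # sample sizes from the Participants section

for label, r_dutch, r_chinese in [
    ("Closeness", 0.29, 0.26),
    ("Conflict", -0.32, -0.27),
    ("Engagement", 0.23, 0.21),
]:
    z, p = fisher_z_difference(r_dutch, N_DUTCH, r_chinese, N_CHINESE)
    print(f"{label}: Z = {z:.2f}, one-tailed p = {p:.3f}")
```

Under these assumptions the sketch should agree with the reported Z and p values to rounding, which suggests the tests were one-tailed and based on the full samples.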

Table 3. Correlations between the factor score of student–teacher relationship drawings (STRDs), relationship questionnaire (SPARTS), and students’ engagement.

The association between the STRD-score and Engagement appeared to be lower than the associations between Closeness (Conflict) and Engagement. This is not surprising, however, as Closeness, Conflict and Engagement were all measured with questionnaires and their associations may be higher due to common method bias (Podsakoff et al., 2003). To examine whether the STRD-score still had a unique association with Engagement above and beyond Closeness and Conflict, we performed an ad-hoc multilevel regression analysis. The results showed that the STRD-score had a significantly positive association with Engagement (β = .06, p = .012), after controlling for the associations between Closeness (Conflict) and Engagement. This lends support to the incremental validity of STRDs.

Discussion

The present study examined the psychometric properties of STRDs in a cross-cultural context. Interrater reliabilities of all STRD-dimensions were good to excellent in both samples, indicating that the coding system applies to coding drawings in Eastern contexts as well. Moreover, we found partially strong measurement invariance of STRD-dimensions across Dutch and Chinese samples, which suggests that STRDs can be used for cross-cultural comparisons of students’ relationship perceptions (Byrne et al., 1989). Notably, the intercept of Vulnerability was higher and the intercepts of Anger/Tension and Role Reversal were lower in the Chinese sample than in the Dutch sample. Future research may further investigate possible reasons for the varying intercepts, and caution is warranted when interpreting cultural differences in these drawing dimensions.

Students’ relationship experiences as measured with drawings appeared to be moderately correlated with their reports on the relationship questionnaire and engagement, providing further support for the validity of STRDs in Eastern contexts. Furthermore, we found that students’ relationship perceptions measured by drawings showed a unique association with students’ engagement, over and above relationship perceptions measured by a questionnaire. This finding further supports the theoretical assumption that relationship drawings and relationship questionnaires tap different aspects of STR quality (cf. Harrison et al., 2007).

Relationship questionnaires and STRDs also have their own advantages and disadvantages. Compared to relationship questionnaires, STRDs are less prone to cultural differences in interpreting verbal statements, and provide a more detailed picture of STRs on seven dimensions (Pinto & Bombi, 2008; Zee et al., 2020). Questionnaires also suffer more from social desirability bias (e.g. students in Eastern countries may refrain from reporting high conflict with teachers; Chen et al., 2019), whereas STRDs may be less influenced by such a bias. Nevertheless, training the raters and coding STRDs is far more time-consuming than administering questionnaires. There is also a degree of subjectivity in coding STRDs, as raters’ interpretations of drawings may be affected by their own backgrounds and experiences. However, every drawing was coded by two raters in the present study to reduce subjectivity in coding, and the high ICCs also indicated that the drawings were coded in a relatively consistent way. Considering the advantages and disadvantages of both instruments, future research may combine (or choose between) STRDs and relationship questionnaires based on the research aim and research design.

The present study also has limitations. First, we only focused on third-to-sixth graders in upper elementary schools. Although previous studies in Western contexts showed that STRDs can be used to measure kindergarteners’ relationship perceptions (Harrison et al., 2007; McGrath et al., 2017), we do not know whether STRDs can be applied to students from other age groups as well. Future research may therefore explore the application of STRDs among younger (e.g. first- and second-grade students) and older children (e.g. secondary school students).

Second, we were not able to investigate the test-retest reliability or other types of validity (e.g. discriminant validity, criterion validity) of STRDs. Future research may employ STRDs across several time points to examine the test-retest reliability, and include variables that theoretically differ from STRs (e.g. language skills and social competence; Solheim et al., 2012) to test the discriminant validity of STRDs.

Based on our findings, some implications for future research and practice should be mentioned. First, it is recommended to use STRDs in cross-cultural comparisons of STR-quality. As STRDs reflect different aspects of STR-quality than relationship questionnaires, it would be interesting for future researchers to explore whether previously found cultural differences in STRs as measured by students’ questionnaire reports (Chen et al., 2019) would still exist when measured with STRDs. In addition, more frequent use of STRDs in school practice is encouraged, both in Western and Eastern countries. STRDs may help school practitioners to identify students who need guidance to improve STRs. This is especially true for students who find it difficult to talk about negative STRs (e.g. shy students and students in Eastern countries). Although replication of our findings is necessary, STRDs can be considered an appropriate method for measuring students’ relationship perceptions in cross-cultural contexts. Future research may thus profit from using STRDs for cross-cultural comparisons of students’ relationship perceptions.

Acknowledgments

This work was supported by the CSC scholarship offered by the China Scholarship Council in collaboration with the University of Amsterdam.

Disclosure statement

No potential conflict of interest was reported by the author(s).

Data Availability Statement

The data that support the findings of this study are available on request from the corresponding author, Mengdi Chen. The data are not publicly available as they contain sensitive and personal information that could compromise the privacy of the research participants.

Supplementary material

Supplemental data for this article can be accessed here.
