Research Article

A change in strategy: Static emotion recognition in Malaysian Chinese

Article: 1085941 | Received 25 May 2015, Accepted 16 Aug 2015, Published online: 14 Sep 2015

Abstract

Studies have shown that while East Asians focus on the center of the face to recognize identities, they adapt their strategy by focusing more on the eyes to identify emotions, suggesting that the eyes may contain salient information about emotional states in Eastern cultures. Western Caucasians, in contrast, employ the same strategy for both tasks, moving between the eyes and mouth to identify identities and emotions alike. Malaysian Chinese have been shown to focus on the eyes and nose more than the mouth during face recognition, an intermediate between Eastern and Western looking strategies. The current study examined whether Malaysian Chinese continue to employ an intermediate strategy or shift towards an Eastern or Western pattern (fixating more on the eyes or mouth, respectively) during an emotion recognition task. Participants focused most on the eyes, followed by the nose and then the mouth. Directing attention towards the eye region resulted in better recognition of certain own- than other-race emotions. Although the fixation patterns appear similar for both tasks, further analyses showed that fixations on the eyes were reduced, whereas fixations on the nose and mouth increased, during emotion recognition, indicating that participants adapt their looking strategies to the task at hand.

Public Interest Statement

While East Asians focus on the center of the face to recognize identities, they adapt their strategy by focusing more on the eyes when identifying emotions. Western Caucasians, in contrast, use the same strategy for both tasks, moving between the eyes and mouth to identify identities and emotions. Malaysian Chinese people have been shown to focus on the eyes and nose more than the mouth when recognizing faces. The current study examined Malaysian Chinese adults' looking strategies during an emotion recognition task. To recognize emotions, Malaysian Chinese looked most at the eyes, followed by the nose and then the mouth. Directing attention towards the eyes resulted in better recognition of certain own- than other-race emotions. Although they appear to use similar strategies to recognize emotions and identities, participants actually looked less at the eyes and more at the nose and mouth when identifying emotions, indicating that people adapt their looking strategies to their aims.

Competing interests

The authors declare no competing interest.

1. Introduction

The mutual understanding of emotion through the exchange of facial expressions is an important foundation of human social interaction. Facial expressions not only bear information about an individual's emotional state, but also convey non-verbal cues about one's motivations, social intentions, and interpersonal traits (Darwin, 1872; Knutson, 1996). Therefore, the ability to recognize and react appropriately to facial expressions is vital in our daily social encounters. Although facial expressions of emotion have long been thought to be universal (Darwin, 1872; Ekman & Friesen, 1971; Matsumoto & Willingham, 2009), cross-cultural studies have found conflicting evidence that challenges the universality hypothesis (Biehl et al., 1997; Birdwhistell, 1970; Klineberg, 1940; Moriguchi et al., 2005). Researchers have attempted to reconcile the two extreme ends of the argument by differentiating elements of facial expressions that are universal from those that are culture-specific (Ekman, 1970).

Early studies on facial expressions suggested that all humans display the six basic emotional states ("happy," "surprise," "fear," "disgust," "anger," and "sad") using the same facial movements, which are biologically and evolutionarily determined (Darwin, 1872). It was argued that the meanings of facial expressions would have diverged if they were merely cultural traits that are socially learned. In support of the universality hypothesis, numerous studies involving diverse populations, such as adults and children in preliterate cultures (Ekman & Friesen, 1971) and blind athletes (Matsumoto & Willingham, 2009), provide evidence that the display and recognition of emotion are not learned and modeled from observation but are genetically coded in all humans.

On the other hand, researchers who hold a culture-specific view believe that facial expressions of emotion, which act as a means of communication, are composed of units and are organized similar to the components of a spoken language (Birdwhistell, 1970). Although certain patterns of behavior, such as laughing and crying, are universal, expressions of emotions like anger and fear are not. Hence, researchers who subscribe to this viewpoint argue that facial expressions and body movements must be socially learned according to the structure of the individual society (Klineberg, 1940). Cross-cultural studies using various techniques, such as subjective ratings (Biehl et al., 1997) and brain imaging (Moriguchi et al., 2005), have provided evidence in support of the culture-specific hypothesis.

To reconcile the two extreme ends of the argument, researchers have differentiated elements of facial expressions that are universal and culture-specific, proposing that the expressions of emotion might vary across cultures as a result of three main factors (Ekman, 1970). Firstly, certain stimuli are learned to be the elicitors of specific emotions in different cultures; therefore, the comparison of reactions to the same event in two cultures may not be accurate, as it does not necessarily elicit an identical emotion. Secondly, different cultures may have different display rules that govern one's display of emotions in particular social settings. Lastly, the behavioral consequences of an emotion may vary across cultures; thus, the body postures and movements following a facial expression may not be comparable across cultures. Ekman (1970) argued that although the evoking stimuli, display rules, and behavioral consequences of emotion may differ across cultures, the facial expressions of the primary emotions are universal to mankind.

To better understand the underlying attentional strategies that observers employ to recognize facial expressions of emotion, more recent research has investigated how observers extract visual information, using eye tracking methodologies. Research has shown that observers shift their visual attention to a particular element (e.g. features of a face or an object in a scene) only when it is believed to be useful and crucial for perception; elements that are visually salient or rich in detail are often disregarded when they do not contain meaningful information (Hayhoe & Ballard, 2005; Yarbus, 1967). Because the degree of attention given to an element is determined by the deemed importance of the element and the nature of the task, tracking an observer's eye movements allows researchers to identify the elements that attract attention and provides insight into the observer's thought processes and cognitive goals (Hayhoe & Ballard, 2005; Yarbus, 1967).

When perceiving human faces, for example, attention is directed mainly towards the internal features that provide individuating information about a face (i.e. the eyes, nose, and mouth). However, a slight change in stimuli, such as a facial expression of emotion, may elicit different eye movement strategies. Yarbus (1967) found that when observers were presented with a portrait of a girl holding a neutral expression, their attention was drawn towards her expressive eyes more than the nose and mouth; however, when perceiving a photo of a smiling girl, attention was directed towards the mouth.

The nature of the task may also alter observers' eye movement strategies. For instance, when given different sets of instructions prior to viewing the same photo, an observer focused on different elements of the photo that were essential to the task at hand, providing evidence that eye movements may reflect an observer's cognitive goals (Yarbus, 1967). Moreover, participation in active tasks (e.g. visual search and reading text) rendered shorter fixation durations and larger saccadic eye movements than inactive tasks (e.g. passive viewing of natural scenes and simple patterns; Andrews & Coppola, 1999). The type of stimuli presented also affected eye movements: passive viewing of complex visuals, such as natural scenes, rendered shorter fixation durations and larger saccadic eye movements than passive viewing of simple patterns or of darkness without visual stimulation, implying that visual stimulation without complexity does not alter the default eye movements employed (Andrews & Coppola, 1999).

Recent cross-cultural face-processing studies have found that East Asian observers tended to focus on the nose area when performing a face recognition task, but focused on the eye region alone when asked to identify emotion rather than identity, suggesting an adaptation of strategy (Blais, Jack, Scheepers, Fiset, & Caldara, 2008; Jack, Blais, Scheepers, Schyns, & Caldara, 2009). This change of eye movement strategy suggests that certain emotions may be expressed differently in the East, leading observers to learn a different fixation pattern for recognizing emotions. Behavioral results showed that East Asian observers' lack of focus on the mouth may have contributed to the misrecognition of certain emotions such as "fear" and "disgust," which were consistently confused with "surprise" and "anger" (Jack et al., 2009). On the other hand, Western Caucasian observers appear to employ a similar strategy for both face recognition and emotion recognition tasks, moving between the eyes and mouth to recognize identity and emotion (Blais et al., 2008; Jack et al., 2009). Western Caucasian observers may not have altered their fixation pattern in the emotion recognition task because fixating on the expressive regions of the face (i.e. the eyes and mouth) was an optimal strategy for recognizing emotions as well as identities. Behavioral results showed that the even distribution of fixations across faces enabled Western Caucasians to recognize all facial expressions of emotion with relatively high accuracy.

Consistent with the looking patterns identified among Eastern and Western observers, behavioral studies have revealed that Japanese and American observers rely on different facial cues to recognize expressions of "happy" and "sad" in illustrated faces and edited facial expressions from real people (Yuki, Maddux, & Masuda, 2007). In both sets of images, cues from the eyes and mouth were manipulated to produce faces with different combinations of happy and sad features (e.g. happy eyes and sad mouth, or sad eyes and happy mouth). Japanese observers weighted cues displayed in the eyes more heavily than Americans, whereas Americans weighted cues in the mouth more heavily when recognizing expressions. The authors proposed that people learn to rely on different facial cues because of cultural differences in how emotions are expressed. In the West, where individualism is embraced, people are encouraged to express their true feelings, as doing so reflects the acceptance of one's true self (Heine, Lehman, Markus, & Kitayama, 1999). In contrast, collectivist cultures, like those of the East, value the control of emotion to maintain harmonious relationships with others (Friesen, 1972; Kitayama, Markus, & Kurokawa, 2000; Uchida, Kitayama, Mesquita, Reyes, & Morling, 2008). These cultural differences in one's social environment may affect how emotions are expressed. Furthermore, it has been shown that the muscles around the eyes are more difficult to control than the muscles around the mouth, suggesting that the eyes may be a more reliable cue to one's true emotions (Ekman, 1992; Ekman, Friesen, & O'Sullivan, 1988; Mai et al., 2011). Hence, in a culture where emotions may be suppressed, Japanese observers learn to focus on the eye region, as it provides more diagnostic information about one's true emotions, while in American culture, where the expression of true emotions is encouraged, observers focus on the most expressive region of the face: the mouth.

The research reviewed so far suggests that there are distinct Eastern and Western eye movement strategies employed in face recognition and emotion recognition tasks, strategies that may be affected by their respective cultures (Blais et al., 2008; Jack et al., 2009). Face recognition studies investigating the effects of cultural environment on populations such as British-born Chinese and Malaysian Chinese have revealed that exposure to and familiarity with another culture may foster an intermediate fixation pattern that enables observers to recognize both own- and familiar other-race faces (Kelly et al., 2011; Tan, Stephen, Whitehead, & Sheppard, 2012). This raises the possibility that intermediate looking strategies may also be used for identifying emotions in such societies. The current study aimed to examine Malaysian Chinese observers' sensitivity and eye movement strategy in recognizing East Asian and Western Caucasian facial expressions of emotion. Although Malaysia is located in the East, it is a strongly multicultural country with over 200 years of history with the West (Advameg, 2011). Today, Western influence can still be traced in the education system (Gaudart, 1987) and mass media (Davies, 2011; Epstein, 2011). If Malaysian Chinese adapt their looking strategy and follow an Eastern looking pattern, participants would focus mainly on the eye region when recognizing emotions. This might also result in better identification for own- than for other-race faces, as the eyes may be a more accurate cue for recognizing emotions in East Asian faces (Yuki et al., 2007). However, because of the lack of attention to the mouth, Malaysian Chinese might be poor at recognizing certain emotions, like "fear" or "disgust" (Ekman & Friesen, 1971). On the other hand, if Malaysian Chinese are primarily influenced by Western culture, we might see a combination of Eastern and Western strategies, with fixations landing on the eyes and mouth more than the nose. A combination of both strategies may enable participants to identify expressions from own- and other-race faces with equal success.

Additionally, the eye tracking results of the current emotion recognition study will be compared with a previous face recognition study utilizing similar procedures (Tan et al., 2012) to investigate whether Malaysian Chinese participants employ similar or different eye movement strategies to perform the two tasks. Research has shown that observers adapt their eye movement strategies based on the nature of the task. Face-processing studies have also identified different fixation patterns for face recognition and emotion recognition tasks in both East Asian and Western Caucasian populations (Blais et al., 2008; Jack et al., 2009). Therefore, we might expect to see a change in strategy for the two face-processing tasks.

2. Methods

2.1. Participants

To collect photographs for the emotion recognition study, 36 "models" were recruited (the term is used as shorthand; the volunteers were not trained models): 11 male and 8 female East Asians (mean age 20.75 years) and 7 male and 10 female Western Caucasians (mean age 30.75 years), all students or staff at the University of Nottingham Malaysia Campus who participated voluntarily.

In the validation phase, 50 University of Nottingham Malaysia Campus students (22 males and 28 females, mean age 23.2 years) were asked to categorize the facial expressions to ensure that the photographs portrayed the emotions assigned. Participants were of various racial backgrounds (e.g. African, East Asian, Western Caucasian) and took part voluntarily.

Twenty-six Malaysian Chinese students attending the University of Nottingham Malaysia Campus (9 males, 17 females, mean age 21.35 years) participated in the main emotion recognition study. No participant had lived outside Malaysia for more than three years. All participants had normal or corrected-to-normal vision and were given a bar of chocolate for their participation.

Participants in all three phases were recruited through opportunity and snowball sampling. Written informed consent was obtained from participants in all three phases, and the protocol was approved by the University of Nottingham Malaysia Campus, Faculty of Science Ethics Committee.

2.2. Materials

Prior to the main experiment, 36 sets of images displaying Ekman's (1970) six universal emotions ("happy," "surprise," "fear," "disgust," "anger," and "sad") plus "neutral" were collected. The images were captured using a Canon EOS 550D digital SLR at an aperture of f/5.0, a shutter speed of 1/100 s, and ISO 400. Multiple photographs were taken of each emotion for each model, and a total of 369 images, each displaying one of the seven emotions, were used in the validation phase. All images were photographed in a Munsell N5 neutral gray painted lighting booth (VeriVide, UK) illuminated with D65 fluorescent tubes in high-frequency fixtures to reduce flicker, with no other lighting present in the room. Images were aligned to the eyes and cropped around the face using Psychomorph software (Tiddeman & Perrett, 2001). The images, 270 × 333 pixels in size, were presented at a distance of 60 cm on an HP desktop PC using E-Prime, and participants were asked to categorize the facial expressions by pressing a corresponding response key on the keyboard.

Participants' responses in the validation phase were compiled and the percentage accuracy for each image was calculated. The eight identities whose emotions were identified with the highest accuracy were selected for the main experiment, totaling 56 images (2 male and 2 female East Asians, mean age 21 years, and 2 male and 2 female Western Caucasians, mean age 28.5 years, each displaying 7 emotions). All eight identities had an average accuracy of 70% or higher. Among the 56 images, participants in the validation phase were best at recognizing "happy" (96.88%), followed by "sad" (88.75%), "neutral" (86.88%), "surprise" (81.25%), "disgust" (78.75%), "anger" (74.38%), and "fear" (65%).
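As an illustration of this selection step, a minimal pandas sketch is given below; the file name and column names ("identity", "image", "correct") are hypothetical stand-ins for the validation data.

```python
import pandas as pd

# Hypothetical long-format validation data: one row per validator response,
# with columns "identity" (model), "image" (photograph), and
# "correct" (1 = expression categorized as the assigned emotion).
responses = pd.read_csv("validation_responses.csv")

# Percentage accuracy for each individual image across the 50 validators.
image_accuracy = responses.groupby("image")["correct"].mean() * 100

# Average accuracy per identity across that model's seven expressions,
# keeping the eight best-recognized identities at 70% or above.
identity_accuracy = responses.groupby("identity")["correct"].mean() * 100
selected = identity_accuracy[identity_accuracy >= 70].nlargest(8)
print(selected)
```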

For the main experiment, a Tobii T60 eye tracker was used to record participants' eye movements. Both eyes were tracked at a sampling rate of 60 Hz with high accuracy (0.5°) and drift compensation (less than 0.3°). The images were presented at a distance of approximately 60 cm on the Tobii eye tracker's built-in 17-inch TFT monitor (screen resolution 1,280 × 1,024 pixels) using Tobii Studio software, version 2.3.

2.3. Procedure

On each of the 56 trials, a central fixation cross was presented for one second, followed by a face presented pseudorandomly in one of the four quadrants of the computer screen to avoid fixation bias. The face stimulus displaying one of the seven expressions was presented for 5 seconds, after which participants were required to identify the most appropriate emotion via a seven-alternative forced-choice procedure by pressing a corresponding response key on the keyboard. Each response was followed by the central fixation cross, which preceded the next face stimulus.
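The experiment itself was implemented in E-Prime and Tobii Studio; purely as an illustration of this trial structure, the sketch below shows an equivalent loop in PsychoPy. The window size, quadrant positions, stimulus file names, and key mapping are all hypothetical.

```python
import random
from psychopy import core, event, visual

win = visual.Window(size=(1280, 1024), color="grey", units="pix")
fixation = visual.TextStim(win, text="+", height=40)
quadrant_centers = [(-320, 256), (320, 256), (-320, -256), (320, -256)]
response_keys = ["1", "2", "3", "4", "5", "6", "7"]  # 7-alternative forced choice

trial_list = [f"stimuli/face_{i:02d}.png" for i in range(56)]  # hypothetical file names
random.shuffle(trial_list)

for image_path in trial_list:
    fixation.draw()
    win.flip()
    core.wait(1.0)  # central fixation cross for 1 s

    # Face appears in one of the four quadrants to avoid fixation bias
    # (pseudorandomized in the original; plain random choice here for brevity).
    face = visual.ImageStim(win, image=image_path, pos=random.choice(quadrant_centers))
    face.draw()
    win.flip()
    core.wait(5.0)  # face displayed for 5 s

    win.flip()  # clear the screen, then collect the forced-choice keypress
    response = event.waitKeys(keyList=response_keys)[0]

win.close()
```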

2.4. Data analyses

For the purpose of data analysis, each of the seven emotions was approached separately using signal detection theory. For instance, while coding for "happy," "happy" would be considered the target emotion while the other six emotions (i.e. "neutral," "surprise," "fear," "disgust," "anger," and "sad") would be the non-target emotions. Participants' responses were encoded into four possible outcomes: hits (correctly identifying a target emotion as the target emotion), misses (incorrectly identifying a target emotion as a non-target emotion), false alarms (incorrectly identifying a non-target emotion as the target emotion), and correct rejections (correctly identifying a non-target emotion as a non-target emotion). For example, when coding for "happy," the correct identification of a "happy" image as "happy" would be coded as a hit, while incorrectly identifying any of the six other emotions as "happy" would be a false alarm. This procedure was applied to all seven emotions. Additionally, bias was calculated as the ratio of the number of times an emotion label was chosen to the number of times that label was the correct answer; values greater than 1 indicate a liberal criterion, whereas values less than 1 indicate a conservative criterion.
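A minimal sketch of this coding scheme follows, assuming a hypothetical trial table with columns "shown" (the posed emotion) and "response" (the label the participant chose).

```python
import pandas as pd

def code_outcomes(trials: pd.DataFrame, target: str) -> dict:
    """Code all trials against one target emotion (e.g. "happy")."""
    is_target = trials["shown"] == target
    chose_target = trials["response"] == target
    return {
        "hits": int((is_target & chose_target).sum()),
        "misses": int((is_target & ~chose_target).sum()),
        "false_alarms": int((~is_target & chose_target).sum()),
        "correct_rejections": int((~is_target & ~chose_target).sum()),
        # Bias: times the label was chosen / times it was the correct answer;
        # values > 1 indicate a liberal criterion, values < 1 a conservative one.
        "bias": chose_target.sum() / is_target.sum(),
    }
```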

To determine participants' recognition sensitivity, a-prime (A′) values were calculated. A′ is a non-parametric equivalent of d-prime that indexes participants' sensitivity in identifying facial expressions of emotion, taking into account both hits and false alarms. A′ was used to account for any bias of participants towards picking certain emotion labels over others, using the formulae below, where H represents the hit rate and FA the false alarm rate (Snodgrass & Corwin, 1988):

If H > FA, A′ = 0.5 + [(H − FA) (1 + H − FA)]/[4H (1 − FA)]

If FA > H, A′ = 0.5 − [(FA − H) (1 + FA − H)]/[4FA (1 − H)]
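These formulae translate directly into code; the helper below is a plain transcription, with H and FA given as the hit and false alarm rates for one target emotion.

```python
def a_prime(h: float, fa: float) -> float:
    """Non-parametric sensitivity A' (Snodgrass & Corwin, 1988)."""
    if h > fa:
        return 0.5 + ((h - fa) * (1 + h - fa)) / (4 * h * (1 - fa))
    if fa > h:
        return 0.5 - ((fa - h) * (1 + fa - h)) / (4 * fa * (1 - h))
    return 0.5  # h == fa: chance-level sensitivity

# Example: a hit rate of .90 with a false alarm rate of .10 gives A' ≈ .94.
print(a_prime(0.90, 0.10))
```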

Tobii Studio software was utilized to process raw data directly from the eye tracker. The total number of fixations landing on three predefined areas (i.e. eyes, mouth, and nose) was calculated using an area of interest (AOI) analysis (see Figure 1). Fixations were defined as two or more consecutive samples falling within a 35-pixel radius, with a minimum fixation duration of 60 ms. Each participant completed 56 trials of 5 seconds each. To ensure the validity of the eye-tracking data, only participants with an average of five or more fixations per trial were included in the data-set. One female participant did not meet this criterion and was excluded from the eye-tracking analyses.
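The fixation filtering and AOI counting were done in Tobii Studio; the sketch below only illustrates the same dispersion-based logic under stated assumptions. The sample format, AOI rectangles, and image coordinates are hypothetical, and Tobii Studio's actual filter may differ in detail.

```python
import math

def detect_fixations(samples, max_dispersion=35, min_duration=60):
    """Group consecutive gaze samples (t_ms, x, y) into fixations.

    A fixation is two or more consecutive samples within `max_dispersion`
    pixels of the running centroid, lasting at least `min_duration` ms.
    """
    fixations, cluster = [], []
    for t, x, y in samples:
        if cluster:
            cx = sum(p[1] for p in cluster) / len(cluster)
            cy = sum(p[2] for p in cluster) / len(cluster)
            if math.hypot(x - cx, y - cy) > max_dispersion:
                if len(cluster) >= 2 and cluster[-1][0] - cluster[0][0] >= min_duration:
                    fixations.append((cx, cy))
                cluster = []
        cluster.append((t, x, y))
    if len(cluster) >= 2 and cluster[-1][0] - cluster[0][0] >= min_duration:
        fixations.append((sum(p[1] for p in cluster) / len(cluster),
                          sum(p[2] for p in cluster) / len(cluster)))
    return fixations

# Hypothetical rectangular AOIs (left, top, right, bottom) in image coordinates.
AOIS = {"eyes": (40, 80, 230, 140), "nose": (95, 140, 175, 210), "mouth": (75, 210, 195, 260)}

def count_fixations_per_aoi(fixations):
    counts = {name: 0 for name in AOIS}
    for fx, fy in fixations:
        for name, (left, top, right, bottom) in AOIS.items():
            if left <= fx <= right and top <= fy <= bottom:
                counts[name] += 1
    return counts
```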

Figure 1. The predefined areas of interest (AOI) used to analyze eye gaze.

Note: Composite images are shown for illustration purposes. Real faces were used in the actual experiment.

Additional analyses were conducted to compare Malaysian Chinese participants' total fixation counts landing on East Asian and Western Caucasian eyes, mouth, and nose during the emotion recognition task with similar data taken from a face recognition task (Tan et al., 2012). The procedure for the face recognition task was similar to the task described here, except that participants were asked to indicate whether they had previously seen a series of African, East Asian, and Western Caucasian faces presented on the screen of a Tobii T60 eye tracker for 5 seconds each, instead of identifying the emotional facial expression. Eye-tracking data for African faces were not included in the current analysis in order to be consistent with the emotion recognition task.

3. Results

3.1. Preliminary analyses

Prior to the main analyses, a one-way analysis of variance was conducted on the models' ages. There was a significant difference in age between races, F(1, 7) = 7.76, p = .032: Western Caucasian models were significantly older than East Asian models. To assess the relationship between models' age and total recognition accuracy across the 26 participants, a Pearson product–moment correlation coefficient was computed. There was no significant correlation between the two variables, r(56) = −.26, p = .534, indicating that the models' age did not affect participants' recognition performance.
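For illustration, these two preliminary checks can be reproduced with scipy; the values below are placeholders, not the study's data.

```python
from scipy.stats import f_oneway, pearsonr

# Illustrative model ages only (four models per race in the final stimulus set).
east_asian_ages = [20, 19, 23, 22]
western_caucasian_ages = [27, 30, 29, 28]

# One-way ANOVA on model age by race.
print(f_oneway(east_asian_ages, western_caucasian_ages))

# Pearson correlation between each image's posing model's age and its
# recognition accuracy (both lists illustrative; length 56 in practice).
model_age_per_image = [20, 27, 19, 30, 23, 29, 22, 28]
accuracy_per_image = [0.84, 0.78, 0.88, 0.74, 0.81, 0.77, 0.86, 0.80]
print(pearsonr(model_age_per_image, accuracy_per_image))
```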

A preliminary 2 (race of face: East Asian or Western Caucasian) × 7 (emotions: "neutral," "happy," "surprise," "fear," "disgust," "anger," and "sad") × 2 (gender of participants: male or female) ANOVA was conducted on participants' sensitivity (A′) in recognizing facial expressions of emotion. This revealed no main effect of gender, F(1, 24) = .14, p = .715, and no interaction between race and gender, F(1, 24) = .03, p = .848, or between emotion and gender, F(6, 19) = .67, p = .675. Therefore, gender was removed as a variable in the main analysis.

3.2. Recognition sensitivity

The main analysis employed a simpler 2 (race of face: East Asian or Western Caucasian) × 7 (emotions: "neutral," "happy," "surprise," "fear," "disgust," "anger," and "sad") ANOVA. Main effects of both race, F(1, 25) = 4.95, p = .035, and emotion, F(3.5, 87.41) = 47.18, p < .001, were found, both Greenhouse–Geisser corrected (see Figure 2). Participants recognized East Asian expressions better than Western Caucasian ones. Post hoc tests using Bonferroni pairwise comparisons revealed that participants were best at recognizing "happy" (p = .001 for "sad" and p < .001 for all five other comparisons) while "fear" was recognized less well than all other expressions (p < .001 for all six comparisons). "Fear" was not only the least recognized emotion, but also the only one consistently confused with another emotion; in 50% of the trials, participants mistook "fear" for "surprise" (see Table 1 for hit rates, false alarm rates, and bias).
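For readers wishing to reproduce this style of analysis, a minimal sketch using the pingouin package is given below; the data file and column names are hypothetical, and the original analysis may have used different software.

```python
import pandas as pd
import pingouin as pg

# Hypothetical long-format data: one A' value per participant x race x emotion cell.
df = pd.read_csv("aprime_long.csv")  # assumed columns: participant, race, emotion, aprime

# Two-way (race x emotion) repeated measures ANOVA on sensitivity.
print(pg.rm_anova(data=df, dv="aprime", within=["race", "emotion"], subject="participant"))

# One-way breakdown over emotion with Greenhouse-Geisser correction applied.
print(pg.rm_anova(data=df, dv="aprime", within="emotion", subject="participant",
                  correction=True))

# Bonferroni-corrected pairwise comparisons between emotions.
print(pg.pairwise_tests(data=df, dv="aprime", within="emotion",
                        subject="participant", padjust="bonf"))
```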

Figure 2. Malaysian Chinese participants’ sensitivity in recognizing East Asian and Western Caucasian expressions.

Note: Error bars report standard errors of the mean. Participants were best at recognizing “happy” while “fear” was recognized significantly less well than all other expressions.

Table 1. Hit rates (HR), false alarm rates (FAR), and bias in recognizing East Asian and Western Caucasian facial expressions of emotion

There was also an interaction between race and emotion, F(3.68, 92.1) = 6.01, p < .001, Greenhouse–Geisser corrected. Paired samples t-tests revealed that “happy,” t(25) = 2.65, p = .01 and “surprise,” t(25) = 4.09, p < .001 were recognized significantly better for East Asian (“happy” M = .97, SD = .01; “surprise” M = .93, SD = .03) than Western Caucasian (“happy” M = .96, SD = .01; “surprise” M = .84, SD = .11) faces, while the remaining emotions were recognized equally well across races.

3.3. Total fixation count

Linear mixed modeling was conducted on the total fixation count (i.e. the accumulated number of fixations) for each feature (eyes, mouth, and nose) for each race of face (East Asian and Western Caucasian) posing each emotion ("neutral," "happy," "surprise," "fear," "disgust," "anger," and "sad"). Participant ID, face ID, emotion, race of face, and feature of face were included as factors. All main effects and the two- and three-way interactions between emotion, race of face, and feature of face were included in the model (dependent variable = total fixation count; fixed factors = emotion, race of face, and feature of face; random factor = face ID nested within race of face). Participant ID and face ID were included as subjects and face ID was nested within race of face as a random factor to avoid pseudoreplication. There was a significant main effect of emotion, F(6, 4,152) = 5.14, p < .001, and feature, F(2, 4,152) = 610.56, p < .001, and an interaction between emotion and feature, F(12, 4,152) = 5.14, p < .001 (see Figure 3). For emotion, post hoc tests using Bonferroni pairwise comparisons revealed that participants looked at "fearful" faces significantly more than "neutral" (p = .032) and "happy" (p < .001) faces, and there was a trend towards more fixations for "fearful" than "surprised" faces (p = .052). There were also more fixations directed at "angry" than "happy" faces (p = .018) and a trend towards more fixations for "sad" than "happy" faces (p = .081). As for feature, post hoc tests using Bonferroni pairwise comparisons showed that participants fixated on the eyes significantly more frequently than the nose (p < .001) and mouth (p < .001), and on the nose significantly more than the mouth (p < .001).
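A model with this fixed-effect structure can be sketched with statsmodels, though its MixedLM only approximates the nested random-effects specification described above (face identity enters as a variance component). Column names are hypothetical.

```python
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical long format: one row per participant x face x feature combination,
# with assumed columns participant, face_id, race, emotion, feature, fix_count.
df = pd.read_csv("fixation_counts.csv")

model = smf.mixedlm(
    "fix_count ~ C(emotion) * C(feature) * C(race)",  # fixed effects and interactions
    data=df,
    groups="participant",                      # random intercept per participant
    vc_formula={"face": "0 + C(face_id)"},     # variance component for face identity
)
result = model.fit()
print(result.summary())
```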

Figure 3. Average total fixation count for the eyes, mouth, and nose during the emotion recognition task.

Note: Error bars report standard errors of mean. Participants fixated most frequently on the eyes, followed by the nose then mouth.

To examine the interaction between emotion and feature, separate linear mixed models were conducted for each of the seven emotional facial expressions. Data were split by emotion. Participant ID, face ID, and feature were included as factors (dependent variable = total fixation count; fixed factor = feature of face). Participant ID and face ID were included as subjects to avoid pseudoreplication. There was a main effect of feature for all seven emotions: F(2, 597) = 132.94, p < .001 for "neutral," F(2, 597) = 50.38, p < .001 for "happy," F(2, 597) = 56.47, p < .001 for "surprise," F(2, 597) = 101.46, p < .001 for "fear," F(2, 597) = 86.67, p < .001 for "disgust," F(2, 597) = 105.96, p < .001 for "anger," and F(2, 597) = 105.60, p < .001 for "sad" faces. Post hoc tests using Bonferroni pairwise comparisons revealed that for "neutral," "surprise," "fear," and "anger" faces, participants made significantly more fixations on the eyes than the nose (p < .001 for "neutral" and "fear," p = .004 for "surprise," and p = .003 for "anger") and mouth (p < .001 for all four emotions), and on the nose more than the mouth (p < .001 for all four emotions). For "happy," "disgust," and "sad" faces, participants fixated more on the eyes (p < .001 for all three emotions) and nose (p < .001 for all three emotions) than the mouth; however, the total numbers of fixations landing on the eyes and nose did not differ significantly (p = .65 for "happy," p = .679 for "disgust," and p = .376 for "sad" faces).

Further linear mixed models were conducted for each of the three facial features. Participant ID, face ID, and emotion were included as factors (dependent variable = total fixation count; fixed factor = emotion). Participant ID and face ID were included as subjects to avoid pseudoreplication. There was a main effect of emotion for the eyes, F(6, 1,393) = 5.70, p < .001, mouth, F(6, 1,393) = 4.95, p < .001, and nose, F(6, 1,393) = 4.15, p < .001. Post hoc tests using Bonferroni pairwise comparisons showed that there were significantly more fixations landing on the eyes for "neutral" (p = .003), "fear" (p < .001), and "anger" (p = .01) than "happy" faces and for "fear" than "disgust" faces (p = .002), and a trend towards more fixations landing on the eyes for "fear" than "surprise" faces (p = .083) and "neutral" than "disgust" faces (p = .087). For the mouth, there were significantly more fixations landing on "surprise" than "neutral" (p < .001) and "anger" (p = .012) faces, more fixations landing on "fear" than "neutral" (p = .003) faces, and a trend towards more fixations landing on the mouth for "surprise" than "sad" faces (p = .059) and "fear" than "anger" faces (p = .089). As for the nose, participants fixated significantly more on "disgust" than "neutral" (p = .033), "happy" (p = .003), and "surprise" (p = .019) faces, with a trend towards more fixations on "sad" than "happy" faces (p = .075).

To investigate whether Malaysian Chinese participants shift their eye movement strategies according to the nature of the task, additional linear mixed modeling analyses were conducted to compare the total fixation count landing on each feature (eyes, mouth, and nose) for each race of face (East Asian and Western Caucasian) when Malaysian Chinese participants performed the emotion recognition and face recognition tasks (data for the face recognition task were taken from Tan et al., 2012). Participant ID, face ID, task, race of face, and feature of face were included as factors. All main effects and the two- and three-way interactions between task, race of face, and feature of face were included in the model (dependent variable = total fixation count; fixed factors = task, race of face, and feature of face; random factors = participant ID nested within task and face ID nested within race of face). Participant ID and face ID were included as subjects, and participant ID was nested within task and face ID within race of face to avoid pseudoreplication. There was a main effect of task, F(1, 43.96) = 4.41, p = .041, with more fixations in the emotion recognition than the face recognition task, and of feature, F(2, 6,532.05) = 956.73, p < .001. Post hoc tests using Bonferroni pairwise comparisons showed that participants fixated on the eyes significantly more than the nose (p < .001) and mouth (p < .001), and on the nose significantly more than the mouth (p < .001). However, there was no main effect of race, F(1, 41.64) = .33, p = .57, suggesting that participants employed a similar fixation pattern when perceiving East Asian and Western Caucasian faces in both tasks.

Besides the two main effects, there was also an interaction between task and feature, F(2, 6,532.05) = 2.99, p = .050 (see Figure 4). To examine this interaction, separate linear mixed models were conducted for each of the two tasks. Participant ID, face ID, and feature were included as factors (dependent variable = total fixation count; fixed factor = feature of face). Participant ID and face ID were included as subjects to avoid pseudoreplication. There was a main effect of feature for both the emotion recognition, F(2, 4,197) = 599.57, p < .001, and face recognition, F(2, 2,397) = 407.68, p < .001, tasks. Post hoc tests using Bonferroni pairwise comparisons revealed that in both tasks, participants made significantly more fixations on the eyes than the nose (p < .001 for both tasks) and mouth (p < .001 for both tasks), and on the nose more than the mouth (p < .001 for both tasks). Further linear mixed models were conducted for each of the three facial features. Participant ID, face ID, and task were included as factors (dependent variable = total fixation count; fixed factor = task). Participant ID and face ID were included as subjects to avoid pseudoreplication. There was a main effect of task for all three features, highly significant for the nose, F(1, 2,198) = 31.85, p < .001, and mouth, F(1, 2,198) = 13.74, p < .001, and marginally significant for the eyes, F(1, 2,198) = 3.92, p = .048. Although participants made more fixations in the emotion recognition than the face recognition task for all three features, they shifted their attention away from the eyes and towards the nose and mouth in the emotion recognition task (p < .001 for the nose and mouth; p = .048 for the eyes).

Figure 4. Average total fixation count landing on the eyes, mouth, and nose during the emotion recognition and face recognition tasks.

Note: Participants shifted their attention away from the eyes and towards the nose and mouth in the emotion recognition task.

4. Discussion

The current study explored Malaysian Chinese observers’ eye movement strategy and sensitivity in recognizing East Asian and Western Caucasian facial expressions of emotion. Malaysian Chinese participants were best at recognizing “happy,” whereas “fear” was recognized less well than other emotions. Participants were also better at recognizing East Asian than Western Caucasian expressions, particularly for “happy” and “surprise.” When recognizing facial expressions, Malaysian Chinese participants fixated mainly on the eyes, followed by the nose then the mouth.

Although facial expressions of emotion have long been considered an innate, evolved behavior that is universal, cultural differences found in the recognition of facial expressions challenge the universality hypothesis. Recent eye-tracking studies revealed that East Asian and Western Caucasian observers employ different looking patterns when recognizing facial expressions of emotion: East Asians tended to focus on the eye region alone, whereas Western Caucasians looked evenly across the face by moving between the eyes and mouth (Jack et al., 2009). Another study, which created templates of facial features based on individual observers' expectations of how an emotion may be expressed, found similar results (Jack, Caldara, & Schyns, 2011). More specifically, examination of the components of each face's internal representation showed that East Asians consistently preferred the eye region while Western Caucasians distributed expressive features to the eyebrows and mouth, implying that culture can influence the internal representations of facial expressions of emotion (Jack et al., 2011).

Even though Malaysian Chinese observers appear to have employed similar looking patterns for both the emotion and face recognition tasks (with fixations landing on the eyes more than the nose, and the nose more than the mouth), analyses comparing the total fixation counts on East Asian and Western Caucasian eyes, mouth, and nose for both tasks revealed that participants looked less at the eyes and more at the nose and mouth in the emotion recognition task than in the face recognition task. This change of strategy is consistent with previous research which found that East Asian observers use different eye movement patterns for these two types of face-processing task; more specifically, East Asians fixated mostly on the eyes to recognize emotions, because the eyes bear diagnostic information about an individual's true emotion, but employed a holistic strategy by focusing on the nose to recognize identities (Blais et al., 2008; Jack et al., 2009; Yuki et al., 2007). However, the nature of the change in strategy observed here for Malaysian Chinese participants differed from that previously observed in East Asian observers: attention shifted towards the lower part of the face (i.e. the nose and mouth) when recognizing emotions rather than identities. Tan et al. (2012) speculated that Malaysian Chinese participants adopted a combination of Eastern and Western eye movement strategies in the face recognition task to better recognize own- and other-race faces. Although Malaysian Chinese shifted their attention towards the nose and mouth in the emotion recognition task, fixations were directed mainly towards the eyes, a strategy that may have enabled them to better recognize own- than other-race facial expressions. Previous studies have suggested that observers learn to rely on different facial cues because of cultural differences in how emotions are expressed (Yuki et al., 2007). East Asians tended to weight cues displayed in the eyes more than Western Caucasian observers, possibly because the eye region provides more diagnostic information about another's true emotions. In Eastern cultures, where the control of emotion to maintain harmonious relationships with others is valued, East Asians learn to suppress their true emotions (Friesen, 1972; Kitayama et al., 2000; Uchida et al., 2008). Therefore, focusing on the eyes, where the muscles are more difficult to control than those around the mouth, may provide a more useful cue to East Asians' true emotions (Ekman, 1992; Ekman et al., 1988; Mai et al., 2011).

Other research has also suggested that certain facial features provide diagnostic information about different facial expressions of emotion (Calder, Young, Keane, & Dean, 2000; Ellison & Massaro, 1997; Sullivan, Ruffman, & Hutton, 2007). More specifically, studies have shown that directing attention towards the eye region may be beneficial for recognizing emotions such as "happy" and "disgust" while focusing on the mouth may be optimal for recognizing emotions such as "fear," "anger," and "sadness" (Calder et al., 2000; Sullivan et al., 2007). In line with these findings, research exploring the relationship between processing orientation and emotion recognition has found that participants were able to recognize emotions significantly better, in terms of accuracy and speed of responses, when primed with a local, as compared with a global, processing orientation (Martin, Slessor, Allen, Phillips, & Darling, 2012). Because specific features provide diagnostic cues about certain facial expressions of emotion, engaging in a local perceptual style by directing attention towards the critical features of a face (e.g. the eyes and the mouth) may be an optimal strategy that is advantageous for emotion recognition (Martin et al., 2012; Weston & Perfect, 2005).

Behavioral results showed that Malaysian Chinese were able to recognize East Asian expressions better than Western Caucasian ones, although this was only true for "happy" and "surprise." It is unlikely that this is due to unfamiliarity with Western Caucasian faces, because in our previous study (Tan et al., 2012), Malaysian Chinese recognized Western Caucasian faces as proficiently as East Asian ones. A more plausible argument is that, owing to different cultural interpretations of certain emotions (Ekman & Friesen, 1971), there is a discrepancy for other-race faces between the way expressions were posed and perceivers' conceptions of those emotions. Observers may find certain portrayals of emotion more relevant or familiar to their culture, and hence be more proficient at recognizing these expressions. In any case, Malaysian Chinese observers' equal success in recognizing East Asian and Western Caucasian identities does not appear to extend to interpreting their facial expressions.

"Fear" was identified significantly less well than the other emotions. Half of the participants mistook "fear" for "surprise," a finding similar to Ekman and Friesen's (1971) study with New Guineans. Ekman and Friesen (1971) speculated that participants confused the two expressions not because of an inability to distinguish between them, but because fearful situations were almost always also surprising in their culture (e.g. unexpectedly meeting a hostile member of another village). In addition, a cross-cultural study revealed that although observers from different cultures agreed on the emotions portrayed in each photograph, the levels of agreement between cultures were considerably different (Biehl et al., 1997). Japanese observers, for instance, showed less agreement on negative expressions like "fear" than observers from other countries, whereas Americans agreed less than other countries on "contempt," suggesting that the semantic or affective meanings of each emotion may vary across cultures. Observers also rated the intensity of emotion differently, indicating dissimilarities in either the portrayal or the judgment of facial expressions. Therefore, it is probable that Malaysian Chinese had difficulty recognizing "fear" due to differences in the meaning or judgment of the emotion.

Research exploring gender differences in emotion recognition has found varying results, possibly due to methodological differences between studies. Some studies have found that females were better at recognizing facial expressions of emotion (e.g. Rotter & Rotter, 1988; Thayer & Johnsen, 2000), while others did not find a clear female advantage (e.g. Boloorizadeh & Tojari, 2013; Sawada et al., 2014). In the current research, there was no effect of gender, indicating that male and female participants performed similarly in the recognition of facial expressions.

The low performance in the recognition of some emotions (e.g. "fear") might be related to methodological issues. In the current study, an objective coding scheme was not used to validate the faces; rather, a consensus approach was taken (i.e. a group of participants validated the experimental materials to ensure that the models were expressing recognizable emotions). Presumably the selected models expressed diagnostic information related to each emotion, allowing participants to identify the emotions correctly. Future studies could utilize an objective system (e.g. the Facial Action Coding System) to categorize the emotions expressed by classifying each facial movement individually.

In conclusion, analyses comparing eye-tracking data for the emotion recognition and face recognition tasks revealed a shift in strategy: participants reduced fixations on the eyes and increased fixations on the nose and mouth in the emotion recognition task compared to the face recognition task, providing further support for the view that eye movement strategies are shaped by observers' goals. Although this shift enabled Malaysian Chinese participants to recognize most emotions for both own- and other-race faces with equal proficiency, directing attention mainly towards the eye region (the location of salient information in the face that is relevant to Eastern culture) resulted in better recognition of certain own-race facial expressions, such as "happy" and "surprise."

Additional information

Funding

The authors received no direct funding for this research.

Notes on contributors

Chrystalle B.Y. Tan

We are interested in human perception and attention in social situations. Using eye-tracking methodologies, we investigate how observers encode and interpret visual information when performing various face-processing tasks, such as face recognition and emotion recognition. We are particularly interested in how individuals from non-western cultures process faces and the possibility that there may be cultural differences in visual strategies as well as face perception performance.

References

  • Advameg. (2011). Malaysia. Countries and their cultures. Retrieved October 10, 2011, from www.everyculture.com/Ja-Ma/Malaysia.html
  • Andrews, T. J., & Coppola, D. M. (1999). Idiosyncratic characteristics of saccadic eye movements when viewing different visual environments. Vision Research, 39, 2947–2953. doi:10.1016/S0042-6989(99)00019-X
  • Biehl, M., Matsumoto, D., Ekman, P., Hearn, V., Heider, K., Kudoh, T., & Ton, V. (1997). Matsumoto and Ekman's Japanese and Caucasian facial expressions of emotion (JACFEE): Reliability data and cross-national differences. Journal of Nonverbal Behavior, 21, 3–21. doi:10.1023/A:1024902500935
  • Birdwhistell, R. (1970). Kinesics and context. Philadelphia, PA: University of Pennsylvania Press.
  • Blais, C., Jack, R. E., Scheepers, C., Fiset, D., & Caldara, R. (2008). Culture shapes how we look at faces. PLoS ONE, 3, e3022. doi:10.1371/journal.pone.0003022
  • Boloorizadeh, P., & Tojari, F. (2013). Facial expression recognition: Age, gender and exposure duration impact. Procedia – Social and Behavioral Sciences, 84, 1369–1375. doi:10.1016/j.sbspro.2013.06.758
  • Calder, A. J., Young, A. W., Keane, J., & Dean, M. (2000). Configural information in facial expression perception. Journal of Experimental Psychology: Human Perception and Performance, 26, 527–551.
  • Darwin, C. (1872). The expression of the emotions in man and animals. New York, NY: Philosophical Library. doi:10.1037/10001-000
  • Davies, A. G. (2011). Radio stations in Malaysia. Asiawaves. Retrieved August 25, 2011, from http://www.asiawaves.net/malaysia-radio.htm
  • Ekman, P. (1970). Universal facial expressions of emotion. California Mental Health Research Digest, 8, 151–158.
  • Ekman, P. (1992). Facial expressions of emotion: New findings, new questions. Psychological Science, 3, 34–38. doi:10.1111/psci.1992.3.issue-1
  • Ekman, P., & Friesen, W. V. (1971). Constants across cultures in the face and emotion. Journal of Personality and Social Psychology, 17, 124–129. doi:10.1037/h0030377
  • Ekman, P., Friesen, W. V., & O'Sullivan, M. (1988). Smiles when lying. Journal of Personality and Social Psychology, 54, 414–420. doi:10.1037/0022-3514.54.3.414
  • Ellison, J. W., & Massaro, D. W. (1997). Featural evaluation, integration, and judgment of facial affect. Journal of Experimental Psychology: Human Perception and Performance, 23, 213–226.
  • Epstein, J. (2011). World domination by box office cinema admissions. Greenash. Retrieved August 25, 2011, from http://greenash.net.au/thoughts/2011/07/world-domination-by-box-office-cinema-admissions/
  • Friesen, W. V. (1972). Cultural differences in facial expression in a social situation: An experimental test of the concept of display rules (Unpublished doctoral dissertation). University of California, San Francisco.
  • Gaudart, H. (1987). English language teaching in Malaysia: A historical account. The English Teacher, XVI. Retrieved from http://www.melta.org.my/ET/1987/main2.html
  • Hayhoe, M., & Ballard, D. (2005). Eye movements in natural behavior. Trends in Cognitive Sciences, 9, 188–194. doi:10.1016/j.tics.2005.02.009
  • Heine, S. J., Lehman, D. R., Markus, H. R., & Kitayama, S. (1999). Is there a universal need for positive self-regard? Psychological Review, 106, 766–794. doi:10.1037/0033-295X.106.4.766
  • Jack, R. E., Blais, C., Scheepers, C., Schyns, P. G., & Caldara, R. (2009). Cultural confusions show that facial expressions are not universal. Current Biology, 19, 1543–1548. doi:10.1016/j.cub.2009.07.051
  • Jack, R. E., Caldara, R., & Schyns, P. G. (2011). Internal representations reveal cultural diversity in expectations of facial expressions of emotion. Journal of Experimental Psychology, 11, 1–7. doi:10.1037/a0023463
  • Kelly, D. J., Jack, R. E., Miellet, S., De Luca, E., Foreman, K., & Caldara, R. (2011). Social experience does not abolish cultural diversity in eye movements. Frontiers in Psychology, 2, 1–11.
  • Kitayama, S., Markus, H. R., & Kurokawa, M. (2000). Culture, emotion, and well-being: Good feelings in Japan and the United States. Cognition and Emotion, 14, 93–124. doi:10.1080/026999300379003
  • Klineberg, O. (1940). Social psychology. New York, NY: Holt. doi:10.1037/13603-000
  • Knutson, B. (1996). Facial expressions of emotion influence interpersonal trait inferences. Journal of Nonverbal Behavior, 20, 165–182. doi:10.1007/BF02281954
  • Mai, X., Ge, Y., Tao, L., Tang, H., Liu, C., & Luo, Y. (2011). Eyes are windows to the Chinese soul: Evidence from the detection of real and fake smiles. PLoS ONE, 6, e19903. doi:10.1371/journal.pone.0019903
  • Martin, D., Slessor, G., Allen, R., Phillips, L. H., & Darling, S. (2012). Processing orientation and emotion recognition. Emotion, 12, 39–43. doi:10.1037/a0024775
  • Matsumoto, D., & Willingham, B. (2009). Spontaneous facial expressions of emotion of congenitally and noncongenitally blind individuals. Journal of Personality and Social Psychology, 96, 1–10. doi:10.1037/a0014037
  • Moriguchi, Y., Ohnishi, T., Kawachi, T., Mori, T., Hirakata, M., Yamada, M., … Komaki, G. (2005). Specific brain activation in Japanese and Caucasian people to fearful faces. NeuroReport, 16, 133–136. doi:10.1097/00001756-200502080-00012
  • Rotter, N. G., & Rotter, G. S. (1988). Sex differences in the encoding and decoding of negative facial emotions. Journal of Nonverbal Behavior, 12, 139–148. doi:10.1007/BF00986931
  • Sawada, R., Sato, W., Kochiyama, T., Uono, S., Kubota, Y., Yoshimura, S., & Toichi, M. (2014). Sex differences in the rapid detection of emotional facial expressions. PLoS ONE, 9, e94747. doi:10.1371/journal.pone.0094747
  • Snodgrass, J. G., & Corwin, J. (1988). Pragmatics of measuring recognition memory: Applications to dementia and amnesia. Journal of Experimental Psychology: General, 117, 24–50.
  • Sullivan, S., Ruffman, T., & Hutton, S. B. (2007). Age differences in emotion recognition skills and the visual scanning of emotion faces. The Journals of Gerontology Series B: Psychological Sciences and Social Sciences, 62, P53–P60. doi:10.1093/geronb/62.1.P53
  • Tan, C. B. Y., Stephen, I. D., Whitehead, R., & Sheppard, E. (2012). You look familiar: How Malaysian Chinese recognize faces. PLoS ONE, 7, e29714. doi:10.1371/journal.pone.0029714
  • Thayer, J. F., & Johnsen, B. H. (2000). Sex differences in judgement of facial affect: A multivariate analysis of recognition errors. Scandinavian Journal of Psychology, 41, 243–246. doi:10.1111/sjop.2000.41.issue-3
  • Tiddeman, B., & Perrett, D. (2001, February 5–9). Moving facial image transformations based on static 2D prototypes. In Proceedings of the 9th International Conference in Central Europe on Computer Graphics, Visualization and Computer Vision 2001. Pilsen.
  • Uchida, Y., Kitayama, S., Mesquita, B., Reyes, J. A. S., & Morling, B. (2008). Is perceived emotional support beneficial? Well-being and health in independent and interdependent cultures. Personality and Social Psychology Bulletin, 34, 741–754. doi:10.1177/0146167208315157
  • Weston, N. J., & Perfect, T. J. (2005). Effects of processing bias on the recognition of composite face halves. Psychonomic Bulletin & Review, 12, 1038–1042.
  • Yarbus, A. (1967). Eye movements and vision. New York, NY: Plenum Press. doi:10.1007/978-1-4899-5379-7
  • Yuki, M., Maddux, W. W., & Masuda, T. (2007). Are the windows to the soul the same in the East and West? Cultural differences in using the eyes and mouth as cues to recognize emotions in Japan and the United States. Journal of Experimental Social Psychology, 43, 303–311. doi:10.1016/j.jesp.2006.02.004