MRC Cognition and Brain Sciences Unit

The Perception of Facial Expressions

The human face provides the most salient cue to another person’s emotional state. However, the vast majority of research in this area has focused on the communicative value of facial expressions, and few studies have investigated the perceptual processes preceding this stage. As a result, there is no detailed model of facial expression recognition. The following studies addressed the perceptual mechanisms underlying the recognition of facial affect, with a view to developing a model of facial expression processing.

Categorical perception of facial expressions

Our studies have shown that computer-generated continua ranging between two prototype facial expressions are perceived categorically. Hence, for a continuum ranging between sadness and anger, the images on one side are consistently identified as "sadness" and those on the other side as "anger", with a sharp boundary separating the two. Moreover, pairs of images that straddle the category boundary are more easily discriminated than pairs that lie within either side of the boundary.
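The hallmark identification pattern can be illustrated with a short sketch. The proportions below are hypothetical, chosen only to show the sharp sigmoid and how a category boundary can be located by interpolation:

```python
import numpy as np

# Hypothetical identification data for a 7-step sadness-anger morph
# continuum: the proportion of trials on which each image was labelled
# "anger". The sharp sigmoid is the signature of categorical perception.
p_anger = np.array([0.02, 0.05, 0.10, 0.48, 0.90, 0.96, 0.99])

# Locate the category boundary (the 50% crossover) by linear
# interpolation between the two straddling steps.
i = int(np.argmax(p_anger >= 0.5))   # first step labelled "anger" >= 50%
x0, x1 = i - 1, i
boundary = x0 + (0.5 - p_anger[x0]) / (p_anger[x1] - p_anger[x0])
print(round(boundary, 2))            # -> 3.05 (0-based; between steps 4 and 5)
```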


Facial Expression Megamix

Building on the findings of the previous categorical perception study, we prepared all possible continua ranging between facial expressions associated with six basic emotions: happiness, sadness, anger, fear, disgust, and surprise. This resulted in 15 separate continua, each containing five physically equidistant steps (75 images in total). Figure 1 shows examples of these continua.
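Loosely, each morph step can be thought of as a weighted blend of the two prototype images. The sketch below uses a crude pixel-wise blend on random stand-in arrays; the published stimuli were prepared with landmark-based morphing (shape warp plus texture blend), so this only illustrates the idea of equidistant steps:

```python
import numpy as np

# Stand-in arrays for two prototype expression photographs.
rng = np.random.default_rng(0)
happy = rng.random((64, 64))
surprise = rng.random((64, 64))

# 90/10, 70/30, 50/50, 30/70 and 10/90 blends, as in Figure 1;
# five steps per continuum, 15 continua -> 75 images in total.
weights = [0.9, 0.7, 0.5, 0.3, 0.1]
continuum = [w * happy + (1 - w) * surprise for w in weights]
print(len(continuum))  # -> 5
```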

Figure 1: The six rows of this illustration contain morphed (blended) continua ranging between the following six expression pairs. From top to bottom, the continua shown are happiness - surprise (top row), surprise - fear (second row), fear - sadness (third row), sadness - disgust (fourth row), disgust - anger (fifth row), and anger - happiness (bottom row). Going from left to right, the columns show 90%, 70%, 50%, 30%, and 10% morphs along each continuum. For example, from left to right, the top row of images contains the following percentages of the happy and surprised expressions: 90% happy - 10% surprise, then 70% - 30%, 50% - 50%, 30% - 70%, and 10% - 90% of the same two expressions. Data from neurologically intact participants show that stimuli containing 90% and 70% of an expression are consistently identified as the intended emotion.

Participants were presented with these images in random order and asked to indicate whether each image was most like happiness, sadness, anger, fear, disgust, or surprise. The images in each continuum were consistently categorised as the emotion corresponding to the nearest endpoint expression; for example, images in the happiness-surprise continuum were labelled happiness if they were nearest the happiness endpoint and surprise if they were nearest the surprise endpoint (see Figure 2). The images falling in the boundary region of each continuum were categorised as either of the two endpoint emotions with approximately equal frequency. Importantly, these boundary images were not categorised as an expression other than the two endpoint expressions.

Figure 2: Percentage identifications of the stimuli shown in Figure 1. The images were presented individually in random order. The participants' task was to categorise the expression on each face with one of six emotion labels: happiness, sadness, anger, fear, disgust, or surprise.

The pairs of adjacent images from Figure 1 were also used in an ABX discrimination task. The results showed that subjects found it easier to discriminate between pairs that straddled the category boundary between two expressions than pairs that fell within either side of the boundary (Figure 3).

Figure 3: Predicted and observed discrimination of pairs of faces in an ABX task. Pairs consisted of adjacent stimuli from Figure 1. Predicted data were calculated from the identification curves shown in Figure 2.
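Predicted discrimination of this kind is conventionally derived from identification data via a covert-labelling model: a pair is discriminated whenever its two members receive different labels, and same-label pairs are a 50/50 guess. The sketch below applies that model to a hypothetical five-step identification curve; the exact prediction formula used in the study is not given here, so treat this as an illustration of the logic:

```python
# Covert-labelling model for predicting ABX discrimination from
# identification data: different labels -> discriminated; same label ->
# chance (50%) performance.
def predicted_abx(p_a, p_b):
    """p_a, p_b: probability of the same category label for stimuli A, B."""
    p_same_label = p_a * p_b + (1 - p_a) * (1 - p_b)
    return (1 - p_same_label) + 0.5 * p_same_label

# Hypothetical identification curve for a 5-step continuum
# (probability of the "happiness" label at each step).
ident = [0.99, 0.95, 0.50, 0.05, 0.01]

# Predicted accuracy for each adjacent pair: the boundary-straddling
# pairs (steps 2-3 and 3-4) come out better discriminated than the
# within-category pairs (steps 1-2 and 4-5).
pred = [predicted_abx(a, b) for a, b in zip(ident, ident[1:])]
```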

Caricaturing facial expressions

Recognition

Photographic-quality caricatures of emotional facial expressions were generated by exaggerating the physical differences between a target expression (e.g., anger) and a reference norm face (e.g., a neutral expression or an average of all six expressions); 'anticaricatures' were produced by reducing these differences (Figure 4). Caricatured expressions were identified significantly faster than the original images used to prepare them, whereas anticaricatures were recognised significantly more slowly than the original images.
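The caricaturing transformation itself is simple to state: scale the differences between the target and the norm. A minimal sketch on hypothetical landmark coordinates (the real procedure also warps the photographic texture along with the landmarks):

```python
import numpy as np

# Hypothetical landmark coordinates for a norm face and a target
# expression; real stimuli use many more points plus texture warping.
norm = np.array([[0.0, 0.0], [1.0, 1.0], [2.0, 0.5]])
anger = np.array([[0.2, -0.1], [1.3, 0.8], [2.1, 0.6]])

def caricature(target, norm, level):
    """level = +0.5 gives a +50% caricature, -0.5 a -50% anticaricature,
    and 0.0 returns the original target shape."""
    return norm + (1.0 + level) * (target - norm)

exaggerated = caricature(anger, norm, +0.5)   # differences scaled by 1.5
reduced = caricature(anger, norm, -0.5)       # differences scaled by 0.5
```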

 

Figure 4: Reference norms (top) and six expressions caricatured at three levels of exaggeration (-50%, 0%, and +50%). The expressions were caricatured relative to two different types of norm: an average-expression norm (left) and a neutral-expression norm (right). From top to bottom, the six emotions shown are happiness (top row), surprise (second row), fear (third row), sadness (fourth row), disgust (fifth row), and anger (bottom row).

Figure 5: Participants' mean correct reaction times to identify the three levels of caricature (-50%, 0%, and +50%) prepared relative to neutral-expression and average-expression norms. 

Rated emotional intensity

When participants are asked to rate the emotional intensity of facial expressions caricatured at a number of different levels, rated intensity shows a highly significant linear relationship with level of caricature. This pattern is found regardless of whether the reference face (norm face) used to prepare the caricatures shows a neutral facial expression, an average facial expression, or another prototype expression (e.g., fear caricatured relative to an anger expression) (Figure 6).

Figure 6: Participants' mean intensity ratings with standard error bars are shown for facial expressions of anger (left), fear (middle), and sadness (right) caricatured at four levels of exaggeration (0%, +15%, +30%, and +50%) and relative to three different types of expression norm: a neutral-expression norm (Neutral), an average-expression norm (Average), and different-expression norms (Different).

These results are difficult to reconcile with two-dimensional models of facial expression representation, such as Russell's Circumplex model (Figure 7). This is because although these models can accommodate the effect of caricaturing relative to a neutral expression, they cannot account for the fact that a facial expression's emotional intensity is enhanced when it is caricatured relative to any other facial expression.

Figure 7: Russell's (1980) Circumplex model of emotion. A two-dimensional system in which emotional cues (e.g., facial expressions) are identified by registering their values on two orthogonal dimensions coding degree of pleasure and degree of arousal.

Configural coding of facial expressions

It is well established that configural information (the relationships between facial features) plays an important role in coding a face's identity (who the person is). However, its contribution to facial expression recognition is less well understood. In fact, some researchers have suggested that facial expressions are processed in a part-based (non-configural) manner (Ellison & Massaro, 1997). We have addressed this issue using a composite paradigm (Calder, Young, Keane, & Dean, 2000d). Our study showed that participants were slower to identify the expression in either half of 'composite' facial expressions (faces in which the top half of one expression, e.g., anger, was aligned with the bottom half of another, e.g., happiness, to create a novel expression configuration) relative to a 'noncomposite' control condition in which the two face halves were misaligned (Figure 8). These findings parallel the composite effect for facial identity (Young, Hellawell, & Hay, 1987). However, additional experiments showed that the identity and expression effects operate independently of one another, indicating that the configural cues to these two facial attributes are qualitatively different. This research complements the findings of the PCA study (see below), which showed that identity and expression are represented by separate principal components. In line with this observation, Cottrell (University of California, San Diego) and Calder have modelled these composite data in a PCA system (Cottrell, Branson, & Calder, 2002).
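Construction of the two stimulus types can be sketched as follows, using random arrays as stand-ins for the photographs; the offset value is an arbitrary choice for illustration:

```python
import numpy as np

# Stand-in arrays for two expression photographs posed by the same model.
rng = np.random.default_rng(1)
anger = rng.random((100, 80))
happiness = rng.random((100, 80))

# Composite: the top half of one expression aligned over the bottom half
# of another, fusing into a novel expression configuration.
top, bottom = anger[:50], happiness[50:]
composite = np.vstack([top, bottom])

# Noncomposite control: the same halves laterally misaligned on a wider
# canvas (the 20-pixel offset is an arbitrary illustrative value).
offset = 20
noncomposite = np.zeros((100, 80 + offset))
noncomposite[:50, :80] = top
noncomposite[50:, offset:] = bottom
```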

Figure 8: Examples of composite and noncomposite stimuli. The top and bottom segments of different facial expressions posed by the same model were combined to create composite and noncomposite facial expressions.

A Principal component analysis (PCA) of facial expressions

It is generally agreed that facial identity recognition (who the person is) and facial expression recognition (what they are feeling) share the same front-end (perceptual) system. Previous research has shown that a principal component analysis (PCA) of the visual information in faces provides an effective front-end account of facial identity processing (Burton et al., 1999). Hence, a critical question is whether PCA can also support the recognition of facial expressions.

We addressed this by submitting the pixel intensities of pictures of facial expressions from the Ekman and Friesen (1976) series to a PCA (Calder, Burton, Miller, Young, & Akamatsu, 2001a). The results showed that PCA provides an effective means of coding the identity, expression, and sex of people's faces (Figure 9). For facial expressions, the correct recognition rates and false positives derived from the principal components were well matched to human performance (Table 1). In addition, the model exhibited properties of two competing accounts of facial expression processing (dimensional and category-based models), providing a means of bridging the gap between what were generally perceived to be distinct theoretical accounts. Finally, consistent with research showing that facial identity and facial expression recognition can be selectively disrupted, our research found that cues to identity and expression were coded by largely separate sets of principal components (Figures 10a and 10b). This research shows that linearised compact coding of human faces can provide a plausible account of the psychological data for both facial identity and facial expression processing.
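The PCA front end can be sketched in a few lines: flatten each image to a pixel-intensity vector, extract the principal components (eigenfaces) of the mean-centred set, and describe each face by its component weights. Random arrays stand in for the Ekman and Friesen photographs:

```python
import numpy as np

# Random stand-ins for a set of face photographs, flattened to
# pixel-intensity vectors (70 images of 64 x 64 pixels).
rng = np.random.default_rng(2)
faces = rng.random((70, 64 * 64))

# Centre on the mean face and extract eigenfaces via SVD; the rows of
# vt are the eigenfaces and each face is described by its weights.
mean_face = faces.mean(axis=0)
centred = faces - mean_face
u, s, vt = np.linalg.svd(centred, full_matrices=False)
weights = centred @ vt.T

# A face reconstructed from its first k component weights; identity,
# expression, and sex classification operate on such weight vectors.
k = 20
approx = mean_face + weights[0, :k] @ vt[:k]
```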

Figure 9: The first eight eigenfaces abstracted from a principal component analysis (PCA) of the Ekman and Friesen (1976) faces.

PCA

Facial        Categorisation
expression    Anger   Disgust   Fear   Happy   Neutral   Sad   Surprise
Anger           86%       14%     0%      0%        0%    0%         0%
Disgust         17%       72%     0%     11%        0%    0%         0%
Fear             0%        0%    97%      3%        0%    0%         0%
Happy            0%        0%     0%     98%        2%    0%         0%
Neutral          0%        0%     0%      1%       69%   23%         7%
Sad              0%        0%     0%      2%       26%   72%         0%
Surprise         0%        0%     6%      0%        2%    0%        92%

Total correctly identified: 84%

Human recognition

Facial        Categorisation
expression    Anger   Disgust   Fear   Happy   Neutral   Sad   Surprise
Anger           73%       10%     2%      0%        6%    4%         5%
Disgust         15%       79%     0%      0%        3%    1%         1%
Fear             1%        2%    76%      0%        1%    2%        18%
Happy            0%        0%     0%     98%        2%    0%         0%
Neutral          4%        1%     0%      3%       88%    3%         1%
Sad              1%        8%     5%      0%        9%   74%         3%
Surprise         0%        0%    10%      1%        0%    0%        89%

Total correctly identified: 82%

Table 1: Confusion matrices for the Ekman and Friesen (1976) facial expressions corresponding to (i) the probability of membership values produced by a linear discriminant analysis (LDA) of the PCA data (top), and (ii) the human participants' categorisation of the facial expressions (bottom). The two matrices have the same format. The vertical labels on the left indicate the intended facial expressions (as defined by Ekman and Friesen, 1976); the labels across the top indicate the degree of certainty with which each facial expression type was categorised as happy, sad, etc.
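The "total correctly identified" figures in Table 1 are consistent with the mean of each matrix's diagonal (the percentage of each expression categorised as intended):

```python
import numpy as np

# Diagonals of the two confusion matrices in Table 1 (percentage of each
# expression categorised as the intended emotion).
pca_diag = np.array([86, 72, 97, 98, 69, 72, 92])     # PCA / LDA (top)
human_diag = np.array([73, 79, 76, 98, 88, 74, 89])   # human (bottom)

print(round(pca_diag.mean()), round(human_diag.mean()))  # -> 84 82
```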

Figure 10a: Animated sequences of reconstructed images are shown for each of two eigenfaces that are important for categorising facial expression: component 5 (PC5) and component 9 (PC9). For each sequence, all eigenfaces, except the eigenface of interest, have been weighted with the appropriate component values for model PE posing a neutral expression. The weights applied to the eigenface of interest have been varied from -3 sd (far left image) to +3 sd (far right image) from the overall mean component value. The animation on the left shows elements of a surprise expression (raised eyebrows and widened eyes), while the animation on the right shows elements of a "Duchenne smile" - a genuine smile incorporating muscle changes in the mouth and eye regions.
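The reconstruction procedure described in this caption can be sketched as follows, with random arrays standing in for the real eigenfaces and component weights:

```python
import numpy as np

# Hold every component weight at the values for one face and sweep only
# the component of interest from -3 to +3 s.d. around its mean (0 for
# centred weights), visualising what that single component codes.
rng = np.random.default_rng(3)
eigenfaces = rng.random((50, 4096))   # hypothetical 50 eigenfaces
mean_face = rng.random(4096)
base_weights = rng.random(50)         # weights for one neutral face
pc, sd = 4, 1.0                       # component of interest and its s.d.

frames = []
for z in np.linspace(-3, 3, 7):      # -3 s.d. ... +3 s.d.
    w = base_weights.copy()
    w[pc] = z * sd                   # vary only the chosen component
    frames.append(mean_face + w @ eigenfaces)
```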

Figure 10b: Animated sequences of reconstructed images are shown for each of two eigenfaces that are important for categorising facial identity: component 1 (PC1) and component 7 (PC7). For each sequence, all eigenfaces, except the eigenface of interest, have been weighted with the appropriate component values for model PE posing a neutral expression. The weights applied to the eigenface of interest have been varied from -3 sd (far left image) to +3 sd (far right image) from the overall mean component value. The animation on the left shows changes in face width and hair tone, while the animation on the right shows changes in the width of the nose and hair.
 
