Introduction

Facial expression analysis is an active area of interest in deep learning and transfer learning. How human beings express their emotions has long been a topic of psychological study. External factors play a major role: different groups of people react differently to the same stimuli, and the way a group expresses emotion sets it apart from other groups. The factors commonly observed to drive these differences are gender, cultural influences, age, and environment. Expression of the commonly observed emotions is also specific to individuals, depending on their personality traits.

If the training data is an unbiased sample of the underlying distribution, the learned classification function will make accurate predictions for new samples. If it is not, the training data and the test data will be distributed differently. This distribution mismatch is the central challenge in domain adaptation.

Our focus is facial expression recognition, since the reaction to a stimulus is first observed in the face. Because cultural factors influence the display of emotions and we are dealing with the Indo-Pak ethnicity, we show how an ethnicity-specific classifier can be built using unlabeled target domain data. Our proposed method makes no assumptions about the specific features used for emotional representation, so it can also be applied to other kinds of data. We propose using CycleGAN to map the source domain to the target domain, and we compare its results with those of a conventional GAN model. We also applied feature-space domain adaptation to the problem and report its results alongside the baselines.

Dataset

We used RAF_DB as the source dataset and self-collected Pakistani facial images as the target dataset.

Source Dataset

The Real-world Affective Faces Database (RAF_DB) contains around 30K images in more than seven classes. We use only seven classes, which gives 15338 coloured images of Western facial expressions.

Target Dataset

We collected around 4000 coloured images of Pakistani facial expressions.

Experiments and Results

Unsupervised Domain Adaptation using WGAN: the WGAN results were not usable, so this approach was discontinued.
Semi-supervised Domain Adaptation using CycleGAN.
Feature Space Unsupervised Domain Adaptation.

We used two target datasets in our experiments. The first is used in the domain adaptation process, and the second is kept entirely unseen for testing, so that we can check whether model performance on the target domain is consistent. We used two classifiers, VGG16 and ResNET18, both pre-trained on the ImageNet dataset. These classifiers were trained on the source domain, and their accuracies on the source domain are given below.

Baseline results

Source Domain Accuracy Results

Classifier	Source Domain Training Accuracy	Source Domain Testing Accuracy
VGG16	94.8	79.45
ResNET18	92.3	80.17

We first evaluated our classifiers (VGG16 and ResNET18) on the target domain without any domain adaptation. The baseline results for both classifiers are given in the following table.

Target Domain Accuracy Results

Classifier	Target Dataset 1 Accuracy (Unseen)	Target Dataset 2 Accuracy (Unseen)
VGG16	50.92	37.51
ResNET18	50.75	33.51
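The size of the domain gap can be read off by subtracting each baseline target accuracy from the corresponding source test accuracy. The sketch below does that arithmetic with the numbers reported in the tables above; the helper name `domain_gap` is only illustrative and is not part of the project code.

```python
# Accuracies (%) copied from the source-domain and target-domain tables above.
source_test = {"VGG16": 79.45, "ResNET18": 80.17}
target_baseline = {
    "VGG16": {"Target Dataset 1": 50.92, "Target Dataset 2": 37.51},
    "ResNET18": {"Target Dataset 1": 50.75, "Target Dataset 2": 33.51},
}


def domain_gap(source_acc, target_acc):
    """Drop in accuracy, in percentage points, from source test set to a target set."""
    return round(source_acc - target_acc, 2)


# Gap per classifier and per target dataset.
gaps = {
    model: {name: domain_gap(source_test[model], acc) for name, acc in targets.items()}
    for model, targets in target_baseline.items()
}

for model, per_dataset in gaps.items():
    for name, gap in per_dataset.items():
        print(f"{model} on {name}: {gap:.2f} points below source test accuracy")
```

Both classifiers lose roughly 28 to 47 percentage points when moved to the target data without adaptation, which is the gap the following experiments try to close.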

Direct Fine-tuning on Target Dataset Accuracy

In our next experiment, we fine-tuned our classifiers directly on the target domain to obtain an upper bound on target-domain accuracy for each classifier.

Classifier	Target Dataset 1 Accuracy (Used in Fine-tuning)	Target Dataset 2 Accuracy (Unseen)
VGG16	92.23	42.75
ResNET18	96.47	43.03

Baseline Results Confusion Matrix

Confusion Matrices for baseline results. (a) VGG16 results on Target Dataset 1, (b) VGG16 results on Target Dataset 2, (c) ResNET18 results on Target Dataset 1, (d) ResNET18 results on Target Dataset 2
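For readers unfamiliar with the plots, a confusion matrix counts how often each true class is predicted as each class. The following is a minimal sketch of that computation, not the project's evaluation code. The class list is an assumption (the seven basic RAF_DB emotions, which this page does not enumerate), and the labels in the toy example are made up.

```python
from collections import Counter

# Assumed label set: the seven basic emotions in RAF_DB (not listed in this document).
CLASSES = ["surprise", "fear", "disgust", "happiness", "sadness", "anger", "neutral"]


def confusion_matrix(true_labels, predicted_labels, classes=CLASSES):
    """Return nested lists of counts: rows are true classes, columns are predicted classes."""
    counts = Counter(zip(true_labels, predicted_labels))
    return [[counts[(t, p)] for p in classes] for t in classes]


def accuracy(matrix):
    """Fraction of samples on the diagonal, i.e. correctly classified."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total if total else 0.0


# Toy example with made-up labels (not project data).
y_true = ["happiness", "happiness", "anger", "neutral", "sadness", "fear"]
y_pred = ["happiness", "neutral", "anger", "neutral", "neutral", "surprise"]

cm = confusion_matrix(y_true, y_pred)
print(accuracy(cm))
```

Off-diagonal entries show which emotions are confused with which, information that the single accuracy figures in the tables do not carry.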

Fine-tuned on Target Domain Confusion Matrix

Classifiers fine-tuned on target dataset directly. (a) VGG16 results on Target Dataset 1, (b) VGG16 results on Target Dataset 2, (c) ResNET18 results on Target Dataset 1, (d) ResNET18 results on Target Dataset 2

Baseline Trained Models

Baseline trained models can be found here.

WGAN

Training Specifications of WGAN Models

Specifications	WGAN Arch. 1	WGAN Arch. 2	WGAN Arch. 3
Model Type	Linear	Linear	Convolutional
Training Epochs	10K	1K	7.5K
Training Time	7 Days	3 Days	7 Days
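For context, the standard WGAN objective is sketched below. The notation is chosen for this note: $G$ is the generator, $D$ the critic, $P_r$ the real-data distribution, and $p(z)$ the noise prior. This page does not say how the Lipschitz constraint on the critic was enforced (weight clipping or a gradient penalty), nor how the three architectures were conditioned on the two domains, so this is the generic formulation only.

```latex
\min_{G} \; \max_{D \,:\, \|D\|_{L} \le 1} \;
  \mathbb{E}_{x \sim P_r}\big[ D(x) \big]
  \;-\;
  \mathbb{E}_{z \sim p(z)}\big[ D(G(z)) \big]
```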

WGAN Trained Models

WGAN Trained models are available here.

CycleGAN

Fine Tune CycleGAN

We fine-tuned our classifiers on CycleGAN-translated images. The accuracy scores on both target datasets are below.

Classifier	Target Dataset 1 Accuracy (Used in Fine-tuning)	Target Dataset 2 Accuracy (Unseen)
ResNET18	48.13	33.42
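For reference, the objective from the original CycleGAN formulation can be written as below. Here $G$ maps source-domain (Western) images $x \in X$ to the target (Pakistani) domain $Y$, $F$ maps back, and $D_X$, $D_Y$ are the two discriminators. The symbols and the weight $\lambda$ are notation chosen for this sketch; this page does not state which loss weights or variants were used in these experiments.

```latex
\mathcal{L}_{\text{GAN}}(G, D_Y, X, Y)
  = \mathbb{E}_{y}\big[\log D_Y(y)\big]
  + \mathbb{E}_{x}\big[\log\big(1 - D_Y(G(x))\big)\big]

\mathcal{L}_{\text{cyc}}(G, F)
  = \mathbb{E}_{x}\big[\, \| F(G(x)) - x \|_1 \big]
  + \mathbb{E}_{y}\big[\, \| G(F(y)) - y \|_1 \big]

\mathcal{L}(G, F, D_X, D_Y)
  = \mathcal{L}_{\text{GAN}}(G, D_Y, X, Y)
  + \mathcal{L}_{\text{GAN}}(F, D_X, Y, X)
  + \lambda \, \mathcal{L}_{\text{cyc}}(G, F)
```

The cycle-consistency term is what allows training on unpaired images: a translated face must map back to the original, which discourages the generator from changing the content of the face rather than its domain appearance.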

CycleGAN Translated Results

Fine-tuned on CycleGAN Translated Samples Confusion Matrix

Classifiers fine-tuned on CycleGAN translated samples. (a) ResNET18 results on Target Dataset 1, (b) ResNET18 results on Target Dataset 2

CycleGAN Trained Models

CycleGAN trained models are available here.

Feature Space Unsupervised Domain Adaptation

We retrained both classifiers with an additional domain classifier network attached. This domain classifier network helps make the features used by the classifier independent of domain information.
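One common way to train such a setup is domain-adversarial training, sketched below. This is an assumption about the general technique, not a statement of the exact loss used here. In this notation $G_f$ is the feature extractor (for example the VGG16 or ResNET18 backbone), $G_y$ the emotion classifier, $G_d$ the domain classifier, $y_i$ the emotion label, $d_i$ the domain label (source or target), and $\lambda$ a trade-off weight.

```latex
E(\theta_f, \theta_y, \theta_d)
  = \sum_{i \in \text{source}}
      L_y\big( G_y(G_f(x_i; \theta_f); \theta_y),\; y_i \big)
  \;-\; \lambda \sum_{i \in \text{source} \,\cup\, \text{target}}
      L_d\big( G_d(G_f(x_i; \theta_f); \theta_d),\; d_i \big)

(\hat{\theta}_f, \hat{\theta}_y)
  = \arg\min_{\theta_f, \theta_y} E(\theta_f, \theta_y, \hat{\theta}_d),
\qquad
\hat{\theta}_d
  = \arg\max_{\theta_d} E(\hat{\theta}_f, \hat{\theta}_y, \theta_d)
```

The domain classifier is trained to tell source from target, while the feature extractor is trained to make that task hard. This pushes the shared features toward being domain-independent, and only the source samples need emotion labels.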

Classifier	Target Dataset 1 Accuracy (Used in Fine-tuning)	Target Dataset 2 Accuracy (Unseen)
VGG16	51.36	37.37
ResNET18	46.72	32.41
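To make the comparison across experiments easier to read, the sketch below computes the change in accuracy on the unseen Target Dataset 2 relative to the unadapted baseline. All numbers are copied from the tables in this document; the helper name `change_vs_baseline` is illustrative only.

```python
# Accuracies (%) on the unseen Target Dataset 2, copied from the tables in this document.
baseline = {"VGG16": 37.51, "ResNET18": 33.51}
methods = {
    "Direct fine-tuning": {"VGG16": 42.75, "ResNET18": 43.03},
    "CycleGAN fine-tuning": {"ResNET18": 33.42},
    "Feature space domain adaptation": {"VGG16": 37.37, "ResNET18": 32.41},
}


def change_vs_baseline(method_scores):
    """Change in percentage points relative to the unadapted baseline, per classifier."""
    return {model: round(acc - baseline[model], 2) for model, acc in method_scores.items()}


changes = {name: change_vs_baseline(scores) for name, scores in methods.items()}

for name, per_model in changes.items():
    for model, delta in per_model.items():
        print(f"{name:<32} {model:<9} {delta:+.2f}")
```

On this unseen set, direct fine-tuning on labeled target data is the only setting that moves accuracy by more than about one percentage point.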

Trained using Feature Space Domain Adaptation Confusion Matrix

Classifiers trained using the feature space domain adaptation approach. (a) VGG16 results on Target Dataset 1, (b) VGG16 results on Target Dataset 2, (c) ResNET18 results on Target Dataset 1, (d) ResNET18 results on Target Dataset 2

Feature Space Domain Adaptation Models

Feature Space Unsupervised Domain Adaptation models are available here.

Downloads

Contributors

Jawad Tariq MSDS19038@itu.edu.pk	Muhammad Sohaib Khalid MSDS19096@itu.edu.pk
Amna Shahbaz MSDS19060@itu.edu.pk	Asif Ejaz MSDS19010@itu.edu.pk
Muhammad Taimur Adil MSDS19040@itu.edu.pk

Domain Adaptation for FER

Western to Pakistani Face Emotions Detection
