
Efficient Labelling of Affective Video Datasets via Few-Shot & Multi-Task Contrastive Learning

This is the official PyTorch code repository for the paper Efficient Labelling of Affective Video Datasets via Few-Shot & Multi-Task Contrastive Learning.

Accepted at ACM Multimedia 2023.

Link to the paper: https://arxiv.org/abs/2308.02173

Teaser: OneDrive


Problem overview

Abstract

Whilst deep learning techniques have achieved excellent emotion prediction, they nevertheless require large amounts of labelled training data, which are (a) onerous and tedious to compile, and (b) prone to errors and biases. We propose Multi-Task Contrastive Learning for Affect Representation (MT-CLAR) for few-shot affect inference. MT-CLAR combines multi-task learning with a Siamese network trained via contrastive learning to infer from a pair of expressive facial images (a) the (dis)similarity between the facial expressions, and (b) the difference in valence and arousal levels of the two faces. We further extend the image-based MT-CLAR framework for automated video labelling wherein, given one or a few labelled video frames (termed support-set), MT-CLAR labels the remainder of the video for valence and arousal. Experiments are performed on the AFEW-VA dataset with multiple support-set configurations; additionally, supervised learning on representations learned via MT-CLAR is utilised for valence, arousal and categorical emotion prediction on the AffectNet and AFEW-VA datasets. Empirical results confirm that valence and arousal predictions through MT-CLAR are very comparable to the state-of-the-art (SOTA), and we significantly outperform SOTA with a support-set $\approx$ 6% the size of the video dataset.
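To give a concrete picture of the model described above, below is a minimal, illustrative PyTorch sketch of the Siamese multi-task idea: a shared encoder embeds a pair of face images, one head predicts expression (dis)similarity, and another predicts the difference in valence and arousal. The backbone, layer sizes and head designs here are placeholders for illustration, not the architecture used in the paper.

# Minimal illustrative sketch (not the authors' implementation) of the
# Siamese multi-task idea: shared encoder + two task heads.
import torch
import torch.nn as nn


class SiameseMultiTask(nn.Module):
    def __init__(self, embed_dim: int = 512):
        super().__init__()
        # Placeholder CNN encoder; the paper's backbone may differ.
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(64, embed_dim),
        )
        # Head for (dis)similarity between the two facial expressions.
        self.similarity_head = nn.Sequential(
            nn.Linear(2 * embed_dim, 128), nn.ReLU(), nn.Linear(128, 1)
        )
        # Head for the difference in valence and arousal (2 outputs).
        self.va_diff_head = nn.Sequential(
            nn.Linear(2 * embed_dim, 128), nn.ReLU(), nn.Linear(128, 2)
        )

    def forward(self, img_a: torch.Tensor, img_b: torch.Tensor):
        z_a, z_b = self.encoder(img_a), self.encoder(img_b)
        pair = torch.cat([z_a, z_b], dim=1)
        return self.similarity_head(pair), self.va_diff_head(pair)


# Example forward pass on a dummy pair of 224x224 face crops.
model = SiameseMultiTask()
sim_logit, va_diff = model(torch.randn(4, 3, 224, 224), torch.randn(4, 3, 224, 224))
print(sim_logit.shape, va_diff.shape)  # torch.Size([4, 1]) torch.Size([4, 2])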

Requirements

This code was tested on Python 3.8.13, PyTorch 1.13 (dev), and CUDA 11.3. We recommend installing the dependencies in a virtual environment. To install them, run the following commands:

conda env create -f environment.yml
conda activate mtclar
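As an optional sanity check (not part of the repository), the snippet below verifies that the activated environment exposes the PyTorch and CUDA versions the code was tested with:

# Optional environment check.
import torch

print("PyTorch:", torch.__version__)              # tested with 1.13 (dev)
print("CUDA available:", torch.cuda.is_available())
print("CUDA build:", torch.version.cuda)          # tested with CUDA 11.3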

Datasets

The AFEW-VA dataset can be downloaded from here. The AffectNet dataset can be downloaded from here, after obtaining prior approval from the authors of the dataset. The AFEW-VA dataset should be placed in the data/AfewVA directory, and the AffectNet dataset should be placed in the data/AffectNet directory.
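The following optional snippet (not part of the repository) simply checks that the datasets have been placed in the directories expected above:

# Optional check of the expected dataset layout.
from pathlib import Path

for directory in ("data/AfewVA", "data/AffectNet"):
    print(directory, "found" if Path(directory).is_dir() else "missing")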

Pre-trained models

The pre-trained models can be downloaded from here.

Training

MT-CLAR architecture

To train and evaluate models, configure config.yaml according to the desired experiment; for example, the dataset field can be set to afewva or affectnet. To train the MT-CLAR + SL model, configure mtclar_sl_config.yaml instead.
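As an illustration only (the exact set of fields is defined by the repository's config files), the dataset field could also be switched programmatically with PyYAML before launching an experiment:

# Hypothetical helper: switch the dataset field in config.yaml.
import yaml

with open("config.yaml") as f:
    cfg = yaml.safe_load(f)

cfg["dataset"] = "afewva"  # or "affectnet"

with open("config.yaml", "w") as f:
    yaml.safe_dump(cfg, f)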

To train and evaluate the MT-CLAR model, run the following command:

python mtclar.py

To evaluate the MT-CLAR model with the AFEW-VA support-set, run the following command:

python fsl_mtclar.py
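Conceptually, support-set labelling assigns valence/arousal to unlabelled frames based on their proximity to the labelled support frames in the learned embedding space. The sketch below is a hedged illustration of that idea using a simple nearest-neighbour rule; it is not the logic implemented in fsl_mtclar.py, and the distance metric and aggregation are assumptions.

# Illustrative nearest-neighbour label propagation from a support-set.
import torch

def label_queries(support_emb, support_va, query_emb, k=1):
    """support_emb: (S, D), support_va: (S, 2), query_emb: (Q, D)."""
    dists = torch.cdist(query_emb, support_emb)          # (Q, S) pairwise distances
    knn = dists.topk(k, dim=1, largest=False).indices    # k nearest support frames per query
    return support_va[knn].mean(dim=1)                   # (Q, 2) averaged valence/arousal

# Dummy usage: 10 labelled support frames, 100 query frames, 512-d embeddings.
va_pred = label_queries(torch.randn(10, 512), torch.rand(10, 2), torch.randn(100, 512))
print(va_pred.shape)  # torch.Size([100, 2])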

To train the MT-CLAR + SL model, run the following command:

python mtclar_sl.py
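The MT-CLAR + SL setting corresponds to supervised learning on representations learned via MT-CLAR (see the abstract) for valence, arousal and categorical emotion prediction. Below is a hedged sketch of that idea with a stand-in encoder; the real encoder, freezing strategy and loss are defined by the repository.

# Illustrative sketch (not mtclar_sl.py itself): supervised head on frozen representations.
import torch
import torch.nn as nn

encoder = nn.Sequential(nn.Flatten(), nn.Linear(3 * 224 * 224, 512))  # stand-in for the MT-CLAR encoder
for p in encoder.parameters():
    p.requires_grad = False  # keep the pre-trained representation fixed

head = nn.Linear(512, 2)  # 2 outputs for valence/arousal (use n_classes for categorical emotions)
optimiser = torch.optim.Adam(head.parameters(), lr=1e-3)

x, y = torch.randn(8, 3, 224, 224), torch.rand(8, 2)  # dummy batch of face crops and labels
loss = nn.functional.mse_loss(head(encoder(x)), y)
loss.backward()
optimiser.step()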

