Multimodal Variational Contrastive Learning for Few-Shot Classification (MVC)

PyTorch implementation for the paper: Multimodal Variational Contrastive Learning for Few-Shot Classification

Dependencies

  • python 3.6.5
  • numpy 1.16.0
  • torch 1.8.0
  • tqdm 4.57.0
  • scipy 1.5.4
  • torchvision 0.9.0
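
A quick way to compare an existing environment against the list above is sketched below. It assumes each package exposes a `__version__` attribute and treats a local build suffix (for example `+cu111`) as irrelevant; the `PINNED` dictionary and the `report_versions` helper are illustrative names, not part of this repository.

```python
from importlib import import_module

# Versions copied from the Dependencies list above (Python itself excluded).
PINNED = {
    "numpy": "1.16.0",
    "torch": "1.8.0",
    "tqdm": "4.57.0",
    "scipy": "1.5.4",
    "torchvision": "0.9.0",
}


def report_versions(pinned):
    """Return (name, wanted, found, ok) rows; found is None if not importable."""
    rows = []
    for name, wanted in pinned.items():
        try:
            module = import_module(name)
            found = str(getattr(module, "__version__", "unknown"))
        except ImportError:
            found = None
        # Assumption: a local build suffix such as "+cu111" does not matter.
        ok = found is not None and found.split("+")[0] == wanted
        rows.append((name, wanted, found, ok))
    return rows
```

Calling `report_versions(PINNED)` and printing the rows shows which packages are missing or differ from the pinned versions. Other versions may still work; the list above only records what is stated here.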

Overview

The effectiveness of metric-based few-shot learning methods relies heavily on the discriminative ability of the prototypes and of the query feature embeddings. However, instance-level unimodal prototypes often fall short of capturing the essence of each category. To this end, we propose a multimodal variational contrastive learning framework that aims to enhance prototype representativeness and refine the discrimination of query features by acquiring distribution-level representations. Specifically, our approach first trains a variational auto-encoder through supervised contrastive learning in both the visual and semantic spaces. The trained model is then used to augment the support set by repeatedly sampling features from the learned semantic distributions, and to generate pseudo-semantics for queries, so that samples in the support and query sets carry balanced information. Furthermore, we establish a multimodal instance-to-distribution model that learns to transform instance-level multimodal features into distribution-level representations via variational inference, facilitating a robust metric. Experiments conducted across several benchmarks consistently demonstrate the superiority of our method in terms of classification accuracy and robustness.
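
To make the support-set augmentation step more concrete, here is a minimal sketch of one way it could look. It is not the code in this repository. It assumes each class distribution is a diagonal Gaussian given by a mean and a log-variance, draws samples with the reparameterization trick, averages them with the real support features to form a prototype, and classifies a query by its nearest prototype. All function names and the toy values are hypothetical, and plain Python lists stand in for the tensors a PyTorch implementation would use.

```python
import math
import random


def sample_gaussian(mu, log_var, n_samples, rng):
    """Draw n_samples vectors z = mu + sigma * eps with eps ~ N(0, I)."""
    sigma = [math.exp(0.5 * lv) for lv in log_var]
    return [
        [m + s * rng.gauss(0.0, 1.0) for m, s in zip(mu, sigma)]
        for _ in range(n_samples)
    ]


def mean_vector(vectors):
    """Element-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def augmented_prototype(support, mu, log_var, n_samples, rng):
    """Prototype = mean of the real support features plus the sampled ones."""
    sampled = sample_gaussian(mu, log_var, n_samples, rng)
    return mean_vector(support + sampled)


def nearest_prototype(query, prototypes):
    """Return the label whose prototype has the smallest squared distance."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    return min(prototypes, key=lambda label: sq_dist(query, prototypes[label]))


if __name__ == "__main__":
    rng = random.Random(0)
    # Toy 1-shot, 2-way episode with 2-D features (all values are made up).
    support = {"cat": [[1.0, 0.0]], "dog": [[0.0, 1.0]]}
    # Hypothetical learned (mu, log_var) per class.
    dists = {
        "cat": ([1.0, 0.0], [-4.0, -4.0]),
        "dog": ([0.0, 1.0], [-4.0, -4.0]),
    }
    prototypes = {
        label: augmented_prototype(support[label], *dists[label], n_samples=5, rng=rng)
        for label in support
    }
    print(nearest_prototype([0.9, 0.1], prototypes))
```

The sketch uses a Euclidean nearest-prototype rule only for brevity; the distribution-level metric described above is not reproduced here.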

Download the Datasets

Running Experiments

If you want to train the models from scratch, first run run_pre.py to pretrain the backbone. Then set the path of the pretrained checkpoints to "./checkpoints/[dataname]".

  • Run pretrain phase:
python run_pre.py
  • Run few-shot training and test phases:
python run_mvc.py
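
Before starting the few-shot phase, it can help to confirm that the pretrained checkpoints are where the instructions above expect them. The following is a small sketch under assumptions: it only checks that "./checkpoints/[dataname]" exists and holds at least one file, since the checkpoint file names and the way run_mvc.py loads them are not described here. The helper names are hypothetical.

```python
import os


def checkpoint_dir(dataname, root="./checkpoints"):
    """Build the directory path that should hold one dataset's checkpoints."""
    return os.path.join(root, dataname)


def has_checkpoints(dataname, root="./checkpoints"):
    """True if the directory exists and contains at least one regular file."""
    path = checkpoint_dir(dataname, root)
    if not os.path.isdir(path):
        return False
    return any(
        os.path.isfile(os.path.join(path, name)) for name in os.listdir(path)
    )
```

Calling `has_checkpoints(dataname)` with the dataset name you pretrained on returns False when run_pre.py has not yet produced anything in that directory.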

LICENSE

  • All materials are made available under the terms of the Creative Commons Attribution-NonCommercial 4.0 International Public License (CC BY-NC 4.0). You can find details at: https://creativecommons.org/licenses/by-nc/4.0/legalcode

  • The license gives permission for academic use only.
