MP_CustomCoOp

This repository adapts CoOp for multilingual object class detection. To that end, we use other pretrained models from open clip and other datasets better suited to a multilingual setting.

Setup

This repository builds upon Dassl.pytorch. Here are the steps to get you going:

  1. Clone the Dassl repo
git clone https://github.com/KaiyangZhou/Dassl.pytorch.git
cd Dassl.pytorch/
  2. Create a conda environment
conda create -y -n dassl python=3.10
  3. Activate the environment
conda activate dassl
  4. Install torch (requires version >= 1.8.1) and torchvision (please refer to https://pytorch.org/ if you need a different CUDA version)
conda install pytorch torchvision cudatoolkit=10.2 -c pytorch
  5. Install the Dassl dependencies
pip install -r requirements.txt
  6. Install Dassl in development mode (no need to rebuild if the source code is modified)
python setup.py develop
  7. Switch to this repository and install its dependencies
cd ..
cd MP_CustomCoOp
pip install -r requirements.txt
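
Once the steps above are done, a quick sanity check can confirm that the environment looks as expected. The following is a minimal sketch and not part of this repository: the function names are hypothetical, and it assumes that Dassl is installed under the import name `dassl`. The minimum torch version is the one stated in step 4.

```python
import importlib.util
import re
from importlib import metadata

MIN_TORCH = (1, 8, 1)  # minimum torch version stated in the setup steps


def parse_version(version):
    """Turn a string such as '2.1.0+cu118' into (2, 1, 0), dropping the build suffix."""
    numbers = re.findall(r"\d+", version.split("+")[0])
    return tuple(int(n) for n in numbers[:3])


def meets_minimum(version, minimum=MIN_TORCH):
    """Return True if the version string is at least the given minimum."""
    return parse_version(version) >= minimum


def check_environment():
    """Report which expected packages can be found, without importing them."""
    report = {}
    for package in ("torch", "torchvision", "dassl"):  # 'dassl' is an assumed import name
        report[package] = importlib.util.find_spec(package) is not None
    if report["torch"]:
        try:
            report["torch_version_ok"] = meets_minimum(metadata.version("torch"))
        except metadata.PackageNotFoundError:
            # Package is importable but ships no distribution metadata; skip the check.
            pass
    return report
```

Calling `check_environment()` inside the activated `dassl` environment returns a dictionary of booleans; any `False` entry points at a step that may need repeating.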

Dataset

So far only the Caltech dataset has been tested. The original repository supports multiple datasets, including imagenet, food101, eurosat and oxford pets. To get this running with Caltech, set up the dataset folder as follows.

The directory structure should look like

caltech-101/
|–– 101_ObjectCategories/
|–– split_zhou_Caltech101.json

(For other datasets, see DATASETS.md in the CoOp repo.)
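
Before training, it can help to check that the layout shown above is actually in place. The snippet below is a small sketch rather than code from this repository: `missing_caltech_entries` is a hypothetical helper, and it assumes that `caltech-101/` sits directly inside the folder you pass as `$DATA`.

```python
from pathlib import Path

# Entries expected inside caltech-101/, taken from the directory layout above.
EXPECTED = ("101_ObjectCategories", "split_zhou_Caltech101.json")


def missing_caltech_entries(data_root):
    """Return the expected paths that do not exist under data_root/caltech-101."""
    dataset_dir = Path(data_root) / "caltech-101"  # assumed location inside $DATA
    return [
        str(dataset_dir / name)
        for name in EXPECTED
        if not (dataset_dir / name).exists()
    ]
```

An empty list means both entries were found; otherwise the returned paths show what is still missing.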

How to run

At the moment this repository only supports CoOp (not CoCoOp). Below are two examples of how to use this repo. Always substitute $DATA with either the absolute path to the data folder on your system or a path relative to this folder.

  1. With the original clip models (to get the same results as in the CoOp paper):
python train.py $DATA rn50
  2. With the new open clip models:
python train.py $DATA roberta-vit-b32 --shots 16 --open_Clip --pretrained laion2b_s12b_b32k

You can also use the same models as the original clip, but loaded from the open clip repository:

python train.py $DATA rn50 --open_Clip 

(Note: for these models there is no need to set a pretrained tag, since it defaults to openai.)
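
When launching several configurations in a row, the commands above can be assembled programmatically. This is a sketch under assumptions: `build_train_command` is a hypothetical helper, it uses only the positional arguments and the `--shots`, `--open_Clip` and `--pretrained` flags that appear in the examples above, and it does not know about any other option `train.py` may accept.

```python
import sys


def build_train_command(data, model, shots=None, open_clip=False, pretrained=None):
    """Assemble the argument list for train.py from the flags shown in the examples."""
    # sys.executable keeps the call inside the currently active environment.
    command = [sys.executable, "train.py", str(data), model]
    if shots is not None:
        command += ["--shots", str(shots)]
    if open_clip:
        command.append("--open_Clip")
    if pretrained is not None:
        command += ["--pretrained", pretrained]
    return command
```

The resulting list can be passed to `subprocess.run(command, check=True)` from the repository folder, for example with `build_train_command(data_dir, "rn50", open_clip=True)`, where `data_dir` is your `$DATA` path.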

Run the list_available_models.py script to check which models are supported and with which pretrained tags:

python list_available_models.py 
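
For orientation, the open clip library itself exposes its model and tag combinations through `open_clip.list_pretrained()`, which returns `(model, pretrained_tag)` pairs. The sketch below groups such pairs by model. It is not necessarily what `list_available_models.py` does, `group_tags` and `load_open_clip_pairs` are hypothetical names, and the model names used by open clip may differ from the shorter names this repository accepts on the command line.

```python
from collections import defaultdict


def group_tags(pairs):
    """Group (model, pretrained_tag) pairs into {model: [tags...]}, keeping order."""
    grouped = defaultdict(list)
    for model, tag in pairs:
        grouped[model].append(tag)
    return dict(grouped)


def load_open_clip_pairs():
    """Fetch the pairs from open clip; imported lazily so group_tags works without it."""
    import open_clip  # requires the open_clip_torch package to be installed

    return open_clip.list_pretrained()
```

With open clip installed, `group_tags(load_open_clip_pairs())` gives a dictionary that can be searched for a model before choosing a `--pretrained` value.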

Evaluation

To evaluate your model, simply call:

python eval.py 

This evaluates all files in the output folder. Substituting the clip models with "the same" open clip models, we got the following results:

Test and train accuracy with standard deviation of different runs.
open_clip_roberta_vit_b32_openai_1.txt test: 36.6% +- 3.8% and train: 63.2% +- 33.4
open_clip_xlm_roberta_base_vit_b_32_openai_1.txt test: 38.3% +- 2.3% and train: 64.6% +- 34.9
clip_rn_50_openai_1.txt test: 83.3% +- 3.1% and train: 87.3% +- 9.8
clip_vit_b_32_openai_1.txt test: 87.4% +- 0.9% and train: 90.4% +- 7.4
open_clip_vit_b_32_openai_1.txt test: 73.6% +- 5.9% and train: 75.7% +- 27.5
open_clip_vit_b_32_laion2b_s34b_b79k.txt test: 85.1% +- 1.5% and train: 87.4% +- 20.0
clip_rn_101_openai_1.txt test: 85.4% +- 3.1% and train: 88.1% +- 8.9
open_clip_rn_50_openai_16.txt test: 88.5% +- 0.7% and train: 58.4% +- 25.5
open_clip_rn_101_openai_1.txt test: 80.2% +- 1.2% and train: 86.0% +- 20.1
open_clip_rn_50_openai_1.txt test: 4.5% +- 0.0% and train: 3.6% +- 3.0
open_clip_roberta-vit-b32_laion2b_s12b_b32k_16.txt test: 88.6% +- 0.5% and train: 70.4% +- 18.0
open_clip_vit_b_16_openai_1.txt test: 86.1% +- 2.1% and train: 87.4% +- 16.5
clip_vit_b_16_openai_1.txt test: 89.7% +- 1.8% and train: 90.5% +- 7.2
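
To compare runs more easily, the result lines above can be parsed and ranked. The following sketch assumes the line format exactly as printed in this listing (including the train deviation that lacks a trailing `%`); `parse_results` and `rank_by_test` are hypothetical helpers and not part of `eval.py`.

```python
import re

# Matches lines such as
# "clip_rn_50_openai_1.txt test: 83.3% +- 3.1% and train: 87.3% +- 9.8"
LINE = re.compile(
    r"^(?P<name>\S+) test: (?P<test>[\d.]+)% \+- (?P<test_std>[\d.]+)%?"
    r" and train: (?P<train>[\d.]+)% \+- (?P<train_std>[\d.]+)%?$"
)


def parse_results(text):
    """Parse result lines into dicts; lines that do not match are skipped."""
    rows = []
    for line in text.splitlines():
        match = LINE.match(line.strip())
        if match is None:
            continue
        row = {"name": match["name"]}
        for key in ("test", "test_std", "train", "train_std"):
            row[key] = float(match[key])
        rows.append(row)
    return rows


def rank_by_test(rows):
    """Sort rows from highest to lowest test accuracy."""
    return sorted(rows, key=lambda row: row["test"], reverse=True)
```

Feeding the saved output of `python eval.py` into `rank_by_test(parse_results(text))` lists the runs from best to worst test accuracy.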
