hadamard-matrix-for-hashing's Introduction

Code for the paper: Central Similarity Quantization for Efficient Image and Video Retrieval (arXiv)

We release all code and configurations for image hashing.

Update: video hashing code has been released here

Prerequisites

Ubuntu 16.04

NVIDIA GPU + CUDA and the corresponding PyTorch framework (v0.4.1)

Python 3.6

Datasets

  1. Download the database file for the ImageNet retrieval list from the anonymous link here, and put database.txt in 'data/imagenet/'.

  2. Download MS COCO, ImageNet2012, and NUS_WIDE from their official websites: COCO, ImageNet, NUS_WIDE. Unzip all data and put it in 'data/dataset_name/'.

Hash center (target)

We provide the hash centers used for ImageNet in 'data/imagenet/hash_centers'. The method to generate hash centers is given in the tutorial: Tutorial_ hash_center_generation.ipynb
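For a quick picture of the idea (a minimal sketch, not the notebook's exact code): the rows of a K×K Hadamard matrix are mutually orthogonal ±1 vectors whose pairwise Hamming distance is exactly K/2, so they and their negations give up to 2K well-separated hash centers:

```python
# Minimal sketch of Hadamard-based hash centers. Assumes bit is a power of 2
# and n_class <= 2 * bit; the notebook additionally handles random sampling
# when there are more classes than Hadamard rows.
import numpy as np
from scipy.linalg import hadamard

def get_hash_centers(n_class, bit):
    H_K = hadamard(bit)                    # bit x bit matrix of +-1 entries
    H_2K = np.concatenate((H_K, -H_K), 0)  # 2*bit candidate centers
    assert n_class <= H_2K.shape[0]
    return H_2K[:n_class]                  # one +-1 center per class

centers = get_hash_centers(100, 64)        # e.g. 100 ImageNet classes, 64 bits
```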

Test

Pretrained models are available on Google Drive, or you can directly download them from the release.

Generating hash codes for the database will take a long time because of its large size.

Test for imagenet:

Download the pre-trained model 'imagenet_64bit_0.8734_resnet50.pkl' for ImageNet, put it in 'data/imagenet/', then run:

python test.py --data_name imagenet --gpus 0,1  --R 1000  --model_name 'imagenet_64bit_0.8734_resnet50.pkl' 

Test for coco:

Download the pre-trained model 'coco_64bit_0.8612_resnet50.pkl' for COCO, put it in 'data/coco/', then run:

python test.py --data_name coco --gpus 0,1  --R 5000  --model_name 'coco_64bit_0.8612_resnet50.pkl' 

Test for nus_wide:

Download the pre-trained model 'nus_wide_64bit_0.8391_resnet50.pkl' for NUS_WIDE, put it in 'data/nus_wide/', then run:

python test.py --data_name nus_wide --gpus 0,1  --R 5000  --model_name 'nus_wide_64bit_0.8391_resnet50.pkl' 
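The --R flag is the cut-off for MAP@R (1000 for ImageNet, 5000 for COCO and NUS_WIDE). For reference, here is a minimal sketch of MAP@R over Hamming ranking, assuming ±1 codes and multi-hot label arrays; this is assumed evaluation logic, not the repo's exact test.py code:

```python
import numpy as np

def map_at_r(q_codes, q_labels, db_codes, db_labels, R):
    """Mean average precision over the top-R database items by Hamming rank."""
    APs = []
    for code, label in zip(q_codes, q_labels):
        # Hamming distance from the inner product of +-1 codes
        dist = 0.5 * (db_codes.shape[1] - db_codes @ code)
        topR = np.argsort(dist)[:R]
        rel = (db_labels[topR] @ label) > 0   # relevant = shares >= 1 label
        if rel.sum() == 0:
            APs.append(0.0)
            continue
        prec = np.cumsum(rel) / np.arange(1, R + 1)
        APs.append(float((prec * rel).sum() / rel.sum()))
    return float(np.mean(APs))
```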

The retrieval MAP on the three datasets is shown below:

Dataset    MAP(16bit)  MAP(32bit)  MAP(64bit)
ImageNet   0.851       0.865       0.873
MS COCO    0.796       0.838       0.861
NUS_WIDE   0.810       0.825       0.839

Train

Train on imagenet, hash bits: 64

The trained model will be saved in 'data/imagenet/models/'.

python train.py --data_name imagenet --hash_bit 64 --gpus 0,1 --model_type resnet50 --lambda1 0  --lambda2 0.05  --R 1000

Train on coco, hash bits: 64

The trained model will be saved in 'data/coco/models/'.

python train.py --data_name coco --hash_bit 64 --gpus 0,1 --model_type resnet50 --lambda1 0  --lambda2 0.05 --multi_lr 0.05  --R 5000

Train on nus_wide, hash bits: 64

The trained model will be saved in 'data/nus_wide/models/'.

python train.py --data_name nus_wide --hash_bit 64 --gpus 0,1 --model_type resnet50 --lambda1 0  --lambda2 0.05  --multi_lr 0.05 --R 5000

AlexNet as backbone.

Pretrained models for AlexNet are available here. Pre-trained models for COCO will be provided in the future.

The retrieval MAP on ImageNet and NUS_WIDE is shown below:

Dataset    MAP(16bit)  MAP(32bit)  MAP(64bit)
ImageNet   0.601       0.653       0.695
NUS_WIDE   0.744       0.785       0.789

Train on ImageNet, 16bit

python train.py --data_name imagenet --hash_bit 16 --gpus 2 --model_type Alexnet --lambda1 0  --lambda2 0.001  --R 1000 --eval_frequency 1 --lr 0.0001

Train on ImageNet, 32bit

python train.py --data_name imagenet --hash_bit 32 --gpus 2 --model_type Alexnet --lambda1 0  --lambda2 0.001  --R 1000 --eval_frequency 1 --lr 0.0001

Train on ImageNet, 64bit

python train.py --data_name imagenet --hash_bit 64 --gpus 2 --model_type Alexnet --lambda1 0  --lambda2 0.0001  --R 1000 --eval_frequency 1 --lr 0.0001

Train on NUS_WIDE, 16bit

python train.py --data_name nus_wide --hash_bit 16 --gpus 2 --model_type Alexnet --lambda1 0  --lambda2 0.001  --R 5000 --eval_frequency 1 --lr 0.0001

Train on NUS_WIDE, 32bit

python train.py --data_name nus_wide --hash_bit 32 --gpus 2 --model_type Alexnet --lambda1 0  --lambda2 0.001  --R 5000 --eval_frequency 1 --lr 0.0001

Train on NUS_WIDE, 64bit

python train.py --data_name nus_wide --hash_bit 64 --gpus 2 --model_type Alexnet --lambda1 0  --lambda2 0.001  --R 5000 --eval_frequency 1 --lr 0.0001

Reference

If you find this repo useful, please consider citing:

@inproceedings{yuan2020central,
  title={Central Similarity Quantization for Efficient Image and Video Retrieval},
  author={Yuan, Li and Wang, Tao and Zhang, Xiaopeng and Tay, Francis EH and Jie, Zequn and Liu, Wei and Feng, Jiashi},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  pages={3083--3092},
  year={2020}
}


hadamard-matrix-for-hashing's Issues

About loss function

Why is the code different from the loss function in the paper? Q_loss is completely different from L_Q in the paper.
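For context, the paper's objective is L_C + λ·L_Q, where L_C is a binary cross-entropy pulling each output toward its class's hash center and L_Q is a log-cosh quantization penalty. Below is a minimal sketch of the paper's formulation, which, as this issue notes, may differ from what the repo's code actually computes; λ is presumably what --lambda2 sets in the training commands above:

```python
# Hedged sketch of the CSQ objective as written in the paper; not
# necessarily the repo's implementation (which is what this issue asks about).
import torch
import torch.nn.functional as F

def csq_loss(h, centers, lam=0.05):
    # h: network outputs in (-1, 1); centers: matching rows of +-1 hash centers
    p = 0.5 * (h + 1)        # map outputs to (0, 1) for the BCE term
    t = 0.5 * (centers + 1)  # binary targets in {0, 1}
    L_C = F.binary_cross_entropy(p.clamp(1e-6, 1 - 1e-6), t)
    L_Q = torch.log(torch.cosh(h.abs() - 1)).mean()  # quantization penalty
    return L_C + lam * L_Q
```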

pairwise loss

Hi,

The paper doesn't use any pairwise loss, so why does the implementation differ from what the paper claims?

Thanks in advance.

Video Hashing

Thanks for the great work!
I'm interested in your video hashing work. Can you release the video hashing code?
My email is [email protected]. I hope we can communicate.

class num more than 40,000

Hi, I want to train CSQ on a person re-ID task. The number of classes is more than 40,000, and the code used to set the hash target centers (below) can be extremely time-consuming. Do you have any suggestions?
@yuanli2333

```python
# excerpt from the hash-center generation code (indentation fixed, imports
# added); H_2K, hash_targets, n_class and bit come from the surrounding script
import random
import numpy as np
import torch

if H_2K.shape[0] < n_class:
    hash_targets.resize_(n_class, bit)
    for k in range(20):
        for index in range(H_2K.shape[0], n_class):
            ones = torch.ones(bit)
            # flip half of the bits at random (Bernoulli-style sampling)
            sa = random.sample(list(range(bit)), bit // 2)
            ones[sa] = -1
            hash_targets[index] = ones
        # find the average/min pairwise Hamming distance
        c = []
        for i in range(n_class):
            for j in range(n_class):
                if i < j:
                    TF = sum(hash_targets[i] != hash_targets[j])
                    c.append(TF)
        c = np.array(c)

        # choose min(c) in the range of K/4 to K/3
        # see https://github.com/yuanli2333/Hadamard-Matrix-for-hashing/issues/1
        # but it is hard when bit is small
        if c.min() > bit / 4 and c.mean() >= bit / 2:
            print(c.min(), c.mean())
            break
```
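Not an answer from the authors, but one plausible workaround for very large class counts: accept random ±1 candidates greedily, checking each new candidate only against the centers already accepted instead of recomputing all pairwise distances every round. A hedged sketch:

```python
# Assumed alternative, not the repo's method: greedy incremental sampling
# scales roughly linearly in the number of accepted centers per candidate.
import random
import torch

def sample_extra_centers(n_class, bit, min_dist=None, max_tries=200):
    min_dist = min_dist if min_dist is not None else bit // 4
    centers = []
    while len(centers) < n_class:
        for _ in range(max_tries):
            c = torch.ones(bit)
            c[random.sample(range(bit), bit // 2)] = -1  # flip half the bits
            if all(int((c != z).sum()) > min_dist for z in centers):
                centers.append(c)
                break
        else:
            raise RuntimeError("could not place a center; lower min_dist")
    return torch.stack(centers)
```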

How to create the train.txt and test.txt?

I have a question about how to create train.txt and test.txt, whose format differs from my own dataset's. Also, what is the meaning of the hyper-parameters "lambda0", "lambda1", and "lambda2"? Thank you!

give an image to retrieve its similar images

I ran the source code (train.py and test.py), and the results are great. The paper shows an example of giving an image to find other similar images, but the code doesn't seem to include it. Could you tell me how to achieve it? Does it just take an image as input, or does it need a label?
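For what it's worth, a minimal sketch of that use case (an assumed helper, not code from this repo): binarize the model output for the query image, then rank the precomputed database codes by Hamming distance. No label is needed at query time.

```python
import numpy as np

def retrieve(query_code, db_codes, db_paths, topk=10):
    # query_code: (bit,), db_codes: (N, bit), both with +-1 entries
    dist = 0.5 * (db_codes.shape[1] - db_codes @ query_code)
    order = np.argsort(dist)[:topk]          # closest codes first
    return [(db_paths[i], int(dist[i])) for i in order]
```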

Does the model use pretrained weights for ImageNet training?

Hi, thank you for the good work.

I have a question about the ImageNet model training.
I think the model uses ImageNet-pretrained weights when training on ImageNet. [The code line]

If there are options for training on ImageNet that I am missing, or this is wrong information, please let me know.

Anyway, thank you for your wonderful work again.

parameters for cifar10

Hello!
Thanks for the great work!
I'm interested in your hashing work. I tried to run this code with AlexNet on CIFAR-10, but it didn't work well.
Can you release your parameters for CIFAR-10 or give some advice for running CIFAR-10 with AlexNet?

Additional Files for Kinetics400

Hi,

First and foremost, amazing work! Thanks for having the code and models publicly available.

I was wondering if you could have the dataset/Kinetics directory and train_kinetics.py file uploaded as well?

Regards
Arun George

crop size

The crop size is set to be 224x224, which is not the same crop size as in HashNet (227x227). Is there a reason for that?

nus_wide MAP?

May I know your specific parameters for NUS_WIDE? Using your repository's parameters with a ResNet model and a 64-bit hash code on the NUS_WIDE dataset, I only get 0.8188.

About ablation study and L_Q loss

Hi, I'm wondering why the ablation study in the paper doesn't give the result of using the L_Q loss only. If that result were obviously worse than using L_C only, it would demonstrate what role L_C plays in this task.
I'm also curious why you use a loss that forcibly pulls predictions to a predefined hash center, which is a singular point that is hard to converge to (I know that part is discussed in the paper). I just want to know why such a loss function was created and how important it is.
Your reply will be highly appreciated.

Video hashing

Thanks for the great work!

Super excited to play around with video hashing. Are there any plans for releasing the video code and configurations?

AlexNet on COCO

Can you please share the training parameters for AlexNet on COCO?

about comparing similar results

Thanks for your work. I tested a pair of images with the model imagenet_64bit_0.8734_resnet50.pkl and got a Hamming distance of 0.
This is my test code:

```python
import torch
from PIL import Image
import pre_process as prep  # the repo's preprocessing module (assumed import name)

class CSQ(object):
    def __init__(self):
        self.model_path = 'checkpoint/imagenet_64bit_0.8734_resnet50.pkl'
        self.device = "cuda" if torch.cuda.is_available() else "cpu"
        self.transform = prep.image_test(resize_size=255, crop_size=224)
        self.load_model()

    def load_model(self):
        self.model = torch.load(self.model_path)
        self.model = self.model.module  # unwrap DataParallel
        self.model.to(device=self.device)
        self.model.eval()

    def forward(self, img_path):
        tensor_img = self.transform(Image.open(img_path).convert('RGB')).unsqueeze(0).to(self.device)
        with torch.no_grad():
            out = self.model(tensor_img)

        # binarize the network output to a +-1 code
        hash_code = out.cpu().numpy()
        hash_code[hash_code < 0] = -1
        hash_code[hash_code >= 0] = 1

        # return the code as a 0/1 string
        code_list = ''.join(['1' if item == 1.0 else '0' for item in hash_code[0].tolist()])
        return code_list
```
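A small usage note for the snippet above (with hypothetical image paths): since the returned strings are 0/1 codes, the Hamming distance between two images is just the number of differing characters, so a distance of 0 means the two images map to identical codes.

```python
csq = CSQ()
c1 = csq.forward('query1.jpg')  # hypothetical paths
c2 = csq.forward('query2.jpg')
print(sum(a != b for a, b in zip(c1, c2)))  # Hamming distance between codes
```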


About training my own dataset

How can I get my own 'database.txt', 'train.txt' and 'test.txt' when training on my own dataset? What is the relationship between them? For example, are 'train.txt' and 'test.txt' included in 'database.txt'? And what do the 0s and 1s after the file name in these three files represent?

I really need your answer, thank you very much!
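As far as I can tell (an assumption, not the authors' answer), the repo follows the HashNet-style list format: each line is an image path followed by a space-separated multi-hot label vector, one 0/1 entry per class; database.txt is the full retrieval set, and test.txt holds the held-out queries. A minimal parser sketch:

```python
def parse_list(path):
    """Parse lines of the form 'image_path l0 l1 ... l(C-1)' into (path, labels)."""
    items = []
    with open(path) as f:
        for line in f:
            tokens = line.split()
            if not tokens:
                continue
            items.append((tokens[0], [int(t) for t in tokens[1:]]))
    return items
```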

number of training images for imagenet

Hi, first, I really appreciate you sharing the code. I have a question about the experimental settings.
In your paper, the experimental setting for ImageNet is

ImageNet image 10,000 5,000 128,495 100:1

where the number of training images is 10,000. However, the number of training images in this repo is 13,000.
Which one is right for reproducing the results in the paper?
