Second-order Democratic Aggregation

Created by Tsung-Yu Lin, Subhransu Maji and Piotr Koniusz.

Introduction

This repository contains the code for reproducing the results in our ECCV 2018 paper:

@inproceedings{lin2018o2dp,
    Author = {Tsung-Yu Lin and Subhransu Maji and Piotr Koniusz},
    Title = {Second-order Democratic Aggregation},
    Booktitle = {European Conference on Computer Vision (ECCV)},
    Year = {2018}
}

The paper analyzes various feature aggregators in the context of second-order features and proposes γ-democratic pooling, which generalizes sum pooling and democratic aggregation. See the project page and the paper for details. The code was tested on Ubuntu 14.04 with an NVIDIA Titan X GPU and MATLAB R2016a.
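In outline, the aggregator re-weights each local feature before second-order pooling, with γ controlling how strongly the contributions are equalized. Roughly (see the paper for the exact objective and constraints), writing k(·,·) for the similarity between second-order features, the weights α satisfy

    α_i · Σ_j α_j k(x_i, x_j) = ( Σ_j k(x_i, x_j) )^γ   for every feature x_i, with α_i ≥ 0,

so for γ = 1 the constraint is met by α = 1 (plain sum pooling), while γ = 0 forces every feature's total contribution to be the same (democratic aggregation).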

Prerequisites

  1. MatConvNet: Our code was developed on MatConvNet version 1.0-beta24.
  2. VLFEAT
  3. bcnn-package: The package includes our implementation of customized layers.

The packages are set up as git submodules. Check them out with the following commands, then follow the instructions on the MatConvNet and VLFEAT project pages to install them.

>> git submodule init
>> git submodule update

Datasets

To run the experiments, download the following datasets and edit the model_setup.m file to point the code to the dataset locations. For instance, set opts.cubDir = 'data/cub' to point to the birds dataset directory.
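As a sketch, the relevant lines in model_setup.m look like the following. Only opts.cubDir is taken from this README; the other field names are placeholders, so check model_setup.m for the exact option used by each dataset.

    % model_setup.m -- point each dataset option at its local copy
    opts.cubDir      = 'data/cub';            % Caltech-UCSD Birds (field name from this README)
    opts.carsDir     = 'data/cars';           % Stanford Cars (placeholder field name)
    opts.aircraftDir = 'data/fgvc-aircraft';  % FGVC Aircrafts (placeholder field name)
    opts.dtdDir      = 'data/dtd';            % DTD (placeholder field name)
    opts.fmdDir      = 'data/fmd';            % FMD (placeholder field name)
    opts.mitDir      = 'data/mit_indoor';     % MIT Indoor (placeholder field name)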

Fine-grained classification datasets: Caltech-UCSD Birds (CUB), Stanford Cars, and FGVC Aircrafts.

Texture and indoor scene datasets: DTD (Describable Textures), FMD (Flickr Material Database), and MIT Indoor.

Pre-trained models

  • ImageNet LSVRC 2012 pre-trained models: The vgg-verydeep-16 and resnet-101 ImageNet pre-trained models are used as our base models. Download them from the MatConvNet pre-trained models page.
  • B-CNN fine-tuned models: We also provide B-CNN models fine-tuned with vgg-verydeep-16, from which the CNN features are extracted and aggregated to construct the image descriptor. Download the models for CUB Birds, FGVC Aircrafts, or Stanford Cars to reproduce the accuracies reported in the paper.

Testing the models

Solving for the coefficients of γ-democratic aggregation involves a Sinkhorn iteration. The hyper-parameters of the Sinkhorn iteration are configurable in the entry scripts run_experiments_o2dp.m and run_experiments_sketcho2dp_resnet.m; see the comments in the code for details.
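For intuition, the sketch below shows the kind of multiplicative fixed-point update such an iteration performs. It is an illustration under assumptions rather than the repository's implementation: the function name, the squared-dot-product kernel, and the nIter/damping parameters are made up here, and the actual layers in bcnn-package handle damping and numerical details differently.

    % Illustrative sketch (not the repo code): gamma-democratic weights via a
    % dampened, Sinkhorn-style multiplicative fixed-point iteration.
    % X       : n x d matrix of local CNN features
    % gamma   : interpolation parameter (1 = sum pooling, 0 = democratic)
    % nIter, damping : iteration count and damping factor (assumed parameters)
    function alpha = gamma_democratic_weights(X, gamma, nIter, damping)
        K = (X * X').^2;               % non-negative similarities between second-order features
        target = sum(K, 2).^gamma;     % desired total contribution of each feature
        alpha = ones(size(K, 1), 1);   % alpha = 1 is already the fixed point when gamma = 1
        for t = 1:nIter
            contrib = alpha .* (K * alpha);                           % current contribution of each feature
            alpha = alpha .* (target ./ max(contrib, eps)).^damping;  % dampened multiplicative update
        end
    end

At a fixed point, α_i Σ_j α_j K_ij matches the target for every i, which is the γ-democratic constraint; the corresponding hyper-parameters used by the actual code are the ones exposed in the entry scripts.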

  • Second-order γ-democratic aggregation: Point the variable model_path to the location of the model in run_experiments_o2dp.m and run the command run_experiments_o2dp(dataset, gamma, gpuidx) in the MATLAB terminal.

    • For example:
    % gamma is the hyper-parameter gamma for gamma-democratic aggregation
    % gpuidx is the index of gpu on which you run the experiment
    run_experiments_o2dp('mit_indoor', 0.3, 1) 
    • Classification results: Sum and democratic aggregation are obtained by setting γ accordingly; the optimal γ values are given in parentheses, and γ=0.5 generally performs reasonably well. For DTD and FMD these numbers are reported on the first split. For the fine-grained recognition datasets (†) the results use the fine-tuned B-CNN models, while the texture and indoor scene datasets use the ImageNet pre-trained vgg-verydeep-16 model. The example after the table shows how the three columns map onto calls to run_experiments_o2dp.

      Dataset                Sum (γ=1)   Democratic (γ=0)   γ-democratic
      Caltech-UCSD Birds †   84.0        84.7               84.9 (0.5)
      Stanford Cars †        90.6        89.7               90.8 (0.5)
      FGVC Aircrafts †       85.7        86.7               86.7 (0.0)
      DTD                    71.2        72.2               72.3 (0.3)
      FMD                    84.6        82.8               84.8 (0.8)
      MIT Indoor             79.5        79.6               80.4 (0.3)
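
    • For instance, the three columns of the MIT Indoor row correspond to three calls to the same entry script, varying only γ:

      run_experiments_o2dp('mit_indoor', 1.0, 1)   % sum pooling (gamma = 1)
      run_experiments_o2dp('mit_indoor', 0.0, 1)   % democratic aggregation (gamma = 0)
      run_experiments_o2dp('mit_indoor', 0.3, 1)   % gamma-democratic at the optimal gamma for MIT Indoor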
  • Second-order γ-democratic aggregation in sketch space: Point the variable model_path to the location of the model in run_experiments_sketcho2dp_resnet.m and run the command run_experiments_sketcho2dp_resnet(dataset, gamma, d, gpuidx) in the MATLAB terminal.

    • For example:
    % gamma is the hyper-parameter gamma for gamma-democratic aggregation
    % d is the dimension for the sketch space
    % gpuidx is the index of gpu on which you run the experiment
    run_experiments_sketcho2dp_resnet('mit_indoor', 0.5, 8192, 1) 
    • The script aggregates second-order, ImageNet pre-trained ResNet features in an 8192-dimensional sketch space with the γ-democratic aggregator. With ResNet features the model achieves the following results; for DTD and FMD the accuracy is averaged over 10 splits.

      Dataset    DTD          FMD          MIT Indoor
      Accuracy   76.2 ± 0.7   84.3 ± 1.5   84.3
