Gaze Preserving CycleGAN (GPCycleGAN) & Other Driver Gaze Estimation Models and Datasets

PyTorch implementation of the training and inference procedures described in the papers listed under Citations.

Parts of the CycleGAN code have been adapted from the PyTorch-CycleGAN repository.

Installation

  1. Clone this repository
  2. Install Pipenv:
pip3 install pipenv
  3. Install all requirements and dependencies in a new virtual environment using Pipenv:
cd GPCycleGAN
pipenv install
  4. Get the links for the desired PyTorch and Torchvision wheels and install them in the Pipenv virtual environment as follows (the example uses the CUDA 10.0, Python 3.6 builds):
pipenv install https://download.pytorch.org/whl/cu100/torch-1.2.0-cp36-cp36m-manylinux1_x86_64.whl
pipenv install https://download.pytorch.org/whl/cu100/torchvision-0.3.0-cp36-cp36m-linux_x86_64.whl
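
After the wheels are installed, a quick check from inside the Pipenv environment can confirm that both packages import and report the expected versions. The following is a minimal sketch and not part of the repository; the helper name `installed_version` is hypothetical.

```python
import importlib


def installed_version(package_name):
    """Return the package's __version__ string, or None if it cannot be imported."""
    try:
        module = importlib.import_module(package_name)
    except ImportError:
        return None
    # Not every module defines __version__, so fall back to a placeholder.
    return getattr(module, "__version__", "unknown")


if __name__ == "__main__":
    for name in ("torch", "torchvision"):
        print(name, installed_version(name))
    # torch.cuda.is_available() reports whether this build can see a CUDA GPU.
    if installed_version("torch") is not None:
        import torch
        print("CUDA available:", torch.cuda.is_available())
```

Saved under any file name (for example a hypothetical `check_env.py`), it can be run with `pipenv run python check_env.py`. A result of `None` for either package suggests the wheel was installed outside the virtual environment.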

Datasets

LISA Gaze Dataset v0

This is the dataset introduced in the papers On Generalizing Driver Gaze Zone Estimation using Convolutional Neural Networks and Driver Gaze Zone Estimation using Convolutional Neural Networks: A General Framework and Ablative Analysis. To use this dataset, do the following:

  1. Download the complete RGB dataset for driver gaze classification using this link.
  2. Unzip the file.

LISA Gaze Dataset v1

This is the second dataset introduced in the paper Driver Gaze Zone Estimation using Convolutional Neural Networks: A General Framework and Ablative Analysis. To use this dataset, do the following:

  1. Download the complete RGB dataset for driver gaze classification using this link.
  2. Unzip the file.

LISA Gaze Dataset v2

This is the dataset introduced in the paper Driver Gaze Estimation in the Real World: Overcoming the Eyeglass Challenge. To use this dataset, do the following:

  1. Download the complete IR+RGB dataset for driver gaze classification using this link.
  2. Unzip the file.
  3. Prepare the train, val and test splits as follows:
python prepare_gaze_data.py --dataset-dir=/path/to/lisat_gaze_data_v2
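
The v2 training and inference commands in this README refer to four subfolders of the dataset root: ir_no_glasses, ir_all_data, rgb_no_glasses and rgb_all_data. A small check such as the one below can confirm they are present before a long training run is started. This is only a sketch: whether these folders come from the archive itself or from prepare_gaze_data.py is not stated here, and the helper name `missing_subfolders` is hypothetical.

```python
from pathlib import Path

# Subfolder names taken from the v2 commands in this README.
EXPECTED_V2_SUBFOLDERS = (
    "ir_no_glasses",
    "ir_all_data",
    "rgb_no_glasses",
    "rgb_all_data",
)


def missing_subfolders(dataset_root, expected=EXPECTED_V2_SUBFOLDERS):
    """Return the expected subfolder names that do not exist under dataset_root."""
    root = Path(dataset_root)
    return [name for name in expected if not (root / name).is_dir()]


if __name__ == "__main__":
    # Placeholder path; replace with the real dataset location.
    missing = missing_subfolders("/path/to/lisat_gaze_data_v2")
    if missing:
        print("Missing subfolders:", ", ".join(missing))
    else:
        print("All expected subfolders found.")
```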

Training (v0 RGB data)

The best performing SqueezeNet gaze classifier can be trained using the following command:

pipenv shell # activate virtual environment
python gazenet.py --dataset-root-path=/path/to/lisat_gaze_data_v0/ --version=1_1 --snapshot=./weights/squeezenet1_1_imagenet.pth --random-transforms

Training (v1 RGB data)

The best performing SqueezeNet gaze classifier can be trained using the following command:

pipenv shell # activate virtual environment
python gazenet.py --dataset-root-path=/path/to/lisat_gaze_data_v1/ --version=1_1 --snapshot=./weights/squeezenet1_1_imagenet.pth --random-transforms

Training (v2 IR data)

The prescribed three-step training procedure can be carried out as follows:

Step 1: Train the gaze classifier on images without eyeglasses

pipenv shell # activate virtual environment
python gazenet.py --dataset-root-path=/path/to/lisat_gaze_data_v2/ir_no_glasses/ --version=1_1 --snapshot=./weights/squeezenet1_1_imagenet.pth --random-transforms

Step 2: Train the GPCycleGAN model using the gaze classifier from Step 1

python gpcyclegan.py --dataset-root-path=/path/to/lisat_gaze_data_v2/ --data-type=ir --version=1_1 --snapshot-dir=/path/to/trained/gaze-classifier/directory/ --random-transforms

Step 3.1: Create fake images using the trained GPCycleGAN model

python create_fake_images.py --dataset-root-path=/path/to/lisat_gaze_data_v2/ir_all_data/ --snapshot-dir=/path/to/trained/gpcyclegan/directory/
cp /path/to/lisat_gaze_data_v2/ir_all_data/mean_std.mat /path/to/lisat_gaze_data_v2/ir_all_data_fake/mean_std.mat # copy over dataset mean/std information to fake data folder

Step 3.2: Finetune the gaze classifier on all fake images

python gazenet-ft.py --dataset-root-path=/path/to/lisat_gaze_data_v2/ir_all_data_fake/ --version=1_1 --snapshot-dir=/path/to/trained/gpcyclegan/directory/ --random-transforms
exit # exit virtual environment

Training (v2 RGB data)

The prescribed three-step training procedure can be carried out as follows:

Step 1: Train the gaze classifier on images without eyeglasses

pipenv shell # activate virtual environment
python gazenet.py --dataset-root-path=/path/to/lisat_gaze_data_v2/rgb_no_glasses/ --version=1_1 --snapshot=./weights/squeezenet1_1_imagenet.pth --random-transforms

Step 2: Train the GPCycleGAN model using the gaze classifier from Step 1

python gpcyclegan.py --dataset-root-path=/path/to/lisat_gaze_data_v2/ --data-type=rgb --version=1_1 --snapshot-dir=/path/to/trained/gaze-classifier/directory/ --random-transforms

Step 3.1: Create fake images using the trained GPCycleGAN model

python create_fake_images.py --dataset-root-path=/path/to/lisat_gaze_data_v2/rgb_all_data/ --snapshot-dir=/path/to/trained/gpcyclegan/directory/
cp /path/to/lisat_gaze_data_v2/rgb_all_data/mean_std.mat /path/to/lisat_gaze_data_v2/rgb_all_data_fake/mean_std.mat # copy over dataset mean/std information to fake data folder

Step 3.2: Finetune the gaze classifier on all fake images

python gazenet-ft.py --dataset-root-path=/path/to/lisat_gaze_data_v2/rgb_all_data_fake/ --version=1_1 --snapshot-dir=/path/to/trained/gpcyclegan/directory/ --random-transforms
exit # exit virtual environment
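
The IR and RGB procedures above differ only in the folder prefix and the --data-type value, so the commands can be generated from one template. The sketch below builds the command lines as lists and prints them; it does not run them, because the snapshot directories are created by the earlier steps and are not known in advance. The function name `v2_training_commands` and its parameters are hypothetical, while the script names and flags are copied from the commands above.

```python
import shlex


def v2_training_commands(dataset_root, data_type, classifier_dir, gpcyclegan_dir):
    """Build the v2 training commands from this README for data_type 'ir' or 'rgb'."""
    if data_type not in ("ir", "rgb"):
        raise ValueError("data_type must be 'ir' or 'rgb'")
    root = dataset_root.rstrip("/")
    real = f"{root}/{data_type}_all_data"
    fake = f"{root}/{data_type}_all_data_fake"
    return [
        # Step 1: gaze classifier on images without eyeglasses
        ["python", "gazenet.py",
         f"--dataset-root-path={root}/{data_type}_no_glasses/",
         "--version=1_1",
         "--snapshot=./weights/squeezenet1_1_imagenet.pth",
         "--random-transforms"],
        # Step 2: GPCycleGAN using the classifier from Step 1
        ["python", "gpcyclegan.py",
         f"--dataset-root-path={root}/",
         f"--data-type={data_type}",
         "--version=1_1",
         f"--snapshot-dir={classifier_dir}",
         "--random-transforms"],
        # Step 3.1: fake images, then copy the mean/std file next to them
        ["python", "create_fake_images.py",
         f"--dataset-root-path={real}/",
         f"--snapshot-dir={gpcyclegan_dir}"],
        ["cp", f"{real}/mean_std.mat", f"{fake}/mean_std.mat"],
        # Step 3.2: finetune the classifier on the fake images
        ["python", "gazenet-ft.py",
         f"--dataset-root-path={fake}/",
         "--version=1_1",
         f"--snapshot-dir={gpcyclegan_dir}",
         "--random-transforms"],
    ]


if __name__ == "__main__":
    # All paths below are placeholders.
    commands = v2_training_commands(
        "/path/to/lisat_gaze_data_v2/", "ir",
        "/path/to/trained/gaze-classifier/directory/",
        "/path/to/trained/gpcyclegan/directory/",
    )
    for cmd in commands:
        print(" ".join(shlex.quote(arg) for arg in cmd))
```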

Inference (v0 RGB data)

Inference can be carried out using this script as follows:

pipenv shell # activate virtual environment
python infer.py --dataset-root-path=/path/to/lisat_gaze_data_v0/ --split=val --version=1_1 --snapshot-dir=/path/to/trained/rgb-model/directory/ --save-viz
exit # exit virtual environment

Inference (v1 RGB data)

Inference can be carried out using this script as follows:

pipenv shell # activate virtual environment
python infer.py --dataset-root-path=/path/to/lisat_gaze_data_v1/ --split=val --version=1_1 --snapshot-dir=/path/to/trained/rgb-model/directory/ --save-viz
exit # exit virtual environment

Inference (v2 IR data)

Inference can be carried out using this script as follows:

pipenv shell # activate virtual environment
python infer.py --dataset-root-path=/path/to/lisat_gaze_data_v2/ir_all_data/ --split=test --version=1_1 --snapshot-dir=/path/to/trained/ir-models/directory/ --save-viz
exit # exit virtual environment

Inference (v2 RGB data)

Inference can be carried out using this script as follows:

pipenv shell # activate virtual environment
python infer.py --dataset-root-path=/path/to/lisat_gaze_data_v2/rgb_all_data/ --split=val --version=1_1 --snapshot-dir=/path/to/trained/rgb-models/directory/ --save-viz
exit # exit virtual environment
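
The inference commands above share one shape and vary only in the dataset path, the split, and the snapshot directory. When several trained models need to be evaluated in one go, a thin wrapper like the following could be used. It is a sketch under the assumption that it is run from the repository root inside the Pipenv environment; the names `infer_command` and `run_inference` are hypothetical, and the flags are copied from the commands above.

```python
import subprocess
import sys


def infer_command(dataset_root, split, snapshot_dir):
    """Build the infer.py command line used in this README as an argument list."""
    return [
        sys.executable, "infer.py",
        f"--dataset-root-path={dataset_root}",
        f"--split={split}",
        "--version=1_1",
        f"--snapshot-dir={snapshot_dir}",
        "--save-viz",
    ]


def run_inference(configs):
    """Run infer.py once per (dataset_root, split, snapshot_dir) tuple."""
    for dataset_root, split, snapshot_dir in configs:
        # check=True raises CalledProcessError if a run exits with an error.
        subprocess.run(infer_command(dataset_root, split, snapshot_dir), check=True)
```

Calling `run_inference` with a list of tuples, one per trained model, runs the evaluations back to back and stops at the first failure.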

Pre-trained Weights

You can download our pre-trained model weights using this link.

Config files, logs, results, snapshots, and visualizations from running the above scripts will be stored in the GPCycleGAN/experiments folder by default.
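
Several commands above take a --snapshot-dir that points at the output of an earlier run. Assuming each run writes into its own subfolder of the experiments folder (an assumption, not confirmed here), the most recent run could be located with a helper like this one. The name `latest_run_dir` is hypothetical.

```python
from pathlib import Path


def latest_run_dir(experiments_dir="experiments"):
    """Return the most recently modified subdirectory of experiments_dir, or None."""
    root = Path(experiments_dir)
    if not root.is_dir():
        return None
    run_dirs = [p for p in root.iterdir() if p.is_dir()]
    # default=None covers the case of an empty experiments folder.
    return max(run_dirs, key=lambda p: p.stat().st_mtime, default=None)


if __name__ == "__main__":
    print(latest_run_dir("experiments"))
```

Modification time is only a heuristic, so it is worth checking the printed path before passing it to a training or inference script.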

Citations

If you find our data, code, and/or models useful in your research, please consider citing the following papers:

@inproceedings{vora2017generalizing,
  title={On generalizing driver gaze zone estimation using convolutional neural networks},
  author={Vora, Sourabh and Rangesh, Akshay and Trivedi, Mohan M},
  booktitle={2017 IEEE Intelligent Vehicles Symposium (IV)},
  pages={849--854},
  year={2017},
  organization={IEEE}
}

@article{vora2018driver,
  title={Driver gaze zone estimation using convolutional neural networks: A general framework and ablative analysis},
  author={Vora, Sourabh and Rangesh, Akshay and Trivedi, Mohan Manubhai},
  journal={IEEE Transactions on Intelligent Vehicles},
  volume={3},
  number={3},
  pages={254--265},
  year={2018},
  publisher={IEEE}
}

@article{rangesh2020driver,
  title={Gaze Preserving CycleGANs for Eyeglass Removal \& Persistent Gaze Estimation},
  author={Rangesh, Akshay and Zhang, Bowen and Trivedi, Mohan M},
  journal={arXiv preprint arXiv:2002.02077},
  year={2020}
}
