
rt-st-action-localization's Introduction

Learning Motion Representation for Real-Time Spatio-Temporal Action Localization

A PyTorch implementation of our work.

Our work builds on the PyTorch implementation of Online Real-time Multiple Spatiotemporal Action Localisation and Prediction.

Environment

  • Ubuntu 16.04
  • Python 3.6
  • CUDA 8.0
  • cuDNN 7.1
  • PyTorch 0.4.0
  • OpenCV 3.4
  • MATLAB 2016b (only needed to compute the video-level results)
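
Before training, it can help to confirm that the installed Python packages match the list above. The following is a minimal sketch, not part of this repository: the helper name `check_versions` is hypothetical, and the import names `torch` and `cv2` are assumed to be the usual ones for PyTorch and OpenCV.

```python
import importlib

# Version prefixes taken from the environment list above. The module names
# ("torch", "cv2") are the usual import names and are an assumption here.
EXPECTED = {"torch": "0.4.0", "cv2": "3.4"}


def check_versions(expected):
    """Return {module_name: (found_version_or_None, matches_expected)}."""
    report = {}
    for name, wanted in expected.items():
        try:
            module = importlib.import_module(name)
        except ImportError:
            # Package is not installed (or cannot be imported).
            report[name] = (None, False)
            continue
        found = str(getattr(module, "__version__", "unknown"))
        report[name] = (found, found.startswith(wanted))
    return report


if __name__ == "__main__":
    for name, (found, ok) in check_versions(EXPECTED).items():
        status = "OK" if ok else "MISMATCH"
        print(f"{name}: expected {EXPECTED[name]}*, found {found} [{status}]")
```

The check compares version prefixes only, so a patch release such as OpenCV 3.4.x still counts as a match.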

Training

We use the official PyTorch implementation of PWC-Net as our flow subnet. Note: the PWC-Net repo was developed with Python 2.7, PyTorch 0.2.0, and CUDA 8.0. We tested several configurations with that implementation, and the environment listed above runs this code correctly. Use the train-*.py scripts to train the whole network; we recommend train-ucf24-apex.py, which is much faster at the cost of a small drop in accuracy.

We train the network on 4 GTX 1080 Ti graphics cards with a batch size of 32.
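
To illustrate what a batch size of 32 on 4 cards means per device, here is a minimal sketch that assumes plain data parallelism via `torch.nn.DataParallel`. Whether the train-*.py scripts use this exact mechanism is an assumption, and the helper names `per_gpu_batch` and `wrap_for_multi_gpu` are hypothetical.

```python
def per_gpu_batch(total_batch, num_gpus):
    """Number of samples each GPU receives when a batch is split evenly.

    DataParallel itself tolerates uneven splits; this sketch is stricter
    so that the per-GPU share is unambiguous.
    """
    if total_batch % num_gpus != 0:
        raise ValueError("batch size should be divisible by the GPU count")
    return total_batch // num_gpus


def wrap_for_multi_gpu(model, num_gpus=4):
    """Replicate `model` over the first `num_gpus` devices (data parallelism)."""
    import torch  # imported lazily so the helper above works without PyTorch

    return torch.nn.DataParallel(model.cuda(), device_ids=list(range(num_gpus)))


print(per_gpu_batch(32, 4))  # 8 samples per GPU
```

With this setup each forward pass splits the batch along its first dimension, so every card processes 8 samples.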

Testing

Frame-level

You can use val-ucf24.py to evaluate the frame-level mAP.
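
For readers unfamiliar with the metric, the sketch below shows one common way to compute frame-level average precision for a single class: detections are ranked by score and matched to ground-truth boxes at an IoU threshold of 0.5. It is an illustration only and is not taken from val-ucf24.py. It uses non-interpolated AP, the data layout and function names are assumptions, and the script may differ in details such as interpolation.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def average_precision(detections, ground_truth, iou_thresh=0.5):
    """Non-interpolated AP for one class.

    detections:   list of (frame_id, score, box)
    ground_truth: dict mapping frame_id -> list of boxes
    """
    total_gt = sum(len(boxes) for boxes in ground_truth.values())
    if total_gt == 0:
        return 0.0
    matched = {f: [False] * len(boxes) for f, boxes in ground_truth.items()}
    tp = fp = 0
    ap = 0.0
    # Visit detections from the highest to the lowest score.
    for frame_id, _score, box in sorted(detections, key=lambda d: -d[1]):
        best_iou, best_idx = 0.0, -1
        for idx, gt in enumerate(ground_truth.get(frame_id, [])):
            overlap = iou(box, gt)
            if overlap > best_iou:
                best_iou, best_idx = overlap, idx
        if best_iou >= iou_thresh and not matched[frame_id][best_idx]:
            matched[frame_id][best_idx] = True  # each GT box is matched once
            tp += 1
            ap += tp / (tp + fp)  # precision at this new recall level
        else:
            fp += 1
    return ap / total_gt


def mean_average_precision(per_class):
    """per_class: dict mapping class name -> (detections, ground_truth)."""
    aps = [average_precision(d, g) for d, g in per_class.values()]
    return sum(aps) / len(aps) if aps else 0.0


# Toy example: two frames, one ground-truth box each, three detections.
gt = {0: [(0, 0, 10, 10)], 1: [(0, 0, 10, 10)]}
dets = [(0, 0.9, (0, 0, 10, 10)), (1, 0.8, (50, 50, 60, 60)), (1, 0.7, (0, 0, 10, 10))]
print(round(average_precision(dets, gt), 3))  # 0.833
```

In the toy example the second-ranked detection is a false positive, so the precision at the second true positive is 2/3 and the AP is (1 + 2/3) / 2.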

Video-level

The video-level evaluation code is in ./matlab-online-display. Run myI01onlineTubes.m to produce the video-level results.

References

  • [1] W. Liu, et al. SSD: Single Shot MultiBox Detector. ECCV, 2016.
  • [2] S. Saha, G. Singh, M. Sapienza, P. H. S. Torr, and F. Cuzzolin. Deep Learning for Detecting Multiple Space-Time Action Tubes in Videos. BMVC, 2016.
  • [3] G. Singh, S. Saha, M. Sapienza, P. H. S. Torr, and F. Cuzzolin. Online Real-time Multiple Spatiotemporal Action Localisation and Prediction. ICCV, 2017.
  • [4] D. Sun, X. Yang, M.-Y. Liu, and J. Kautz. PWC-Net: CNNs for Optical Flow Using Pyramid, Warping, and Cost Volume. CVPR, 2018.
  • [5] S. Liu, D. Huang, and Y. Wang. Receptive Field Block Net for Accurate and Fast Object Detection. ECCV, 2018.
  • Original SSD implementation (Caffe).
  • A huge thanks to Max deGroot and Ellis Brown for the PyTorch implementation of SSD.
  • A huge thanks to Gurkirt Singh for ROAD, the PyTorch implementation of Online Real-time Multiple Spatiotemporal Action Localisation and Prediction.
