FineDiving: A Fine-grained Dataset for Procedure-aware Action Quality Assessment

Created by Jinglin Xu*, Yongming Rao*, Xumin Yu, Guangyi Chen, Jie Zhou, Jiwen Lu

This repository contains the FineDiving dataset and the PyTorch implementation of Temporal Segmentation Attention (CVPR 2022).

[Project Page] [arXiv] (coming soon) [Dataset] (extract number: 0624)

Dataset

Lexicon

We construct a fine-grained video dataset organized by both semantic and temporal structures, where each structure contains two-level annotations.

  • For semantic structure, the action-level labels describe the action types of athletes and the step-level labels depict the sub-action types of consecutive steps in the procedure, where adjacent steps in each action procedure belong to different sub-action types. A combination of sub-action types produces an action type.

  • For temporal structure, the action-level labels locate the temporal boundaries of a complete action instance performed by an athlete. During annotation, we discard all incomplete action instances and filter out slow-motion playbacks. The step-level labels are the starting frames of consecutive steps in the action procedure.
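
The two-level structure described above can be sketched as a small data model. This is only an illustration: the class names `Step` and `ActionInstance`, the `validate` method, and the placeholder labels and frame numbers are hypothetical and are not taken from the FineDiving code or annotation files.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Step:
    sub_action_type: str  # step-level semantic label
    start_frame: int      # step-level temporal label (first frame of the step)


@dataclass
class ActionInstance:
    action_type: str      # action-level semantic label
    start_frame: int      # action-level temporal boundary (first frame)
    end_frame: int        # action-level temporal boundary (last frame)
    steps: List[Step]     # consecutive steps of the action procedure

    def validate(self) -> None:
        """Raise ValueError if the instance violates the constraints above."""
        if not self.steps:
            raise ValueError("an action procedure needs at least one step")
        if self.start_frame > self.end_frame:
            raise ValueError("start_frame must not exceed end_frame")
        for step in self.steps:
            if not self.start_frame <= step.start_frame <= self.end_frame:
                raise ValueError("step starts outside the action boundary")
        for prev, cur in zip(self.steps, self.steps[1:]):
            # Adjacent steps belong to different sub-action types.
            if prev.sub_action_type == cur.sub_action_type:
                raise ValueError("adjacent steps must differ in sub-action type")
            if prev.start_frame >= cur.start_frame:
                raise ValueError("step start frames must be strictly increasing")


# Hypothetical instance with placeholder labels and frame numbers.
example = ActionInstance(
    action_type="demo_action",
    start_frame=489,
    end_frame=592,
    steps=[Step("step_a", 489), Step("step_b", 530), Step("step_c", 575)],
)
example.validate()  # returns None for a well-formed instance
```

Keeping the checks in one method makes it easy to run them over every instance before training, if such a sanity pass is wanted.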

Annotation

Given a raw diving video, the annotator uses our lexicon to label each action and its procedure. Annotation proceeds in two stages, from coarse- to fine-grained. The coarse-grained stage labels the action type and temporal boundary of each action instance, together with its official score. The fine-grained stage labels the sub-action type of each step in the action procedure and records the starting frame of each step, using an effective Annotation Toolbox.
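
One way to relate the step starting frames recorded in the fine-grained stage to per-frame step labels is sketched below. The function name `per_frame_labels`, its arguments, and the assumption that the first step starts on the first frame of the instance are illustrative; they are not taken from the annotation files or the Annotation Toolbox.

```python
from bisect import bisect_right


def per_frame_labels(start_frame, end_frame, step_start_frames, step_labels):
    """Give every frame in [start_frame, end_frame] the label of its step.

    Assumption (hypothetical): step_start_frames is sorted and its first
    entry equals start_frame, so every frame falls inside some step.
    """
    if len(step_start_frames) != len(step_labels):
        raise ValueError("need exactly one label per step")
    if list(step_start_frames) != sorted(step_start_frames):
        raise ValueError("step start frames must be sorted")
    if not step_start_frames or step_start_frames[0] != start_frame:
        raise ValueError("first step must start at start_frame")

    labels = []
    for frame in range(start_frame, end_frame + 1):
        # Index of the last step whose starting frame is <= frame.
        step_index = bisect_right(step_start_frames, frame) - 1
        labels.append(step_labels[step_index])
    return labels


# Toy example: three steps starting at frames 10, 12 and 15.
print(per_frame_labels(10, 15, [10, 12, 15], ["s0", "s1", "s2"]))
# → ['s0', 's0', 's1', 's1', 's1', 's2']
```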

The annotation information is saved in FineDiving_coarse_annotation.pkl and FineDiving_fine-grained_annotation.pkl.

Field Name            Type    Description
action_type           string  Description of the action type.
sub-action_types      dict    Description of the sub-action type.
(x, y)                string  Instance ID.
judge_scores          list    Judge scores.
dive_score            float   Diving score of the action instance.
frames_labels         array   Step-level labels of the frames.
difficulty            float   Difficulty of the action type.
steps_transit_frames  array   Frame index of step transitions.
start_frame           int     Start frame of the action instance.
end_frame             int     End frame of the action instance.
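
A minimal sketch of reading an annotation file with Python's standard `pickle` module follows. It assumes, without confirmation from the files themselves, that each `.pkl` holds a dictionary keyed by instance ID whose values are dictionaries of the fields listed above. The helper names `load_annotations` and `summarize` are hypothetical, and the demo uses a small synthetic file instead of the real annotations.

```python
import pickle
import tempfile
from pathlib import Path


def load_annotations(path):
    """Load a pickled annotation object. Only unpickle files you trust."""
    with open(path, "rb") as f:
        return pickle.load(f)


def summarize(annotations, max_items=3):
    """Return one line per instance listing the field names it carries."""
    lines = []
    for instance_id, fields in list(annotations.items())[:max_items]:
        lines.append(f"{instance_id}: {', '.join(sorted(fields))}")
    return lines


# Synthetic stand-in for an annotation file (assumed layout, placeholder values).
demo = {
    ("demo_video", 0): {
        "action_type": "demo_action",
        "dive_score": 0.0,
        "difficulty": 0.0,
        "start_frame": 489,
        "end_frame": 592,
    }
}

with tempfile.TemporaryDirectory() as tmp:
    demo_path = Path(tmp) / "demo_annotation.pkl"
    with open(demo_path, "wb") as f:
        pickle.dump(demo, f)
    loaded = load_annotations(demo_path)

summary = summarize(loaded)
print(summary[0])
```

For the real files, the same call would be `load_annotations("FineDiving_coarse_annotation.pkl")`; printing a few entries first is a cheap way to confirm the actual layout before relying on it.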

Statistics

The FineDiving dataset consists of 3000 video samples, covering 52 action types, 29 sub-action types, and 23 difficulty degree types.
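
The counts above could be cross-checked against a loaded annotation dictionary roughly as follows. This sketch assumes each entry is a dictionary with `action_type` and `difficulty` fields; the function name `dataset_statistics` and the toy entries are hypothetical. Sub-action types are left out because the internal layout of the `sub-action_types` field is not described here.

```python
def dataset_statistics(annotations):
    """Count samples, distinct action types, and distinct difficulty degrees."""
    entries = list(annotations.values())
    return {
        "samples": len(entries),
        "action_types": len({e["action_type"] for e in entries}),
        "difficulty_degrees": len({e["difficulty"] for e in entries}),
    }


# Toy annotations with placeholder values (not real FineDiving entries).
toy = {
    ("video_a", 0): {"action_type": "type_1", "difficulty": 3.0},
    ("video_a", 1): {"action_type": "type_2", "difficulty": 3.0},
    ("video_b", 0): {"action_type": "type_1", "difficulty": 3.2},
}
stats = dataset_statistics(toy)
print(stats)
# → {'samples': 3, 'action_types': 2, 'difficulty_degrees': 2}
```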

Download

We have made the full dataset available on [Baidu Drive] (extract number: 0624).

Code for Temporal Segmentation Attention (TSA)

Requirements

  • Python 3.7.9
  • PyTorch 1.7.1
  • torchvision 0.8.2
  • timm 0.3.4
  • torch_videovision
pip install git+https://github.com/hassony2/torch_videovision

Data Preparation

video_dir = "./FINADiving"           # path to the untrimmed videos (.mp4)
base_dir = "./FINADiving_jpgs"       # path to the frames extracted from the untrimmed videos (.jpg)
save_dir = "./FINADiving_jpgs_256"   # path to the resized frames (used in our approach)
  • The data structure should be:
$DATASET_ROOT
├── FineDiving
|  ├── FINADivingWorldCup2021_Men3m_final_r1
|     ├── 0
|        ├── 00489.jpg
|        ...
|        └── 00592.jpg
|     ...
|     └── 11
|        ├── 14425.jpg
|        ...
|        └── 14542.jpg
|  ...
|  └── FullMenSynchronised10mPlatform_Tokyo2020Replays_2
|     ├── 0
|     ...
|     └── 16 
└──
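
Before training, it can help to confirm that the frames on disk follow the layout shown above. The sketch below walks a dataset directory with the standard library and lists the frames found under each video and instance folder. The function name `index_frames` is hypothetical, and the demo builds a tiny temporary tree with empty files instead of touching real data.

```python
import tempfile
from pathlib import Path


def index_frames(dataset_dir):
    """Map (video_name, instance_name) to the sorted frame file names."""
    index = {}
    video_dirs = sorted(p for p in Path(dataset_dir).iterdir() if p.is_dir())
    for video_dir in video_dirs:
        instance_dirs = sorted(p for p in video_dir.iterdir() if p.is_dir())
        for instance_dir in instance_dirs:
            frames = sorted(f.name for f in instance_dir.glob("*.jpg"))
            index[(video_dir.name, instance_dir.name)] = frames
    return index


# Build a miniature copy of the layout with empty placeholder frames.
with tempfile.TemporaryDirectory() as root:
    instance = Path(root) / "FineDiving" / "FINADivingWorldCup2021_Men3m_final_r1" / "0"
    instance.mkdir(parents=True)
    for name in ("00489.jpg", "00592.jpg"):
        (instance / name).touch()
    index = index_frames(Path(root) / "FineDiving")

for key, frames in index.items():
    print(key, len(frames), "frames")
```

An instance folder that maps to an empty list is a quick signal that frame extraction or resizing did not complete for that clip.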

Pretrained Model

The Kinetics-pretrained I3D model is downloaded from the kinetics_i3d_pytorch repository:

model_rgb.pth

Experimental Setting

FineDiving_TSA.yaml

Training and Evaluation

# train a model on FineDiving
bash train.sh TSA FineDiving 0,1

# resume the training process on FineDiving
bash train.sh TSA FineDiving 0,1 --resume

# test a trained model on FineDiving
bash test.sh TSA FineDiving 0,1 ./experiments/TSA/FineDiving/default/last.pth
# last.pth is obtained by train.sh and saved at "experiments/TSA/FineDiving/default/"

Contact: [email protected]
