vjmht-pytorch's Introduction

Video Joint Modelling Based on Hierarchical Transformer for Co-summarization (VJMHT)

Paper

Haopeng Li, Qiuhong Ke, Mingming Gong, Rui Zhang

IEEE Transactions on Pattern Analysis and Machine Intelligence

Introduction

We propose Video Joint Modelling based on Hierarchical Transformer (VJMHT) for co-summarization, which takes into consideration the semantic dependencies across videos.

VJMHT consists of two Transformer layers: the first extracts semantic representations from the individual shots of similar videos, while the second performs shot-level joint modelling across those videos to aggregate cross-video semantic information. In this way, complete cross-video high-level patterns are explicitly modelled and learned for the summarization of each individual video.
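The two-level design described above can be illustrated with a minimal PyTorch sketch. This is not the authors' implementation: the class name HierarchicalEncoder, the feature size, the number of heads, and the mean pooling of frames into a shot embedding are all assumptions made for illustration. It only shows the data flow, in which frames are encoded per shot and the shots of all videos in a group then attend to one another.

```python
import torch
import torch.nn as nn


class HierarchicalEncoder(nn.Module):
    """Illustrative two-level encoder (hypothetical, not the VJMHT code)."""

    def __init__(self, dim=64, heads=4):
        super().__init__()
        # Level 1: self-attention over the frames inside one shot.
        self.frame_level = nn.TransformerEncoder(
            nn.TransformerEncoderLayer(d_model=dim, nhead=heads, dim_feedforward=2 * dim),
            num_layers=1,
        )
        # Level 2: self-attention over the shots of every video in the group.
        self.shot_level = nn.TransformerEncoder(
            nn.TransformerEncoderLayer(d_model=dim, nhead=heads, dim_feedforward=2 * dim),
            num_layers=1,
        )

    def forward(self, videos):
        # videos: list of tensors, each shaped (num_shots, frames_per_shot, dim).
        shot_embeddings = []
        for shots in videos:
            # The encoder expects (sequence, batch, dim) by default, so frames
            # become the sequence axis and shots the batch axis.
            encoded = self.frame_level(shots.transpose(0, 1))
            # Assumption: mean pooling over frames gives one vector per shot.
            shot_embeddings.append(encoded.mean(dim=0))
        lengths = [s.size(0) for s in shot_embeddings]
        # Concatenate the shots of all videos into one sequence (batch size 1)
        # so that attention can cross video boundaries.
        joint = torch.cat(shot_embeddings, dim=0).unsqueeze(1)
        joint = self.shot_level(joint).squeeze(1)
        # Split back into one (num_shots, dim) tensor per video.
        return list(torch.split(joint, lengths, dim=0))


if __name__ == "__main__":
    model = HierarchicalEncoder().eval()
    group = [torch.randn(3, 5, 64), torch.randn(2, 5, 64)]  # two similar videos
    with torch.no_grad():
        outputs = model(group)
    print([tuple(o.shape) for o in outputs])
```

Each output row is a shot representation that has seen the shots of the other videos in the group; a scoring head for shot importance would sit on top of these rows.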

Moreover, Transformer-based video representation reconstruction is introduced to maximize the high-level similarity between the summary and the original video.

Requirements and Dependencies

  • Python=3.8.5
  • PyTorch=1.9
  • ortools=8.1.8487
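To compare an existing environment against the versions listed above, a small check along the following lines may help. It is only a sketch: the distribution names "torch" and "ortools" are assumed to be the names under which the packages are installed, and the helper functions are hypothetical and not part of this repository.

```python
import sys
from importlib import metadata

# Versions taken from the list above; the distribution names are assumptions.
REQUIRED = {"torch": "1.9", "ortools": "8.1.8487"}


def installed_version(dist_name):
    """Return the installed version string, or None if the package is missing."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def report(required=REQUIRED):
    """Map each package to (wanted, found, matches) using a simple prefix match."""
    rows = {}
    for name, wanted in required.items():
        found = installed_version(name)
        rows[name] = (wanted, found, found is not None and found.startswith(wanted))
    return rows


if __name__ == "__main__":
    print("python", sys.version.split()[0], "(this README lists 3.8.5)")
    for name, (wanted, found, ok) in report().items():
        print(f"{name}: wanted {wanted}, found {found}, {'ok' if ok else 'MISMATCH'}")
```

The prefix match is deliberately loose, so a local build suffix such as a CUDA tag after the version number still counts as a match.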

Data Preparation

Download the datasets to datasets/.

Evaluation

Download our models to results/.

Run the following command to test our models.

$ python main.py -c configs/dataset_setting.py --eval

Here, dataset_setting.py stands for one of the configuration files in configs/. The results are saved in results/DATASET_SETTING/.

Example for testing the model trained on TVSum in the canonical setting:

$ python main.py -c configs/tvsum_can.py --eval

The results are saved in results/TVSUM_CAN.
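The relationship between a configuration file and its output directory can be sketched as follows. The naming rule (upper-cased file stem under results/) is inferred from the single example above, and results_dir_for is a hypothetical helper, not a function from this repository; main.py may build the path differently.

```python
from pathlib import Path


def results_dir_for(config_path, root="results"):
    """Guess the output directory for a config file.

    Assumption: the directory is the upper-cased config file stem,
    e.g. configs/tvsum_can.py -> results/TVSUM_CAN.
    """
    return Path(root) / Path(config_path).stem.upper()


if __name__ == "__main__":
    print(results_dir_for("configs/tvsum_can.py").as_posix())
```

Such a helper can be handy for locating or cleaning up outputs in a script that evaluates several configurations in a row.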

Training

Run the following command to train the model:

$ python main.py -c configs/dataset_setting.py

Example for training the model on TVSum in the canonical setting:

$ python main.py -c configs/tvsum_can.py

The trained models and results are saved in results/TVSUM_CAN.

License and Citation

The use of this code is RESTRICTED to non-commercial research and educational purposes.

If you use this code or reference our paper in your work, please cite this publication as:

@article{li2022video,
  title={Video Joint Modelling Based on Hierarchical Transformer for Co-summarization},
  author={Li, Haopeng and Ke, Qiuhong and Gong, Mingming and Zhang, Rui},
  journal={IEEE Transactions on Pattern Analysis and Machine Intelligence},
  year={2022},
  publisher={IEEE}
}

Acknowledgement

The code is developed based on VASNet.

vjmht-pytorch's Issues

visualizations

Hi author, I'd like to know how the visualizations in your article were made, and what importance scores were used? Is there any data in the code that can be utilized, or any operations that need to be performed? Please tell me more about it, thank you!

ge_pkg_versions

Hello, I want to ask what the path name in the class means? Do you need to modify it to your own path?
dep_versions['display'] = run_command('cat /proc/driver/nvidia/version')
print(dep_versions)
dep_versions['cuda'] = 'NA'
cuda_home = '/usr/local/cuda/'
