
FunQA

Paper | Dataset | YouTube | Bilibili

Trailer video: FunQA_Trailer_with_audio.mp4

Welcome to FunQA's Codebase Repository!

This repo provides the code for evaluating your model's output (a JSON file).
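For illustration only, a minimal sketch of how such a predictions file might be assembled; the entry structure and field names below are hypothetical, so follow whatever format the evaluation scripts in ./scripts/ actually expect:

import json

# Hypothetical structure: map each FunQA question ID to the model's free-text answer.
# The real keys and fields are defined by the evaluation scripts, not by this sketch.
predictions = [
    {"id": "H2_0001", "answer": "The man slips on the ice while trying to look graceful."},
    {"id": "M3_0042", "answer": "The coin was hidden in the magician's other hand."},
]

with open("model_output.json", "w") as f:
    json.dump(predictions, f, ensure_ascii=False, indent=2)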

Introducing FunQA

The motivation for FunQA is straightforward: humans enjoy surprising videos, such as funny clips, creative performances, and visual illusions. We aim to evaluate and empower AI models with similar capabilities.

FunQA is a VideoQA dataset for evaluating and enhancing a model's video reasoning capability on counter-intuitive videos, including humorous viral videos from TikTok, creative performances from Kasou Taishou (欽ちゃん&香取慎吾の全日本仮装大賞), and magic videos from YouTube and TikTok.

We establish rigorous QA tasks designed to assess a model's capability in counter-intuitive timestamp localization, detailed video description, and reasoning around counter-intuitiveness. We also pose higher-level tasks, such as attributing a fitting and vivid title to the video and scoring the video's creativity.

In total, the FunQA benchmark consists of 312K free-text QA pairs derived from 4.3K video clips, spanning 24 hours of video. Extensive experiments with existing VideoQA models reveal significant performance gaps on FunQA videos across spatial-temporal reasoning, visual-centered reasoning, and free-text generation.

Updates

  • 16 June, 2023: 💥💥 The FunQA challenge with a $1M prize starts! At the same time, we released the evaluation code.

Todo

  1. Release the FunQA dataset and arXiv paper.
  2. Release evaluation code.
  3. Release the FunQA Extended dataset.

Table of Contents

  • 1 - FunQA Benchmark
  • 2 - Data Preparation
  • 3 - Evaluation

1 - FunQA Benchmark

1.1 - FunQA Main Tasks

FunQA comprises three subsets of surprising videos: 1) HumorQA, 2) CreativeQA, and 3) MagicQA. Each subset is associated with three common tasks: 1) counter-intuitive timestamp localization, 2) detailed video description, and 3) reasoning around counter-intuitiveness (see H1-3, C1-3, and M1-3). Furthermore, we offer higher-level tasks tailored to each video type, such as attributing a fitting and vivid title for HumorQA and CreativeQA (see H4, C4), and so on; the task codes are summarized below.
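As a quick reference, a sketch of the task codes named above as a Python dict; the descriptions are paraphrases of the text, and any higher-level tasks beyond H4 and C4 (e.g., creativity scoring) are omitted because their codes are not listed here:

# Task codes per FunQA subset, paraphrased from the benchmark description above.
FUNQA_TASKS = {
    "HumorQA": {
        "H1": "counter-intuitive timestamp localization",
        "H2": "detailed video description",
        "H3": "reasoning around counter-intuitiveness",
        "H4": "attributing a fitting and vivid title",
    },
    "CreativeQA": {
        "C1": "counter-intuitive timestamp localization",
        "C2": "detailed video description",
        "C3": "reasoning around counter-intuitiveness",
        "C4": "attributing a fitting and vivid title",
    },
    "MagicQA": {
        "M1": "counter-intuitive timestamp localization",
        "M2": "detailed video description",
        "M3": "reasoning around counter-intuitiveness",
    },
}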

1.2 - FunQA Extended Tasks

FunQA Multi-choice Dataset

The FunQA Multi-choice dataset provides training and testing data for arbitrary models. Its QA pairs are multiple-choice: each answer is a word, phrase, or short sentence, and all questions are of the description type. A hypothetical entry is sketched below.
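A minimal sketch of what such an entry could look like; the field names and content are invented for illustration, so consult the released annotation files for the real schema:

# Hypothetical multi-choice QA entry; the actual annotation schema may differ.
mc_example = {
    "video": "H_0123.mp4",
    "question": "What does the dog do after the door opens?",
    "options": ["runs away", "jumps on the couch", "barks at the camera", "falls asleep"],
    "answer": "jumps on the couch",   # a word, phrase, or short sentence
    "question_type": "description",
}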

FunQA Dialog Dataset

Most current LLMs operate in dialogue form. To match their input format, we produced the FunQA Dialog dataset, using GPT-3.5 to convert QA pairs into recursive dialogues with added context. A sketch of such a conversion is shown below.
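A minimal sketch of how one such conversion could be done with the openai Python client; the model name, prompt, and helper function are illustrative assumptions, not the pipeline actually used to build FunQA Dialog:

from openai import OpenAI  # assumes the openai Python package (>=1.0) is installed

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def qa_to_dialog(question: str, answer: str) -> str:
    """Ask GPT-3.5 to rewrite one QA pair as a short multi-turn dialogue."""
    prompt = (
        "Rewrite the following question-answer pair about a video as a natural "
        "multi-turn dialogue between a user and an assistant, adding brief context.\n"
        f"Q: {question}\nA: {answer}"
    )
    resp = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.choices[0].message.content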

2 - Data Preparation

Please download all the videos and annotation files from here.

For the FunQA dataset, there are four zip files:

  • train.zip, val.zip, test.zip: Videos for training, validation, and testing.
  • annotation_with_id.zip: Annotation files for FunQA Base Dataset.

For the FunQA Extended dataset: coming soon.
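A small sketch for unpacking the four archives with Python's zipfile module; the data/ target directory is an assumption, so place the files wherever your evaluation scripts expect them:

import zipfile
from pathlib import Path

# Hypothetical target layout; adjust to whatever your scripts expect.
data_dir = Path("data")
data_dir.mkdir(exist_ok=True)

for name in ["train.zip", "val.zip", "test.zip", "annotation_with_id.zip"]:
    with zipfile.ZipFile(name) as zf:
        zf.extractall(data_dir)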

3 - Evaluation

cd FunQA
conda create -n funqa python=3.10
conda activate funqa
pip install -r requirements.txt

# install bleurt
git clone https://github.com/google-research/bleurt.git
cd bleurt
pip install .

# download the recommended checkpoint for bleurt
wget https://storage.googleapis.com/bleurt-oss-21/BLEURT-20.zip
unzip BLEURT-20.zip
cd ..

Please move the bleurt/bleurt directory to bleurt/, then edit and run ./scripts/run_classic_eval.sh and ./scripts/run_gpt4_eval.sh for evaluation.
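For reference, the BLEURT-20 checkpoint downloaded above can also be exercised directly from Python via the BleurtScorer API; a minimal sketch (the provided scripts already wrap this for FunQA evaluation, and the example strings are made up):

from bleurt import score  # provided by the bleurt package installed above

# Point at the unzipped BLEURT-20 checkpoint directory.
scorer = score.BleurtScorer("BLEURT-20")

references = ["The man slips on the ice while showing off."]
candidates = ["A man falls on the ice after trying to show off."]

scores = scorer.score(references=references, candidates=candidates)
print(scores)  # one BLEURT score per (reference, candidate) pair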

Acknowledgement

This study is supported by the Ministry of Education, Singapore, under its MOE AcRF Tier 2 (MOE-T2EP20221-0012), NTU NAP, and under the RIE2020 Industry Alignment Fund – Industry Collaboration Projects (IAF-ICP) Funding Initiative, as well as cash and in-kind contributions from the industry partner(s).

If you're using FunQA in your research or applications, please cite using this BibTeX:

    @article{xie2023funqa,
      title={FunQA: Towards Surprising Video Comprehension},
      author={Xie, Binzhu and Zhang, Sicheng and Zhou, Zitang and Li, Bo and Zhang, Yuanhan and Hessel, Jack and Yang, Jingkang and Liu, Ziwei},
      journal={GitHub repository},
      year={2023},
      howpublished={\url{https://github.com/Jingkang50/FunQA}}
    }

License

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.

We look forward to your feedback; please raise any issues or questions here.
