This repo is the official PyTorch implementation for the paper Do You Really Mean That? Content Driven Audio-Visual Deepfake Dataset and Multimodal Method for Temporal Forgery Localization (Best Award, DICTA 2022).
To use the LAV-DF dataset, you should agree to the terms and conditions.
Download link: Google Drive.
Method | AP@0.5 | AP@0.75 | AP@0.95 | AR@100 | AR@50 | AR@20 | AR@10 |
---|---|---|---|---|---|---|---|
BA-TFD | 79.15 | 38.57 | 00.24 | 67.03 | 64.18 | 60.89 | 58.51 |
Please note these results are slightly better than the ones reported in the paper, because this repository uses better hyperparameters.
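For context, AP@t is average precision where a prediction counts as correct when its temporal IoU with a ground-truth forged segment is at least t (0.5, 0.75, or 0.95 above), and AR@N is average recall over the top-N proposals. Below is a minimal sketch of the temporal IoU computation these metrics rest on; the function is illustrative and not part of this repo.

```python
def temporal_iou(pred, gt):
    """Temporal IoU between two (start, end) segments in seconds."""
    inter = max(0.0, min(pred[1], gt[1]) - max(pred[0], gt[0]))
    union = (pred[1] - pred[0]) + (gt[1] - gt[0]) - inter
    return inter / union if union > 0 else 0.0

# A prediction is a true positive at threshold t if it overlaps an
# unmatched ground-truth segment with IoU >= t.
print(temporal_iou((1.0, 3.0), (2.0, 4.0)))  # ~0.333
```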
The main dependency versions are:
- Python >= 3.7, < 3.11
- PyTorch >= 1.9.0
- pytorch_lightning == 1.7.*
Run the following command to install the required packages.
pip install -r requirements.txt
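Afterwards, you can print the installed versions to confirm they fall in the supported ranges (a quick sanity check, not part of the repo):

```python
import torch
import pytorch_lightning as pl

# Expect PyTorch >= 1.9.0 and a pytorch_lightning version in the 1.7.x series.
print(torch.__version__, pl.__version__)
```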
Train the model with the default hyperparameters on the LAV-DF dataset.
python train.py \
--config ./config/default.toml \
--data_root <DATASET_PATH> \
--batch_size 4 --num_workers 8 --gpus 1 --precision 16
The checkpoints will be saved in the ckpt directory, and the TensorBoard logs will be saved in the lightning_logs directory.
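To monitor training progress, you can point TensorBoard at that log directory:

tensorboard --logdir lightning_logs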
Run the following command to evaluate the model with a checkpoint saved in the ckpt directory. Alternatively, you can download a pretrained model from the GitHub Release.
python evaluate.py \
--config ./config/default.toml \
--data_root <DATASET_PATH> \
--checkpoint <CHECKPOINT_PATH>
The script will generate temporal inference results in the output directory and print the AP and AR scores to the console.
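The AR@100/50/20/10 columns reported above follow the standard proposal-recall recipe: keep the top-N scoring proposals per video, greedily match them to ground-truth segments, and average the recall over a range of IoU thresholds. A self-contained, illustrative sketch of one such recall computation (not the repo's evaluation code):

```python
def tiou(a, b):
    """Temporal IoU of two (start, end) segments."""
    inter = max(0.0, min(a[1], b[1]) - max(a[0], b[0]))
    return inter / ((a[1] - a[0]) + (b[1] - b[0]) - inter) if inter else 0.0

def recall_at_n(proposals, gts, n, iou_thr=0.5):
    """Fraction of ground-truth segments hit by the top-n scored proposals.

    proposals: list of (start, end, score); gts: list of (start, end).
    Greedy one-to-one matching, as is standard for AR@N-style metrics.
    """
    top = sorted(proposals, key=lambda p: p[2], reverse=True)[:n]
    matched = set()
    for s, e, _ in top:
        for i, g in enumerate(gts):
            if i not in matched and tiou((s, e), g) >= iou_thr:
                matched.add(i)
                break
    return len(matched) / len(gts) if gts else 0.0
```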
If you find this work useful in your research, please cite the paper:
@inproceedings{cai2022you,
title={Do You Really Mean That? Content Driven Audio-Visual Deepfake Dataset and Multimodal Method for Temporal Forgery Localization},
author={Cai, Zhixi and Stefanov, Kalin and Dhall, Abhinav and Hayat, Munawar},
booktitle={2022 International Conference on Digital Image Computing: Techniques and Applications (DICTA)},
year={2022},
doi={10.1109/DICTA56598.2022.10034605},
pages={1--10},
address={Sydney, Australia},
}
Some code related to the boundary matching mechanism is borrowed from JJBOY/BMN-Boundary-Matching-Network.