peachypie98 / rivagan

RivaGAN: Robust Invisible Video Watermarking with Attention

video-watermarking watermarking rivagan deep-learning

rivagan's Introduction

RivaGAN PyTorch (Unofficial)

😯 Before We Start...

This repository was created to help people who have trouble running the official repository from DAI-Lab, which has not been updated for several years. I have also optimized the official code to run faster while preserving overall performance. Finally, I have tested it on Windows 11 with Python 3.11 and PyTorch 2.0.1.

😀 Prerequisites

Install PyTorch, NumPy, OpenCV, Pandas, and ArgParse (argparse is already bundled with Python 3)
Install GitBash from https://git-scm.com/
Install wget from https://gnuwin32.sourceforge.net/packages/wget.htm
Install Torch DCT using pip install torch_dct
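
A quick way to confirm the prerequisites are in place before training is to check that each package can be located. This is a minimal sketch, not part of the repository: the import names (`torch`, `numpy`, `cv2`, `pandas`, `torch_dct`) are the usual ones for these packages, and `missing_packages` is a hypothetical helper.

```python
import importlib.util

# Display name -> import name (assumed to be the usual import names).
REQUIRED = {
    "PyTorch": "torch",
    "NumPy": "numpy",
    "OpenCV": "cv2",
    "Pandas": "pandas",
    "Torch DCT": "torch_dct",
}


def missing_packages(required=REQUIRED):
    """Return the display names of packages that cannot be found."""
    return [
        label
        for label, module in required.items()
        if importlib.util.find_spec(module) is None
    ]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Missing: " + ", ".join(missing))
    else:
        print("All prerequisites found")
```

The check only locates the packages and does not import them, so it runs quickly even when PyTorch is installed.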

🤪 Let's Get Started!

Clone this repository
Open GitBash Terminal and Download Hollywood2 Training Dataset
Acquiring the dataset may require several hours, depending upon the speed of your internet connection
```
cd data
bash download.sh
```
Train RivaGAN Model
The hyperparameter settings align with the official specifications and are currently configured to their default values
```
python train.py 
python train.py --epochs 200 --lr 0.001 --data_dim 64 
```
Default Hyperparameters Details:
- --epochs: 300
- --train_batch: 12
- --lr: 0.0005
- --num_workers: 16
- --data_dim: 32
- --use_critic: True
- --use_adversary: True
- --use_noise: True
- --use_bit_inverse: True
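
The defaults listed above could be declared with `argparse` roughly as follows. This is a sketch under assumptions, not the repository's actual `train.py`: the `str2bool` helper and the way boolean flags are parsed are hypothetical, and only the flag names and default values come from the list.

```python
import argparse


def str2bool(value):
    """Hypothetical helper: turn 'True'/'False' style strings into booleans."""
    lowered = value.lower()
    if lowered in ("true", "yes", "1"):
        return True
    if lowered in ("false", "no", "0"):
        return False
    raise argparse.ArgumentTypeError("expected a boolean, got %r" % value)


def build_parser():
    parser = argparse.ArgumentParser(description="RivaGAN training options (sketch)")
    parser.add_argument("--epochs", type=int, default=300)
    parser.add_argument("--train_batch", type=int, default=12)
    parser.add_argument("--lr", type=float, default=0.0005)
    parser.add_argument("--num_workers", type=int, default=16)
    parser.add_argument("--data_dim", type=int, default=32)
    # All four switches default to True in the list above.
    for flag in ("use_critic", "use_adversary", "use_noise", "use_bit_inverse"):
        parser.add_argument("--" + flag, type=str2bool, default=True)
    return parser


# Same overrides as the second training command above.
args = build_parser().parse_args(["--epochs", "200", "--lr", "0.001", "--data_dim", "64"])
print(args)
```

Options that are not passed on the command line, such as `--train_batch`, keep their defaults.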
Inference RivaGAN Model
After training, the goal is to embed a data watermark into a video and then extract it from the encoded footage. Inference produces two outputs: an output_log.txt file that records the data extracted from each frame of the video, and a watermarked video that carries the data
```
python inference.py --model_weight your_weight_path/model.pt
python inference.py --model_weight your_weight_path/model.pt --random_data No --your_data "1100 1001 0011 0000 1111 0101 1100 0011" --fps 30
```
Default Hyperparameters Details:
- --data_dim: 32
  - The data dimensions must correspond with the dimensions used during the model training
- --model_weight: None
  - Must be added
- --random_data: Yes
- --your_data: None
  - Set --random_data to No to use your own data
- --video_location: ./data/hollywood2/val/actioncliptest00002.avi
- --fps: 25
  - Watermarked video output FPS
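
The `--your_data` example above contains 32 bits, which matches the default `--data_dim`. The sketch below validates such a string before inference and compares embedded bits against extracted bits. It assumes the spaces are only visual separators; how `inference.py` actually parses the string is not shown here, and `parse_bits` and `bit_accuracy` are hypothetical helper names.

```python
def parse_bits(your_data, data_dim=32):
    """Convert a string such as '1100 1001' into a list of 0/1 integers."""
    compact = "".join(your_data.split())  # drop every whitespace character
    if set(compact) - {"0", "1"}:
        raise ValueError("watermark may contain only 0, 1 and whitespace")
    if len(compact) != data_dim:
        raise ValueError("expected %d bits, got %d" % (data_dim, len(compact)))
    return [int(ch) for ch in compact]


def bit_accuracy(embedded, extracted):
    """Fraction of positions where the extracted bit equals the embedded bit."""
    if len(embedded) != len(extracted):
        raise ValueError("bit sequences must have the same length")
    matches = sum(a == b for a, b in zip(embedded, extracted))
    return matches / len(embedded)


bits = parse_bits("1100 1001 0011 0000 1111 0101 1100 0011")
print(len(bits), bit_accuracy(bits, bits))
```

A length mismatch raises an error early, which mirrors the requirement that the data dimension at inference match the one used during training.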

Changelog

2024-05-23

The make_pair function has been reverted to its original code because of instability during training

2024-04-18

Added a pre-trained RivaGAN model, trained with a data dimension of 32 bits

2023-01-23

Optimized the encoding and decoding code (3.5x to 4x speed increase)

rivagan's Issues

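Loss never converges

I ran the model, downloaded the data, and trained for 300 epochs. The loss stays at 1.3 and the accuracy is also low, only about 0.6. How should I tune things to fix this? I changed the learning rate, but the loss still does not go down.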

Saving the model with state_dict

Hi, thanks for the amazing work.

I was wondering if you could help me in modifying the code so that the model is saved by using its state_dict, instead of saving the whole model, as PyTorch documentation suggests (https://pytorch.org/tutorials/beginner/saving_loading_models.html).

Since the RivaGAN class does not directly extend nn.Module, I thought I could simply change it from class RivaGAN(object) to class RivaGAN(nn.Module)

and then change

torch.save(self, os.path.join(log_dir, "model.pt"))

to

torch.save(self.state_dict(), os.path.join(log_dir, "model.pt"))

but maybe I'm missing something, and I can't tell whether this is enough for the model to work properly or whether the change could alter the model's behavior when loaded (since the RivaGAN class is composed of multiple sub-modules).

Do you have any suggestions?
Thanks
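
One possible alternative to changing the base class is to keep the wrapper a plain object and save one state_dict per sub-module. The sketch below illustrates that idea only: `TinyWatermarker`, its `encoder` and `decoder` attributes, and the `checkpoint` / `from_checkpoint` / `roundtrip` helpers are hypothetical stand-ins, not the repository's RivaGAN class.

```python
import io

import torch
import torch.nn as nn


class TinyWatermarker(object):
    """Hypothetical plain-object wrapper that owns several nn.Module parts."""

    def __init__(self, data_dim=32):
        self.data_dim = data_dim
        # Placeholder sub-modules; real networks would be far larger.
        self.encoder = nn.Linear(data_dim, 8)
        self.decoder = nn.Linear(8, data_dim)

    def checkpoint(self):
        # One state_dict per sub-module, plus the constructor arguments.
        return {
            "data_dim": self.data_dim,
            "encoder": self.encoder.state_dict(),
            "decoder": self.decoder.state_dict(),
        }

    @classmethod
    def from_checkpoint(cls, ckpt):
        # Rebuild the architecture first, then load the saved weights into it.
        model = cls(data_dim=ckpt["data_dim"])
        model.encoder.load_state_dict(ckpt["encoder"])
        model.decoder.load_state_dict(ckpt["decoder"])
        return model


def roundtrip(model):
    """Save to an in-memory buffer and load back (a file path works the same way)."""
    buffer = io.BytesIO()
    torch.save(model.checkpoint(), buffer)
    buffer.seek(0)
    return TinyWatermarker.from_checkpoint(torch.load(buffer))


original = TinyWatermarker()
restored = roundtrip(original)
print(torch.equal(original.encoder.weight, restored.encoder.weight))
```

Loading weights this way leaves the forward computation unchanged, provided the rebuilt sub-modules have the same architecture as the saved ones. Optimizer state is not covered by this sketch and would need to be saved separately to resume training.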

Checkpoints

Thank you for sharing the code! Do you have trained checkpoints or is there a way to use their official checkpoint, which ends with .onnx? Thanks!

Robustness to horizontal flipping

Hi, thank you for this amazing work of refactoring the original source code of RivaGAN.

I was wondering if you have any suggestions on how to make the model more robust against horizontal flipping attacks.

I trained the model, but after some experiments I noticed that flipping the video leads to very poor accuracy in watermark extraction.

I tried to implement a simple noise layer like this:

class HorizontalFlipping(nn.Module):
    """
    Simulates a transformation where the input is mirrored horizontally.

    Input: (N, 3, L, H, W)
    Output: (N, 3, L, H, W)
    """

    def __init__(self):
        super(HorizontalFlipping, self).__init__()

    def forward(self, frames):
        #return frames.flip(dims=(-1,))
        return frames.flip([4]) #flips along the width dimension

and then I added it to the training and validation code, in the same way as the other noise layers already implemented (Crop, Compression, Scale).

The problem is that when I run a new training with these modifications, the various accuracies get worse (including the validation crop accuracy, scale accuracy, ...).

Do you have any suggestions?
Thanks a lot
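
As a sanity check on the HorizontalFlipping layer above, the sketch below confirms on a small dummy tensor that, for a 5-D input of shape (N, 3, L, H, W), `flip([4])` and the commented-out `flip(dims=(-1,))` act on the same width axis, and that flipping twice restores the input. The tensor sizes are arbitrary assumptions chosen for illustration.

```python
import torch

# Dummy batch with shape (N, 3, L, H, W) = (2, 3, 2, 2, 4).
frames = torch.arange(2 * 3 * 2 * 2 * 4, dtype=torch.float32).reshape(2, 3, 2, 2, 4)

flipped = frames.flip([4])                    # index form used in the layer
flipped_negative = frames.flip(dims=(-1,))    # commented-out alternative

# Both forms address the last (width) axis of a 5-D tensor.
same_axis = torch.equal(flipped, flipped_negative)

# A horizontal flip is its own inverse.
restores_input = torch.equal(flipped.flip([4]), frames)

print(same_axis, restores_input)
print(frames[0, 0, 0, 0].tolist(), flipped[0, 0, 0, 0].tolist())
```

This only verifies that the layer performs the intended transformation; it does not explain the drop in the other accuracies reported above.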