genconvit's People

Contributors

erprogs

genconvit's Issues

What is the version of CUDA?

I use CUDA 12.2, but it doesn't work. I then tried torch 2.2.1 (built for CUDA 12.2).

python train.py --d sample_train_data --m vae -e 5 -t y

This raises: "RuntimeError: GET was unable to find an engine to execute this computation"
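This error typically points at a mismatch between the installed torch build and the local CUDA/cuDNN stack rather than at the training script itself. A minimal, hedged diagnostic that surfaces the version triple to compare against the driver (e.g. via nvidia-smi), and degrades gracefully when torch is absent:

```python
# Hedged diagnostic sketch: print the torch build, its bundled CUDA version,
# and the cuDNN version it sees. This does not fix the RuntimeError; it only
# surfaces the versions that must agree with the installed driver.
try:
    import torch
    info = (torch.__version__, torch.version.cuda, torch.backends.cudnn.version())
except ImportError:
    info = None  # torch not installed in this environment

print(info)
```

If the bundled CUDA version printed here disagrees with what `nvidia-smi` reports as supported, reinstalling a torch wheel built for the matching CUDA release is the usual remedy.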

Processed Dataset

Hi, I am very interested in the 'Deepfake Video Detection Using Generative Convolutional Vision Transformer' research. Could you please provide the processed dataset? Thank you very much!

The trained model cannot be used for prediction

I used the training script to get my VAE model: python train.py --d sample_train_data --m vae -e 5 -t y

Then I changed the path of the VAE model in pred-func.py, but running the prediction script gave me an error: Missing key(s) in state_dict and Unexpected key(s). The structure of the two models is inconsistent.
I also found that the size of my saved VAE does not match the pre-trained model you provided: my trained VAE is about twice as large, roughly 5 GB.
I sincerely hope you can help me figure out how to run inference with my own trained model.
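A checkpoint that is roughly double the expected size often bundles extra training state (optimizer, epoch counters) or was saved through a wrapper that prefixes every key, which also produces Missing/Unexpected key(s) on load. A stdlib-only sketch of the kind of key normalization that can resolve this; the layout, the "state_dict" nesting, and the "module." prefix are illustrative assumptions, not taken from this repo:

```python
# Hypothetical checkpoint: weights nested under "state_dict" with a
# DataParallel-style "module." prefix, plus optimizer state that roughly
# doubles the file size. The exact layout here is an assumption.
ckpt = {
    "state_dict": {
        "module.backbone.weight": [1.0],
        "module.backbone.bias": [0.0],
    },
    "optimizer": {"step": 100},  # extra training state inflating the file
}

# Unwrap the nested dict and strip the wrapper prefix before calling
# model.load_state_dict(state_dict).
state_dict = {
    k.removeprefix("module."): v for k, v in ckpt["state_dict"].items()
}
print(sorted(state_dict))  # ['backbone.bias', 'backbone.weight']
```

If your checkpoint looks like this, loading `ckpt["state_dict"]` (with prefixes stripped) instead of the whole file usually restores compatibility with the prediction script.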

Questions about Labels

Hello,
Thank you for the great end-to-end approach.

  1. May I ask about the labels in this code? (I am aware of the output, but I'd like to clarify.)

Is the first value of torch.sigmoid(model(df).squeeze()) the fake probability?
For example, if the sigmoid output is tensor([[0.0468, 0.9539]], device='cuda:0'), then 0.0468 is the chance of the sample being fake, correct?

def pred_vid(df, model):
    with torch.no_grad():
        # single forward pass; reuse the logits rather than calling model(df) twice
        y_pred = model(df)
        return max_prediction_value(torch.sigmoid(y_pred.squeeze()))
  2. Also, the code is a bit counterintuitive: why do we have to XOR?
     Since the prediction is 0, the label then becomes 1, which is FAKE. Could you please explain?
def real_or_fake(prediction):
    return {0: "REAL", 1: "FAKE"}[prediction ^ 1]
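To make the question concrete, here is a hedged, self-contained re-implementation of the indexing and XOR logic, under the assumption that the two outputs are ordered [fake, real]; max_prediction_value is sketched to mimic the repo's helper, not copied from it:

```python
# Assumption: the two sigmoid outputs are ordered [fake, real].
def max_prediction_value(probs):
    # Return (argmax index, max value), mimicking the repo's helper.
    idx = max(range(len(probs)), key=lambda i: probs[i])
    return idx, probs[idx]

def real_or_fake(prediction):
    # XOR with 1 flips the argmax index: if the fake slot (index 0) wins,
    # 0 ^ 1 = 1 -> "FAKE"; if the real slot (index 1) wins, 1 ^ 1 = 0 -> "REAL".
    return {0: "REAL", 1: "FAKE"}[prediction ^ 1]

pred, conf = max_prediction_value([0.0468, 0.9539])
print(real_or_fake(pred), conf)  # index 1 wins -> "REAL" 0.9539
```

So for the tensor in the question, the argmax lands on 0.9539 (index 1), and the XOR maps that to "REAL"; the flip exists only because the dictionary's keys are ordered opposite to the output vector.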

Mismatch of the downloaded ckpt and the architecture

Hi, thank you for open-sourcing your project. I have downloaded the provided checkpoints for both ed and vae and placed them inside the weight folder. However, I get the following error:

RuntimeError: Error(s) in loading state_dict for GenConViTED:

Missing key(s) in state_dict: "backbone.patch_embed.backbone.layers.3.downsample.norm.weight", "backbone.patch_embed.backbone.layers.3.downsample.norm.bias", "backbone.patch_embed.backbone.layers.3.downsample.reduction.weight", "backbone.patch_embed.backbone.head.fc.weight", "backbone.patch_embed.backbone.head.fc.bias", "embedder.layers.3.downsample.norm.weight", "embedder.layers.3.downsample.norm.bias", "embedder.layers.3.downsample.reduction.weight", "embedder.head.fc.weight", "embedder.head.fc.bias".

Unexpected key(s) in state_dict: "backbone.patch_embed.backbone.layers.0.downsample.norm.weight", "backbone.patch_embed.backbone.layers.0.downsample.norm.bias", "backbone.patch_embed.backbone.layers.0.downsample.reduction.weight", "backbone.patch_embed.backbone.layers.0.blocks.0.attn.relative_position_index", "backbone.patch_embed.backbone.layers.0.blocks.1.attn_mask", "backbone.patch_embed.backbone.layers.0.blocks.1.attn.relative_position_index", "backbone.patch_embed.backbone.layers.1.blocks.0.attn.relative_position_index", "backbone.patch_embed.backbone.layers.1.blocks.1.attn_mask", "backbone.patch_embed.backbone.layers.1.blocks.1.attn.relative_position_index", "backbone.patch_embed.backbone.layers.2.blocks.0.attn.relative_position_index", "backbone.patch_embed.backbone.layers.2.blocks.1.attn_mask", "backbone.patch_embed.backbone.layers.2.blocks.1.attn.relative_position_index", "backbone.patch_embed.backbone.layers.2.blocks.2.attn.relative_position_index", "backbone.patch_embed.backbone.layers.2.blocks.3.attn_mask", "backbone.patch_embed.backbone.layers.2.blocks.3.attn.relative_position_index", "backbone.patch_embed.backbone.layers.2.blocks.4.attn.relative_position_index", "backbone.patch_embed.backbone.layers.2.blocks.5.attn_mask", "backbone.patch_embed.backbone.layers.2.blocks.5.attn.relative_position_index", "backbone.patch_embed.backbone.layers.3.blocks.0.attn.relative_position_index", "backbone.patch_embed.backbone.layers.3.blocks.1.attn.relative_position_index", "backbone.patch_embed.backbone.head.weight", "backbone.patch_embed.backbone.head.bias", "embedder.layers.0.downsample.norm.weight", "embedder.layers.0.downsample.norm.bias", "embedder.layers.0.downsample.reduction.weight", "embedder.layers.0.blocks.0.attn.relative_position_index", "embedder.layers.0.blocks.1.attn_mask", "embedder.layers.0.blocks.1.attn.relative_position_index", "embedder.layers.1.blocks.0.attn.relative_position_index", "embedder.layers.1.blocks.1.attn_mask", "embedder.layers.1.blocks.1.attn.relative_position_index", "embedder.layers.2.blocks.0.attn.relative_position_index", "embedder.layers.2.blocks.1.attn_mask", "embedder.layers.2.blocks.1.attn.relative_position_index", "embedder.layers.2.blocks.2.attn.relative_position_index", "embedder.layers.2.blocks.3.attn_mask", "embedder.layers.2.blocks.3.attn.relative_position_index", "embedder.layers.2.blocks.4.attn.relative_position_index", "embedder.layers.2.blocks.5.attn_mask", "embedder.layers.2.blocks.5.attn.relative_position_index", "embedder.layers.3.blocks.0.attn.relative_position_index", "embedder.layers.3.blocks.1.attn.relative_position_index", "embedder.head.weight", "embedder.head.bias".

size mismatch for backbone.patch_embed.backbone.layers.1.downsample.norm.weight: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([384]).
size mismatch for backbone.patch_embed.backbone.layers.1.downsample.norm.bias: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([384]).
size mismatch for backbone.patch_embed.backbone.layers.1.downsample.reduction.weight: copying a param with shape torch.Size([384, 768]) from checkpoint, the shape in current model is torch.Size([192, 384]).
size mismatch for backbone.patch_embed.backbone.layers.2.downsample.norm.weight: copying a param with shape torch.Size([1536]) from checkpoint, the shape in current model is torch.Size([768]).
size mismatch for backbone.patch_embed.backbone.layers.2.downsample.norm.bias: copying a param with shape torch.Size([1536]) from checkpoint, the shape in current model is torch.Size([768]).
size mismatch for backbone.patch_embed.backbone.layers.2.downsample.reduction.weight: copying a param with shape torch.Size([768, 1536]) from checkpoint, the shape in current model is torch.Size([384, 768]).
size mismatch for embedder.layers.1.downsample.norm.weight: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([384]).
size mismatch for embedder.layers.1.downsample.norm.bias: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([384]).
size mismatch for embedder.layers.1.downsample.reduction.weight: copying a param with shape torch.Size([384, 768]) from checkpoint, the shape in current model is torch.Size([192, 384]).
size mismatch for embedder.layers.2.downsample.norm.weight: copying a param with shape torch.Size([1536]) from checkpoint, the shape in current model is torch.Size([768]).
size mismatch for embedder.layers.2.downsample.norm.bias: copying a param with shape torch.Size([1536]) from checkpoint, the shape in current model is torch.Size([768]).
size mismatch for embedder.layers.2.downsample.reduction.weight: copying a param with shape torch.Size([768, 1536]) from checkpoint, the shape in current model is torch.Size([384, 768]).
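When debugging a mismatch like this, it can help to diff the checkpoint's keys against the freshly constructed model's keys directly rather than reading the exception text. (The head vs head.fc renaming and the extra relative_position_index buffers in the trace suggest a timm version difference between training and loading, though that is an inference, not something the thread confirms.) A stdlib sketch with toy placeholder keys; in practice the two sets come from the checkpoint dict and model.state_dict():

```python
# Toy placeholder key sets; in practice use set(checkpoint.keys()) and
# set(model.state_dict().keys()).
ckpt_keys = {"embedder.head.weight", "embedder.head.bias"}
model_keys = {"embedder.head.fc.weight", "embedder.head.fc.bias"}

missing = sorted(model_keys - ckpt_keys)     # in model, absent from checkpoint
unexpected = sorted(ckpt_keys - model_keys)  # in checkpoint, absent from model
print("missing:", missing)
print("unexpected:", unexpected)
```

If each missing key pairs off with an unexpected key that differs only by a rename (here head -> head.fc), aligning the library version used for loading with the one used for training is usually the fix.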
