rshaojimmy / multimodal-deepfake
[TPAMI 2024 & CVPR 2023] PyTorch code for DGM4: Detecting and Grounding Multi-Modal Media Manipulation and beyond
License: Other
Hi, I was wondering how the region for FA is obtained. In the paper's Section 3.2 (Face Attribute Manipulation), the authors mention, "we first predict ... using GAN-based methods". Can I understand this as follows: you first apply an expression detector to the face to get the expression region (e.g., a "smiling mouth"), then employ StyleCLIP to modify the expression of the face, and finally replace the original expression region with the modified one?
How many face attribute manipulations does the dataset provide (e.g., "smile to angry")? What expression detector is used? And is the replacement step as simple as copy and paste?
Thanks!
Hello, sorry for the interruption. I'm encountering torch.multiprocessing.spawn.ProcessExitedException: process 1 terminated with signal SIGABRT while running the train.sh script. May I ask for your insights?
Hello, when using your dataset and model I cannot reach the results reported in your paper. Could you share the exact training parameters you used?
Hello,
I am trying to train, but I always run out of memory. Could you tell me how much memory you used and how long training took?
Hello, the link to the DGM4 dataset shows an error: 'This link has been removed.' T^T Is there any other way to access the dataset?
Thank you for the new insights into multimodal deepfake detection! When running test.py, it reports an error: TypeError: add_code_sample_docstrings() got an unexpected keyword argument 'tokenizer_class'.
Nice job! Will the well-trained checkpoints be made publicly available?
Dear,
I am very interested in your task, but the dataset link is now invalid. Could you please resend the link, or send it to my email? My email address is [email protected]. Thanks!
Hi, I have downloaded your DGM4 dataset directly via the link, but after checking, I only found images in the 'manipulation' and 'origin' folders, which differs from your dataset samples.
Dear Sir,
I'm trying to reproduce your work and use the pretrained best checkpoint for transfer learning, but I'm struggling to cross-check the config parameters back and forth among the config/*.yaml files, the *.sh shell scripts, and the parser.add_argument() calls in the Python scripts.
I think aggregating all of these configs into the YAML files would be more readable and more convenient for others who want to use your checkpoint.
Appreciate it.
Thanks for your awesome work!
I was wondering: when comparing against the deepfake detection and sequence tagging methods, do you retrain those models on uni-modal data? If so, are the multi-modal modules such as the Multi-Modal Aggregator removed, or is the input of the other modality replaced with all-zero data?
When I run sh train.sh, I get an error that train_ours.py cannot be found. How can I fix it?
For the text projection text_feat in L_mac, you use BERT's last hidden layer rather than the cls token. Was this choice based on experiments comparing performance?
I only changed some of the distributed-training code, since I have a single GPU, and halved the training batch size. Nothing else was modified. Could you give me some advice?
Hi, nice work! But I always get an error when downloading the dataset directly from Microsoft 365. Is there another way to download the data? Thanks a lot.
Hi,
I have read your paper and code and was deeply impressed. But I had some difficulty reproducing the visualizations: how do you produce them? How can I reproduce the figures shown in the Visualization Results section?
thanks
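A minimal sketch of how such attention overlays are often produced (the function, array shapes, and names here are illustrative assumptions, not the repository's actual visualization code):

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # headless backend so the figure can be saved to file
import matplotlib.pyplot as plt

def overlay_attention(image, attn, out_path="overlay.png", alpha=0.4):
    """Upsample a patch-level attention grid to image resolution and
    overlay it as a heatmap. `image` is (H, W, 3) with values in [0, 1];
    `attn` is a small (h, w) grid of scores whose shape divides (H, W).
    Illustrative sketch only, not the authors' code."""
    H, W = image.shape[:2]
    h, w = attn.shape
    # nearest-neighbour upsampling of the attention grid via a Kronecker product
    heat = np.kron(attn, np.ones((H // h, W // w)))
    # normalize scores into [0, 1] for a stable colormap
    heat = (heat - heat.min()) / (heat.max() - heat.min() + 1e-8)
    plt.imshow(image)
    plt.imshow(heat, cmap="jet", alpha=alpha)
    plt.axis("off")
    plt.savefig(out_path, bbox_inches="tight", pad_inches=0)
    plt.close()
    return heat
```

For a ViT-style encoder on 224x224 inputs, `attn` would typically be the 14x14 grid of patch attention scores from a chosen layer.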
Hi everyone!
I tried to download the best-model checkpoint, but it fails every time!
Could you please update the link or provide another source?
I am trying out the training code you have provided. I am not using a distributed GPU setup; here is the config I am using.
*I changed the argparse code so the distributed argument defaults to False.
But I have encountered this error:
Start training
Traceback (most recent call last):
File "train.py", line 561, in <module>
mp.spawn(main_worker, nprocs=ngpus_per_node, args=(args, config))
File "anaconda3/envs/DGM4/lib/python3.8/site-packages/torch/multiprocessing/spawn.py", line 230, in spawn
return start_processes(fn, args, nprocs, join, daemon, start_method='spawn')
File "anaconda3/envs/DGM4/lib/python3.8/site-packages/torch/multiprocessing/spawn.py", line 188, in start_processes
while not context.join():
File "anaconda3/envs/DGM4/lib/python3.8/site-packages/torch/multiprocessing/spawn.py", line 150, in join
raise ProcessRaisedException(msg, error_index, failed_process.pid)
torch.multiprocessing.spawn.ProcessRaisedException:
-- Process 0 terminated with the following error:
Traceback (most recent call last):
File "anaconda3/envs/DGM4/lib/python3.8/site-packages/torch/multiprocessing/spawn.py", line 59, in _wrap
fn(i, *args)
File "Github/MultiModal-DeepFake-root/MultiModal-DeepFake/train.py", line 416, in main_worker
train_stats = train(args, model, train_loader, optimizer, tokenizer, epoch, warmup_steps, device, lr_scheduler, config, summary_writer)
File "Github/MultiModal-DeepFake-root/MultiModal-DeepFake/train.py", line 141, in train
loss_MAC, loss_BIC, loss_bbox, loss_giou, loss_TMG, loss_MLC = model(image, label, text_input, fake_image_box, fake_token_pos, alpha = alpha)
File "/anaconda3/envs/DGM4/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
return forward_call(*input, **kwargs)
File "Github/MultiModal-DeepFake-root/MultiModal-DeepFake/models/HAMMER.py", line 211, in forward
self._dequeue_and_enqueue(image_feat_m, text_feat_m)
File "anaconda3/envs/DGM4/lib/python3.8/site-packages/torch/autograd/grad_mode.py", line 28, in decorate_context
return func(*args, **kwargs)
File "Github/MultiModal-DeepFake-root/MultiModal-DeepFake/models/HAMMER.py", line 363, in _dequeue_and_enqueue
image_feats = concat_all_gather(image_feat)
File "anaconda3/envs/DGM4/lib/python3.8/site-packages/torch/autograd/grad_mode.py", line 28, in decorate_context
return func(*args, **kwargs)
File "Github/MultiModal-DeepFake-root/MultiModal-DeepFake/models/HAMMER.py", line 386, in concat_all_gather
for _ in range(torch.distributed.get_world_size())]
File "anaconda3/envs/DGM4/lib/python3.8/site-packages/torch/distributed/distributed_c10d.py", line 845, in get_world_size
return _get_group_size(group)
File "anaconda3/envs/DGM4/lib/python3.8/site-packages/torch/distributed/distributed_c10d.py", line 306, in _get_group_size
default_pg = _get_default_group()
File "anaconda3/envs/DGM4/lib/python3.8/site-packages/torch/distributed/distributed_c10d.py", line 410, in _get_default_group
raise RuntimeError(
RuntimeError: Default process group has not been initialized, please make sure to call init_process_group.
Do correct me if I'm wrong, but I have already set distributed to False, so why do the errors still reference the distributed code paths?
Here is the change I made to argparse:
parser.add_argument('--distributed', default=False, action='store_true')
The download link of datasets has been removed. Could you share a new one? Thank you!
I have read your paper and code and was deeply impressed. But I had some difficulty reproducing the visualizations: how do you produce them? I tried the grad-cam module but found it difficult to integrate into this project.
Can we test our own videos with it? And how should test videos be pre-processed to produce the metadata.json file?
I want more details on text_swap. I have found that some samples labeled 'orig' and 'text_swap' have identical text in the dataset. Can you provide a more detailed explanation of text_swap and its fake_text_pos?
here is an example:
{
"id": 683133,
"image": "DGM4/origin/guardian/0385/488.jpg",
"text": "Making a song and dance David Hasselhoff will perform a oneman show at the Edinburgh festival fringe",
"fake_cls": "orig",
"fake_image_box": [],
"fake_text_pos": [],
"mtcnn_boxes": [...]
},
{
"id": 896499,
"image": "DGM4/origin/guardian/0114/251.jpg",
"text": "Making a song and dance David Hasselhoff will perform a oneman show at the Edinburgh festival fringe",
"fake_cls": "text_swap",
"fake_image_box": [],
"fake_text_pos": [0, 7, 8, 9, 10, 11, 13, 14, 15, 16],
"mtcnn_boxes": [...]
}
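If it helps, fake_text_pos appears to index whitespace-separated tokens of the caption (my reading of the samples above, not documented behavior), so you can inspect which words are flagged as swapped:

```python
# Sample taken from the text_swap entry above
text = ("Making a song and dance David Hasselhoff will perform "
        "a oneman show at the Edinburgh festival fringe")
fake_text_pos = [0, 7, 8, 9, 10, 11, 13, 14, 15, 16]

tokens = text.split()
flagged = [tokens[i] for i in fake_text_pos]
print(flagged)
# → ['Making', 'will', 'perform', 'a', 'oneman', 'show',
#    'the', 'Edinburgh', 'festival', 'fringe']
```

That the flagged positions are populated even though the text matches the 'orig' entry is exactly the puzzle raised above; one possible explanation is that the position labels were generated from the swap operation itself, which in this case produced an identical sentence.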
Could you please re-share the link to the dataset? It's not working now.
I would like to test HAMMER's performance on the Celeb-DF dataset, but I find that if the text modality is removed, the whole model is essentially just an image encoder followed by a classifier. Is my understanding correct? If so, is it fair to say that testing HAMMER on a uni-modal dataset is not very meaningful?
How do I run train.sh? When using VS Code it shows "command not found", and if I run it in Git Bash I get:
$ sh train.sh
Traceback (most recent call last):
  File "train.py", line 18, in <module>
    import torch.nn as nn
  File "C:\Users\athen\anaconda3\envs\DGM4\lib\site-packages\torch\nn\__init__.py", line 1, in <module>
    from .modules import *  # noqa: F403
  File "C:\Users\athen\anaconda3\envs\DGM4\lib\site-packages\torch\nn\modules\__init__.py", line 1, in <module>
    from .module import Module
  File "C:\Users\athen\anaconda3\envs\DGM4\lib\site-packages\torch\nn\modules\module.py", line 7, in <module>
    from ..parameter import Parameter
  File "C:\Users\athen\anaconda3\envs\DGM4\lib\site-packages\torch\nn\parameter.py", line 2, in <module>
    from torch._C import _disabled_torch_function_impl
ModuleNotFoundError: No module named 'torch._C'
I'm running this on a normal CPU (an Intel i5 processor).
Hi, thanks for your wonderful work!
But when I ran train.sh, I encountered an error. I checked the type of text.input_ids before and inside model.forward(): before forward() the type is correct, i.e., torch.LongTensor, but inside forward() it changes to list. Can you help me find where the mistake is? Thanks very much!
Hi! May I ask for the evaluation results of CLIP and ViLT on the DGM4 val subset, like Table 2 in your paper? Thanks. 😊