
styleheat's Issues

Code releasing

Hi, I was fascinated by your project; thank you for this nice work.
I want to run and test your code. When will you release it?
I look forward to the code release.

Question about training the video calibrator.

I'm trying to train the video calibrator with my own dataset.

I created the LMDB-format dataset following your procedure and ran:

bash bash/train_video_styleheat.sh

Then I got an error like this.

PIL.UnidentifiedImageError: cannot identify image file <_io.BytesIO object at 0x7fc8f138de50>

It seems the LMDB dataset should contain "align" data, judging from this code.

Could you help me with this align data?
Maybe the script used to generate the LMDB is different from the one you used?
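
For anyone hitting the same PIL.UnidentifiedImageError, a quick sanity check is to look at what the LMDB actually contains before the dataloader tries to decode it. A minimal sketch; the key layout shown is whatever your build script wrote, not a scheme confirmed by the repo:

import lmdb

# Print the first 20 keys and value sizes so you can see whether any
# "align" entries were written and whether the blobs look like images.
env = lmdb.open('path/to/your_dataset.lmdb', readonly=True, lock=False)
with env.begin() as txn:
    for i, (key, value) in enumerate(txn.cursor()):
        print(key.decode('utf-8', errors='replace'), len(value), 'bytes')
        if i >= 19:
            break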

Training Data Pre-processing

Hi, I am trying to prepare my data according to the HDTF pre-processing for training. However, the HDTF-Preprocessing link you mentioned in the README.md doesn't seem to exist. Could you please provide an alternate link, or clear instructions on how to perform the data pre-processing and the LMDB-format organisation required for end-to-end training?

Code releasing +1

Thanks for your nice work; it is exciting. I am fascinated by your project and have been doing related work recently. When will you release the code?

UserWarning: Default upsampling behavior when mode=bilinear is changed to align_corners=False since 0.4.0.

I can't run the demo because of these two warnings, which I've pasted below. Please help me.

!python inference.py \
  --config configs/inference.yaml \
  --video_source=./docs/demo/videos/RD_Radio34_003_512.mp4 \
  --output_dir=./docs/demo/output --if_extract


model [FaceReconModel] was created
loading the model from checkpoints/Deep3D/epoch_20.pth
0% 0/1 [00:00<?, ?it/s]
tcmalloc: large alloc 3170893824 bytes == 0x563fa3aa8000 @ 0x7ff1f1974b6b 0x7ff1f1994379 0x7ff19feb074e 0x7ff19feb27b6 0x7ff1deb17e43 0x7ff1de49247a 0x7ff1de7ecb3a 0x7ff1de814733 0x7ff1de99aa44 0x7ff1dead75de 0x7ff1de5308a6 0x7ff1de531a60 0x7ff1de7eea39 0x7ff1de06d249 0x7ff1de987a18 0x7ff1de8937d5 0x7ff1de53334b 0x7ff1dea23708 0x7ff1de06d249 0x7ff1de987a18 0x7ff1de893925 0x7ff1de529009 0x7ff1dea2e348 0x7ff1de06d249 0x7ff1de987a18 0x7ff1de893685 0x7ff1dfef470b 0x7ff1de06d249 0x7ff1de987a18 0x7ff1de893685 0x7ff1ee5aebc6
/usr/local/lib/python3.7/site-packages/torch/nn/functional.py:3063: UserWarning: Default upsampling behavior when mode=bilinear is changed to align_corners=False since 0.4.0. Please specify align_corners=True if the old behavior is desired. See the documentation of nn.Upsample for details.
"See the documentation of nn.Upsample for details.".format(mode))
/usr/local/lib/python3.7/site-packages/torch/nn/functional.py:3385: UserWarning: Default grid_sample and affine_grid behavior has changed to align_corners=False since 1.3.0. Please specify align_corners=True if the old behavior is desired. See the documentation of grid_sample for details.
warnings.warn("Default grid_sample and affine_grid behavior has changed "
tcmalloc: large alloc 3170893824 bytes == 0x56412b2aa000 @ 0x7ff1f1974b6b 0x7ff1f1994379 0x7ff19feb074e 0x7ff19feb27b6 0x7ff1deb17e43 0x7ff1de49247a 0x7ff1de7ecb3a 0x7ff1de814733 0x7ff1de99aa44 0x7ff1dead75de 0x7ff1de5308a6 0x7ff1de531a60 0x7ff1de7eea39 0x7ff1de06d249 0x7ff1de987a18 0x7ff1de8937d5 0x7ff1de53334b 0x7ff1dea23708 0x7ff1de06d249 0x7ff1de987a18 0x7ff1de893925 0x7ff1dfe679c7 0x7ff1de06d249 0x7ff1de987a18 0x7ff1de893925 0x7ff1ee6af50e 0x563e4fe5f114 0x563e4fe5f231 0x563e4fec3a5d 0x563e4fe5e68b 0x563e4febefd6 ^C
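
For what it's worth, the two UserWarning lines here are informational notices from PyTorch, not errors, and the trailing ^C shows the run was interrupted rather than crashed; the tcmalloc lines report a ~3 GiB allocation, so the real problem is likely memory or speed. If you only want a cleaner log, a generic (non-repo) way to silence the two warnings:

import warnings

# Suppress only the two align_corners notices; this changes logging,
# not model behavior.
warnings.filterwarnings('ignore', message='Default upsampling behavior')
warnings.filterwarnings('ignore', message='Default grid_sample and affine_grid behavior')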

Cross-Identity Reenactment

The identity in the generated video differs from that of the input image.
Is this an inherent limitation? How can it be fixed?
Thanks

Process freeze!

Hi,

Thanks for your nice work. After installing StyleHEAT, I run the command:
CUDA_VISIBLE_DEVICES=0 python inference.py --config configs/inference.yaml --video_source=./docs/demo/videos/RD_Radio34_003_512.mp4 --output_dir=./docs/demo/output --if_extract

The process freezes as shown in the following picture:
[screenshot]

How to solve this problem?

processing time per frame?

I'm wondering what processing time per frame the system typically achieves. Does it take minutes per frame or seconds, and on what kind of GPU? Any info would be great.
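
For anyone who wants to measure this themselves, here is a generic per-frame timing sketch for a PyTorch model; the model and frames names are placeholders, not the repo's API:

import time
import torch

def seconds_per_frame(model, frames):
    with torch.no_grad():
        model(frames[0])          # warm-up so one-time setup isn't counted
        torch.cuda.synchronize()  # GPU work is async; flush the queue first
        start = time.time()
        for frame in frames:
            model(frame)
        torch.cuda.synchronize()  # wait for all queued kernels to finish
    return (time.time() - start) / len(frames)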

Performance Issue: Slow read_csv() Function with pandas Version 1.3.4 for CSV Files

Issue Description: Hello. I have discovered a performance regression in the read_csv function of pandas 1.3.4 when handling CSV files with a large number of columns. It increases the loading time from just a few seconds under the previous version 1.2.5 to several minutes, an almost 60x difference. I found related discussions on GitHub, including #44106 and #44192. In this repo, third_part/Deep3DFaceRecon_pytorch/models/arcface_torch/eval_ijbc.py, third_part/Deep3DFaceRecon_pytorch/models/arcface_torch/onnx_ijbc.py and third_part/Deep3DFaceRecon_pytorch/models/arcface_torch/utils/plot.py all use the affected API.

Steps to Reproduce:

I have created a small reproducible example to better illustrate this issue.

# v1.3.4
import os
import pandas
import numpy
import timeit

def generate_sample():
    if not os.path.exists("test_small.csv.gz"):
        nb_col = 100000
        nb_row = 5
        feature_list = {'sample': ['s_' + str(i+1) for i in range(nb_row)]}
        for i in range(nb_col):
            feature_list.update({'feature_' + str(i+1): list(numpy.random.uniform(low=0, high=10, size=nb_row))})
        df = pandas.DataFrame(feature_list)
        df.to_csv("test_small.csv.gz", index=False, float_format="%.6f")

def load_csv_file():
    col_names = pandas.read_csv("test_small.csv.gz", low_memory=False, nrows=1).columns
    types_dict = {col: numpy.float32 for col in col_names}
    types_dict.update({'sample': str})
    feature_df = pandas.read_csv("test_small.csv.gz", index_col="sample", na_filter=False, dtype=types_dict, low_memory=False)
    print("loaded dataframe shape:", feature_df.shape)

generate_sample()
timeit.timeit(load_csv_file, number=1)

# results
loaded dataframe shape: (5, 100000)
120.37690759263933
# v1.3.5 (same script as above)
# results
loaded dataframe shape: (5, 100000)
2.8567268839105964

Suggestion

I would recommend upgrading pandas to a version >= 1.3.5, or exploring other ways to optimize CSV loading. Any other workarounds or solutions would be greatly appreciated. Thank you!
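
A minimal way to apply the suggested fix, using the version bound from the report above:

pip install "pandas>=1.3.5"

Or pin it as pandas>=1.3.5 in requirements.txt so fresh environments don't regress.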

BFM path issue

Thank you for this amazing work.

I seem to be having an issue with the BFM library, even though I've unzipped BFM.zip and moved it to /BFM (under the root of the file system). I've also tried placing it under the StyleHEAT folder, but the issue is the same. How do I change the path to this file in the options?

Thank you

When running the self-inference:
python inference.py \
  --config configs/inference.yaml \
  --video_source=./docs/demo/videos/RD_Radio34_003_512.mp4 \
  --output_dir=./docs/demo/output --if_extract

I get the error:
Same-id testing
Load pre-trained e4e Encoder from checkpoints/Encoder_e4e.pth done.
Load pre-trained hfgi encoder from checkpoints/hfgi.pth done.
Load pre-trained StyleGAN2 from checkpoints/StyleGAN_e4e.pth done.
Stage: inference
Load pre-trained StyleHEAT [net_G_ema] from checkpoints/StyleHEAT_visual.pt done
----------------- Options ---------------
add_image: True
bfm_folder: /apdcephfs/share_1290939/feiiyin/TH/PIRender_bak/Deep3DFaceRecon_pytorch/BFM [default: BFM]
bfm_model: BFM_model_front.mat
camera_d: 10.0
center: 112.0
checkpoints_dir: /apdcephfs/share_1290939/feiiyin/TH/PIRender_bak/Deep3DFaceRecon_pytorch/checkpoints [default: ./checkpoints]
dataset_mode: None
ddp_port: 12355
display_per_batch: True
epoch: 20 [default: latest]
eval_batch_nums: inf
focal: 1015.0
gpu_ids: 0
img_folder: temp [default: examples]
init_path: checkpoints/init_model/resnet50-0676ba61.pth
isTrain: False [default: None]
model: facerecon
name: model_name [default: face_recon]
net_recon: resnet50
phase: test
suffix:
use_ddp: False [default: True]
use_last_fc: False
verbose: False
vis_batch_nums: 1
world_size: 1
z_far: 15.0
z_near: 5.0
----------------- End -------------------
Transfer BFM09 to BFM_model_front......
Traceback (most recent call last):
File "/usr/local/lib/python3.7/dist-packages/scipy/io/matlab/mio.py", line 39, in _open_file
return open(file_like, mode), True
FileNotFoundError: [Errno 2] No such file or directory: '/apdcephfs/share_1290939/feiiyin/TH/PIRender_bak/Deep3DFaceRecon_pytorch/BFM/01_MorphableModel.mat'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "inference.py", line 219, in
main()
File "inference.py", line 199, in main
dataset = inference_util.build_inference_dataset(args, opt)
File "/content/StyleHEAT/utils/inference_util.py", line 195, in build_inference_dataset
model_3dmm = Extract3dmm()
File "/content/StyleHEAT/utils/video_preprocess/extract_3dmm.py", line 27, in init
self.model = create_model(opt)
File "/content/StyleHEAT/third_part/Deep3DFaceRecon_pytorch/models/init.py", line 66, in create_model
instance = model(opt)
File "/content/StyleHEAT/third_part/Deep3DFaceRecon_pytorch/models/facerecon_model.py", line 94, in init
is_train=self.isTrain, default_name=opt.bfm_model
File "/content/StyleHEAT/third_part/Deep3DFaceRecon_pytorch/models/bfm.py", line 40, in init
transferBFM09(bfm_folder)
File "/content/StyleHEAT/third_part/Deep3DFaceRecon_pytorch/util/load_mats.py", line 34, in transferBFM09
original_BFM = loadmat(osp.join(bfm_folder, '01_MorphableModel.mat'))
File "/usr/local/lib/python3.7/dist-packages/scipy/io/matlab/mio.py", line 216, in loadmat
with _open_file_context(file_name, appendmat) as f:
File "/usr/lib/python3.7/contextlib.py", line 112, in enter
return next(self.gen)
File "/usr/local/lib/python3.7/dist-packages/scipy/io/matlab/mio.py", line 19, in _open_file_context
f, opened = _open_file(file_like, appendmat, mode)
File "/usr/local/lib/python3.7/dist-packages/scipy/io/matlab/mio.py", line 45, in _open_file
return open(file_like, mode), True
FileNotFoundError: [Errno 2] No such file or directory: '/apdcephfs/share_1290939/feiiyin/TH/PIRender_bak/Deep3DFaceRecon_pytorch/BFM/01_MorphableModel.mat'
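
Until there is an official answer: the options dump above shows bfm_folder and checkpoints_dir still pointing at the author's machine, which is exactly the path in the FileNotFoundError. A hedged sketch of a workaround, assuming the Deep3D options object opt is built in utils/video_preprocess/extract_3dmm.py before create_model(opt) is called (file, attribute, and function names are taken from the traceback and options dump above, not verified against the repo):

# Hypothetical patch: redirect the Deep3D paths to local folders
# before the model is created.
opt.bfm_folder = './checkpoints/BFM'     # must contain 01_MorphableModel.mat
opt.checkpoints_dir = './checkpoints'    # where Deep3D/epoch_20.pth lives
self.model = create_model(opt)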

"Killed" Error

Hi, I tried to run this repository, but I just get "Killed" in my terminal.

Output:

[screenshot]

Environment

GPU : RTX 3050
NVIDIA Driver : 510.108.03
Cuda : 11.6

You can find the packages in my environment here.

Dataset preprocess

Hi, would you share the processed dataset mentioned in your paper or the preprocessing code? Thanks.

Some inconsistencies in the code

I really appreciate your great work. I may have found two inconsistencies in the code while trying to retrain the video warper, but I'm not certain, and I hope you can check them.

  1. https://github.com/FeiiYin/StyleHEAT/blob/bad7f124a74028ee4f425428388bb1e350a5119e/trainers/video_warper_trainer.py#L88
    There is no key 'final_image' in any network forward's return results.
  2. https://github.com/FeiiYin/StyleHEAT/blob/bad7f124a74028ee4f425428388bb1e350a5119e/trainers/video_warper_trainer.py#L87
    The arguments passed do not match those defined in VideoWarper::forward.

Ram issue

I tried:
python inference.py --config configs/inference.yaml --video_source=./docs/demo/videos/RD_Radio34_003_512.mp4 --image_source=./docs/demo/images/100.jpg --cross_id --output_dir=./docs/demo/output --if_extract
and ran out of memory, since I only have a 4 GB GPU.

I tried a shorter video and my PC slowed down for 30 minutes, staying at 0% progress.
What are my options apart from buying a better GPU?
Can I use the CPU, for instance? (See the sketch below.)

It could also be slowed down by extracting the 3DMM params, so as a newbie, how do I run the TODO.sh?
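
On the CPU question: the tracebacks elsewhere on this page show inference.py calling .cuda() on the generator directly, so a CPU run would need a device switch. A generic PyTorch sketch, not the repo's actual code, and expect it to be orders of magnitude slower than a GPU for a StyleGAN-sized model:

import torch

# Fall back to CPU when no CUDA device is available.
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
generator = generator.to(device)  # instead of generator.cuda()
# Every input tensor must be moved the same way, e.g. frame = frame.to(device).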

Facing issue on inference

Hi, I am trying to run the demo with your pretrained and provided files. However, upon executing the command:

"python inference.py --config configs/inference.yaml --video_source=./docs/demo/videos/RD_Radio34_003_512.mp4 --image_source=./docs/demo/images/100.jpg --cross_id --output_dir=./docs/demo/output"

I am facing the following error:

Stage: inference
Load pre-trained StyleHEAT [net_G_ema] from checkpoints/StyleHEAT_visual.pt done
0%| | 0/1 [00:05<?, ?it/s]
Traceback (most recent call last):
File "inference.py", line 219, in
main()
File "inference.py", line 202, in main
data = dataset.load_next_video()
File "D:\rt_faceSync\styleheat\StyleHEAT\data\inference_dataset.py", line 174, in load_next_video
video_data = self.data_preprocess(video_path, image_path)
File "D:\rt_faceSync\styleheat\StyleHEAT\data\inference_dataset.py", line 118, in data_preprocess
source_3dmm = self.model_3dmm.get_3dmm([src_image_pil_256]

Audio-driven animation

Thanks for releasing the code. It would be very helpful if you could share code/instructions to generate audio-driven animation. Running the inference code with the enable_audio argument results in an error, as there is no argument to pass an audio filename.

How to train the whole framework?

Hi, thanks for your nice work.
I notice that the README says we can train the whole network by running bash bash/train_video_styleheat.sh, but I could not find this file. Also, in utils/trainer.py, a config file (maybe video_styleheat_trainer.yaml?) is needed to build the model and the trainer. Can you release the training code or the config file for training?

About your .idea files

Your remote address is saved in this directory, and PyCharm will import this setting by default.

Audio warper checkpoint

Could you also release the pretrained checkpoint for the audio-driven model (e.g. the audio_warper checkpoint)?
I can fill in the missing code for loading and preprocessing the audio, but without the checkpoint it's hard to reproduce satisfactory results.
I would be very grateful if you would kindly share the pretrained weights for the audio warper!

Or, if you have no plans to share the audio warper checkpoint, could I at least know that?

Successful inference in Colab?

Has anyone successfully performed inference in Colab? If you don't mind, would you be willing to share your notebook with me? I've been hitting quite a few errors, including the ninja issue, and many say that error is caused by OOM problems.

TypeError: __init__() takes 3 positional arguments but 7 were given

I ran python inference.py --config configs/inference.yaml --audio_path=audio/hama_8.wav --image_source=image/justin.jpeg --cross_id --if_extract --output_dir=./docs/demo/output --inversion_option=optimize to generate talking faces with driving audio.

But the terminal raises the following error:

Cross-id testing
Load pre-trained e4e Encoder from checkpoints/Encoder_e4e.pth done.
Load pre-trained hfgi encoder from checkpoints/hfgi.pth done.
Load pre-trained StyleGAN2 from checkpoints/StyleGAN_e4e.pth done.
Stage: inference
Load pre-trained StyleHEAT [net_G_ema] from checkpoints/StyleHEAT_visual.pt done
----------------- Options ---------------
                add_image: True                          
               bfm_folder: checkpoints/BFM                      [default: BFM]
                bfm_model: BFM_model_front.mat           
                 camera_d: 10.0                          
                   center: 112.0                         
          checkpoints_dir: checkpoints                          [default: ./checkpoints]
             dataset_mode: None                          
                 ddp_port: 12355                         
        display_per_batch: True                          
                    epoch: 20                                   [default: latest]
          eval_batch_nums: inf                           
                    focal: 1015.0                        
                  gpu_ids: 0                             
               img_folder: temp                                 [default: examples]
                init_path: checkpoints/init_model/resnet50-0676ba61.pth
                  isTrain: False                                [default: None]
                    model: facerecon                     
                     name: Deep3D                               [default: face_recon]
                net_recon: resnet50                      
                    phase: test                          
                   suffix:                               
                  use_ddp: False                                [default: True]
              use_last_fc: False                         
                  verbose: False                         
           vis_batch_nums: 1                             
               world_size: 1                             
                    z_far: 15.0                          
                   z_near: 5.0                           
----------------- End -------------------
model [FaceReconModel] was created
loading the model from checkpoints/Deep3D/epoch_20.pth
  0%|                                                                                                                                     | 0/1 [00:15<?, ?it/s]
Traceback (most recent call last):
  File "inference.py", line 340, in <module>
    main()
  File "inference.py", line 327, in main
    audio_reenactment(generator, data, args.audio_path)
  File "inference.py", line 112, in audio_reenactment
    wav2lip_checkpoint, device)
TypeError: __init__() takes 3 positional arguments but 7 were given

Tracing back the error messages, I realized that line 111 of inference.py passes 6 arguments to the Audio2Coeff class, while the constructor of Audio2Coeff (line 27 in test_audio2coeff.py, path: ./third_part/SadTalker/src) takes only 3.

[screenshots of both code locations]

Besides the inconsistent number of arguments, I don't understand why the error message says '7 were given' rather than '6 were given'.

One more thing I noticed: the constructor doesn't even take 3 arguments; it actually takes only two, excluding the self argument. Any explanation?

I'll be so glad if you help me with this.
Thank you for providing such a great work!
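
On the '7 were given' puzzle specifically: calls to a bound method in Python count self as one of the positional arguments, which also explains why a constructor with two explicit parameters is reported as taking 3. A minimal standalone demonstration:

class Audio2CoeffDemo:
    # Two explicit parameters; counting `self`, that is "3 positional arguments".
    def __init__(self, config, device):
        pass

# Six explicit arguments make 7 including `self`:
Audio2CoeffDemo(1, 2, 3, 4, 5, 6)
# TypeError: __init__() takes 3 positional arguments but 7 were given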

The output video of the inference does not have sound

I can't hear anything in the generated output video after running inference as below. The video itself seems fine; only the sound is missing.

python inference.py \
 --config configs/inference.yaml \
 --video_source=./docs/demo/videos/RD_Radio34_003_512.mp4 \
 --output_dir=./docs/demo/output --if_extract

Is it only me who gets this issue? If anyone has had the same problem, please comment below. Thanks!

AttributeError: _2D

(StyleHEAT2) C:\Users\USER\Desktop\aria\StyleHEAT-styleheat>python inference.py --config configs/inference.yaml --video_source=./docs/demo/videos/RD_Radio34_003_512.mp4 --image_source=./docs/demo/images/100.jpg --cross_id --if_extract --output_dir=./docs/demo/output
Traceback (most recent call last):
File "inference.py", line 5, in <module>
import utils.inference_util as inference_util
File "C:\Users\USER\Desktop\aria\StyleHEAT-styleheat\utils\inference_util.py", line 10, in <module>
from data.inference_dataset import TempVideoDataset, ImageDataset
File "C:\Users\USER\Desktop\aria\StyleHEAT-styleheat\data\inference_dataset.py", line 5, in <module>
from utils.video_preprocess.extract_landmark import get_landmark
File "C:\Users\USER\Desktop\aria\StyleHEAT-styleheat\utils\video_preprocess\extract_landmark.py", line 6, in <module>
detector = face_alignment.FaceAlignment(face_alignment.LandmarksType._2D)
File "C:\Users\USER\miniconda3\envs\StyleHEAT2\lib\enum.py", line 354, in __getattr__
raise AttributeError(name) from None
AttributeError: _2D

hello dears

Who can solve this issue?
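
One likely cause, offered as an assumption rather than a confirmed fix: newer releases of the face_alignment package renamed the LandmarksType enum members (the underscore-prefixed names such as _2D were replaced by names like TWO_D), so code written against the old API raises this AttributeError. A version-tolerant sketch for utils/video_preprocess/extract_landmark.py:

import face_alignment

# Support both old (_2D) and new (TWO_D) enum names across versions.
try:
    landmarks_type = face_alignment.LandmarksType.TWO_D  # newer releases
except AttributeError:
    landmarks_type = face_alignment.LandmarksType._2D    # older releases
detector = face_alignment.FaceAlignment(landmarks_type)

Pinning face_alignment to the version the repo was developed against would avoid the shim entirely.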

Generation results look stylized?

Hi,
I'm running the cross-id reenactment scripts with the provided demo images (pic 1), but the results I'm getting (pic 2) look suspiciously stylized. Is this simply the expected output of StyleGAN, or am I doing something wrong?
[screenshots: pic 1 (input demo images), pic 2 (generated result)]

Thanks!

Error while trying inference "Cross-Identity Reenactment with a single image and a video."

I am running

python inference.py \
  --config configs/inference.yaml \
  --video_source=./docs/demo/videos/RD_Radio34_003_512.mp4 \
  --image_source=./docs/demo/images/100.jpg \
  --cross_id \
  --output_dir=./docs/demo/output

and get the following error. Is there any way to solve it?
Load pre-trained StyleHEAT [net_G_ema] from checkpoint/StyleHEAT_visual.pt done
0%| | 0/1 [00:05<?, ?it/s]
Traceback (most recent call last):
File "inference.py", line 219, in
main()
File "inference.py", line 202, in main
data = dataset.load_next_video()
File "/home/jupyter-terentevvs/StyleHEAT/data/inference_dataset.py", line 174, in load_next_video
video_data = self.data_preprocess(video_path, image_path)
File "/home/jupyter-terentevvs/StyleHEAT/data/inference_dataset.py", line 118, in data_preprocess
source_3dmm = self.model_3dmm.get_3dmm([src_image_pil_256], lm_np)
AttributeError: 'NoneType' object has no attribute 'get_3dmm'
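
A pattern worth noting, as an observation rather than a confirmed diagnosis: the commands on this page that hit this 'NoneType' error (here and in the "can't run demo" report below) omit --if_extract, while the working examples include it. If the 3DMM extractor is only constructed when extraction is requested, model_3dmm would stay None here. A hypothetical guard, with names taken from the traceback above:

# Hypothetical check in data/inference_dataset.py, near the failing line:
if self.model_3dmm is None:
    raise RuntimeError(
        'model_3dmm was never initialized; rerun inference.py with '
        '--if_extract or supply pre-extracted 3DMM parameters.'
    )
source_3dmm = self.model_3dmm.get_3dmm([src_image_pil_256], lm_np)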

A question about torch version

Hello.
I followed the installation instructions,
pip install torch==1.7.1+cu110 torchvision==0.8.2+cu110 -f https://download.pytorch.org/whl/torch_stable.html
to install the environment, but it raises
AttributeError: module 'torch.jit' has no attribute '_script_if_tracing'
What may be the reason? My CUDA version is 11.4, which should be compatible with cu110.
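
This AttributeError usually indicates a torch/torchvision version mismatch rather than a CUDA problem: torch.jit._script_if_tracing exists from torch 1.6 onward and torchvision 0.8.x relies on it, so the pinned torch 1.7.1 probably did not end up installed. Verify what is actually in the environment:

python -c "import torch, torchvision; print(torch.__version__, torchvision.__version__)"

If the printed torch version is older than 1.6, reinstall with the exact pins from the instructions.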

can't reproduce the result

I downloaded your file "videos.zip" and watched the Same-Identity Reenactment results; they are good.

But when I use your code like:
python inference.py \
  --config configs/inference.yaml \
  --video_source=checkpoints/videos/RD_Radio34_003.mp4 \
  --output_dir=output --if_extract

or
python inference.py \
  --config configs/inference.yaml \
  --video_source=checkpoints/videos/RD_Radio34_003.mp4 \
  --output_dir=output --if_extract --inversion_option=optimize

the result isn't the same as your file "RD_Radio34_003_512.mp4":
the output resolution is 1024, and the face doesn't look as good, as shown below.

No such file or directory: '/StyleHEAT_result/train_video_styleheat/epoch_00010_iteration_000023000_checkpoint.pt'

D:\anaconda\python.exe D:/work/vgan/StyleHEAT-main/inference.py
Same-id testing
Load pre-trained e4e Encoder from checkpoints/Encoder_e4e.pth done.
Load pre-trained hfgi encoder from checkpoints/hfgi.pth done.
Traceback (most recent call last):
File "D:/work/vgan/StyleHEAT-main/inference.py", line 223, in
main()
File "D:/work/vgan/StyleHEAT-main/inference.py", line 202, in main
generator = StyleHEAT(opt.model, PRETRAINED_MODELS_PATH).cuda()
File "D:\work\vgan\StyleHEAT-main\models\styleheat\styleheat.py", line 36, in init
self.load_checkpoint(opt, path_dic)
File "D:\work\vgan\StyleHEAT-main\models\styleheat\styleheat.py", line 59, in load_checkpoint
ckpt = torch.load(path, map_location='cpu')['net_G_ema']
File "D:\anaconda\lib\site-packages\torch\serialization.py", line 699, in load
with _open_file_like(f, 'rb') as opened_file:
File "D:\anaconda\lib\site-packages\torch\serialization.py", line 231, in _open_file_like
return _open_file(name_or_buffer, mode)
File "D:\anaconda\lib\site-packages\torch\serialization.py", line 212, in init
super(_open_file, self).init(open(name, mode))
FileNotFoundError: [Errno 2] No such file or directory: '/StyleHEAT_result/train_video_styleheat/epoch_00010_iteration_000023000_checkpoint.pt'
Load pre-trained StyleGAN2 from checkpoints/StyleGAN_e4e.pth done.
Stage: inference

Doubt on data/audio_dataset.py

I guess the AudioDataset class is used in Audio-Driven Motion Generation. According to the paper, a proxy input image is paired with the target image, but there is no hint of a proxy input image in the LMDB. Is the proxy input image generated during training? Is the source_align image found in AudioDataset used in audio-driven training, and if so, what is it useful for here? Looking forward to your reply.
https://github.com/FeiiYin/StyleHEAT/blob/bad7f124a74028ee4f425428388bb1e350a5119e/data/audio_dataset.py#L184

How much GPU memory is required at least?

Thank you for your excellent work.
Does anyone know how much GPU memory is required at minimum?
My Nvidia RTX card has 8 GB and can't run it: CUDA out of memory... sad.

about requirements.txt

Why is requirements.txt so long and so messy, and what does '@ file...' mean?

Can you release a simple and effective version?
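
On the '@ file...' part: that is pip's "direct reference" syntax (PEP 508). pip freeze emits it when a package was installed from a local archive or build directory, which is common in conda-managed environments, instead of from PyPI. The line below is illustrative, not taken from the repo:

numpy @ file:///tmp/build_env/numpy-1.21.2.tar.gz
# The file:// path only exists on the author's machine; replace such lines
# with plain version pins (e.g. numpy==1.21.2) before reusing the file.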

numpy.core._exceptions.MemoryError: Unable to allocate 2.95 GiB for an array with shape (252, 1024, 1024, 3) and data type float32

D:\ProgramData\Anaconda3\python.exe "E:/work/StyleHEAT-main (2)/StyleHEAT-main/inference - test.py"
Same-id testing
Load pre-trained e4e Encoder from checkpoints/Encoder_e4e.pth done.
Load pre-trained hfgi encoder from checkpoints/hfgi.pth done.
Load pre-trained StyleGAN2 from checkpoints/StyleGAN_e4e.pth done.
Stage: inference
Load pre-trained StyleHEAT [net_G_ema] from checkpoints/StyleHEAT_visual.pt done
0%| | 0/1 [00:00<?, ?it/s]D:\ProgramData\Anaconda3\lib\site-packages\torch\nn\functional.py:718: UserWarning: Named tensors and all their associated APIs are an experimental feature and subject to change. Please do not use them for anything important until they are released as stable. (Triggered internally at ..\c10/core/TensorImpl.h:1156.)
return torch.max_pool2d(input, kernel_size, stride, padding, dilation, ceil_mode)
D:\ProgramData\Anaconda3\lib\site-packages\torch\nn\functional.py:3613: UserWarning: Default upsampling behavior when mode=bilinear is changed to align_corners=False since 0.4.0. Please specify align_corners=True if the old behavior is desired. See the documentation of nn.Upsample for details.
"See the documentation of nn.Upsample for details.".format(mode)
D:\ProgramData\Anaconda3\lib\site-packages\torch\nn\functional.py:3982: UserWarning: Default grid_sample and affine_grid behavior has changed to align_corners=False since 1.3.0. Please specify align_corners=True if the old behavior is desired. See the documentation of grid_sample for details.
"Default grid_sample and affine_grid behavior has changed "
Traceback (most recent call last):
File "E:/work/StyleHEAT-main (2)/StyleHEAT-main/inference - test.py", line 224, in
main()
File "E:/work/StyleHEAT-main (2)/StyleHEAT-main/inference - test.py", line 212, in main
reenactment(generator, data)
File "E:/work/StyleHEAT-main (2)/StyleHEAT-main/inference - test.py", line 69, in reenactment
video_util.write2video("{}/{}".format(args.output_dir, data['video_name']), fake_images)
File "E:\work\StyleHEAT-main (2)\StyleHEAT-main\utils\video_util.py", line 21, in write2video
video_numpy = (np.transpose(video_numpy, (0, 2, 3, 1)) + 1) / 2.0 * 255.0
numpy.core._exceptions.MemoryError: Unable to allocate 2.95 GiB for an array with shape (252, 1024, 1024, 3) and data type float32
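
This failure is in post-processing, not the network: write2video first materializes every frame in one float32 numpy array, and 252 × 1024 × 1024 × 3 × 4 bytes is exactly the 2.95 GiB being requested. A workaround sketch, not the repo's code, that streams frames to disk one at a time instead:

import imageio
import numpy as np

def write2video_streaming(path, frames, fps=25):
    # frames: iterable of torch tensors shaped (3, H, W) with values in
    # [-1, 1], matching the (x + 1) / 2 * 255 normalization in video_util.py.
    with imageio.get_writer(path + '.mp4', fps=fps) as writer:
        for frame in frames:
            img = frame.permute(1, 2, 0).cpu().numpy()  # (H, W, 3)
            img = ((img + 1) / 2.0 * 255.0).clip(0, 255).astype(np.uint8)
            writer.append_data(img)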

can't run demo

The error log looks like this when running:
python inference.py --config configs/inference.yaml --video_source=./docs/demo/videos/RD_Radio34_003_512.mp4 --image_source=./docs/demo/images/100.jpg --cross_id --output_dir=./docs/demo/output

Load pre-trained e4e Encoder from checkpoints/Encoder_e4e.pth done.
Load pre-trained hfgi encoder from checkpoints/hfgi.pth done.
Load pre-trained StyleGAN2 from checkpoints/StyleGAN_e4e.pth done.
Stage: inference
Load pre-trained StyleHEAT [net_G_ema] from checkpoints/StyleHEAT_visual.pt done
  0%|                                                                                             | 0/1 [00:07<?, ?it/s]
Traceback (most recent call last):
  File "inference.py", line 219, in <module>
    main()
  File "inference.py", line 202, in main
    data = dataset.load_next_video()
  File "/home/usr1/project/StyleHEAT/data/inference_dataset.py", line 174, in load_next_video
    video_data = self.data_preprocess(video_path, image_path)
  File "/home/usr1/project/StyleHEAT/data/inference_dataset.py", line 118, in data_preprocess
    source_3dmm = self.model_3dmm.get_3dmm([src_image_pil_256], lm_np)
AttributeError: 'NoneType' object has no attribute 'get_3dmm'

Error: ninja: build stopped: subcommand failed.

Hi, I'm getting an error when I run inference.py. It occurs when the following lines inside fused_act.py run:

fused = load(
    'fused',
    sources=[
        os.path.join(module_path, 'fused_bias_act.cpp'),
        os.path.join(module_path, 'fused_bias_act_kernel.cu'),
    ],
)

I suspect this is an issue with gcc. Could you please share the build/toolchain you used when running this? (Some generic diagnostics follow below.)
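
Some generic first steps for ninja failures when PyTorch JIT-compiles CUDA extensions, offered as common diagnostics rather than a confirmed fix for this repo: check that the gcc and nvcc versions are ones your torch build supports, and clear the cached half-built extension before retrying:

gcc --version
nvcc --version
python -c "import torch; print(torch.__version__, torch.version.cuda)"
# Remove stale JIT build artifacts (torch caches them here by default), then rerun:
rm -rf ~/.cache/torch_extensions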

3DMM features

Hi. You mentioned that Deep3DFaceRecon_pytorch is used for 3DMM feature extraction.
After using that repo, the returned results are 0000.mat and 0000.obj.
How can we extract the 3DMM coefficients from them?
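
A starting point, with the caveat that the key names inside the .mat vary by version and are not confirmed here: the regressed 3DMM coefficients are saved in the .mat file, so inspect it with scipy and look for the coefficient arrays:

from scipy.io import loadmat

# List every array stored in the Deep3DFaceRecon output; print the keys
# before relying on any particular name.
mat = loadmat('0000.mat')
for key, value in mat.items():
    if not key.startswith('__'):
        print(key, getattr(value, 'shape', type(value)))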
