elliottwu / unsup3d
(CVPR'20 Oral) Unsupervised Learning of Probably Symmetric Deformable 3D Objects from Images in the Wild
License: MIT License
Thanks for your great work. How much GPU memory is needed during the training and testing phases, respectively? And how long does it take to train the network?
Hi,
Thanks for the paper. I appreciate the awesome work. We are students trying to re-train the Unsup3D model on front-view car images. However, the training results were not good. We trained for 100 epochs and the validation loss in the final epoch was 0.70303. We are sure we are missing something.
Could you please advise on tweaking any parameters to reconstruct a good 3D model for front-view car images?
Thanks
Hi, wu!
I'm sorry to disturb you again. Inspired by your repo, I replaced the neural renderer with PyTorch3D. Specifically, I chose MeshRenderer in pytorch3d.
I first eliminated the inconsistency between the NMR and PyTorch3D coordinate systems by overriding the PerspectiveCameras class in pytorch3d. However, I found that the noise in the shape is so large that the network cannot learn a reasonable face shape. Note that I trained the model for 100 epochs. The figure below compares NMR and PyTorch3D.
The RasterizationSettings are shown below:
# hard rasterization
raster_settings = RasterizationSettings(
    image_size=self.image_size,
    blur_radius=0.0,
    faces_per_pixel=1,
    bin_size=0,
)
I noticed that in #9 you mentioned one can add a smoothing loss on the depth map to alleviate the noisy-depth problem. Could you provide the specific form of such a smoothing loss? I really don't have a clue about this.
Thank you very much!
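For context, one common form of such a smoothing loss (my own guess, not necessarily the one the authors have in mind) is a total-variation penalty on neighboring depth differences:

```python
import numpy as np

def depth_smoothness_loss(depth):
    # Total variation: mean absolute difference between horizontally
    # and vertically neighboring pixels of the depth map.
    dx = np.abs(depth[:, 1:] - depth[:, :-1])
    dy = np.abs(depth[1:, :] - depth[:-1, :])
    return dx.mean() + dy.mean()

# A perfectly flat depth map incurs zero penalty.
print(depth_smoothness_loss(np.ones((8, 8))))  # 0.0
```

Written with PyTorch tensor ops instead, the same expression is differentiable and could simply be added to the total loss with a small weight.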
Hello, I am trying to predict the depth directly from a face image; however, the final depth is always predicted under a different camera pose. How can I retain the camera pose and face angle that I have in my initial input image? Thank you.
Thank you very much for your contribution. What face detection method did you use to produce celeba_crop_bbox.txt?
Hi authors,
Thanks for your great work.
What if I want a more complex lighting model? For example, currently there is only one global light direction; what if I want three lights?
If we model three lights, will the output be better? And which part of the code should I update?
Thanks in advance.
Ruixin
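For reference, the shading here is a simple Lambertian model with an ambient and a diffuse term; one plausible extension to several directional lights (a sketch under my own assumptions, not the repo's code) sums the diffuse contribution of each light:

```python
import numpy as np

def shade_multi_light(normals, light_dirs, k_ambient, k_diffuse):
    # normals: (H, W, 3) unit surface normals.
    # light_dirs: (N, 3) unit light directions; the Lambertian diffuse
    # term max(0, n . l) is summed over the N lights.
    diffuse = np.zeros(normals.shape[:2])
    for l in light_dirs:
        diffuse += np.clip(normals @ l, 0.0, None)
    return k_ambient + k_diffuse * diffuse

# Normals facing +z lit by a single overhead light: shading = ka + kd.
n = np.zeros((2, 2, 3))
n[..., 2] = 1.0
out = shade_multi_light(n, np.array([[0.0, 0.0, 1.0]]), 0.2, 0.8)
```

In the model this would presumably mean predicting N light directions (and perhaps per-light intensities) from the lighting network instead of one, and replacing the single diffuse term in the shading step with such a sum.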
Hi! Thank you very much for your excellent work!
I used the provided script python run.py --config experiments/train_celeba.yml --gpu 0 --num_workers 4
to train a model for the synface dataset.
Then I used python run.py --config experiments/test_celeba.yml --gpu 0 --num_workers 4
to test the model.
Finally, I got 0.0092±0.002 SIDE and 17.77±1.92 MAD, which is not as good as in the paper (0.793±0.140 SIDE and 16.51±1.56 MAD in Table 2).
Could there be a problem with my procedure?
Thank you!
I just can't decompress the dataset "img_celeba.7z" on Ubuntu 18.04. Can anyone help?
Hi, I ran the demo and tested it with my own data, but it does not perform very well, so I am wondering whether there are requirements on the input picture.
Hi, when I tried to test unsup3d on Google Colab, I had some problems, such as:
/usr/local/bin/python: Error while finding module specification for 'demo.demo' (AttributeError: module 'demo' has no attribute 'path')
So, how should I write a Colab notebook that can successfully test unsup3d?
Something like: https://blog.csdn.net/yrwang_xd/article/details/103150691
Thanks!
Hello, thank you very much for being open source.
I have a question: I only see face and cat experiments in your code, but your paper also covers cars. Could you tell me how to train on cars?
Thank you very much!
@elliottwu
Hi! Thank you very much for such a great work!
I'm trying to train a network at a resolution of 512×512. However, the time for a single forward pass (batch size = 8) is too long (~60.12 s), and I found that most of it is spent rendering the reconstructed depth map (~59.98 s). What causes the rendering to be so slow?
Again thank you for your work and code, looking forward to your reply :>.
Hi Wu,
Thanks for releasing your impressive work. Recently I have been trying to reproduce the results of Table 5 and have obtained the 3DFAW dataset. I would like to know how you performed the data processing and evaluation. I guess you refer to the DepthNets repo https://github.com/joelmoniz/DepthNets/tree/master/depthnet-pytorch, but it is still unclear how to crop the images, preserve the keypoint locations, and compute the metrics. Would it be possible to release this part of the code or share more details?
Hi!
I trained a model using the train_celeba.yml you provided, with no other changes except reducing the batch size from 64 to 32 (because my GPU has only 8 GB of memory).
I now have the trained model after 30 epochs. I ran demo.py and got results with my trained model and with your model respectively.
But the two results are not exactly the same (e.g. the eyes). I don't know why and hope to get some advice from you.
(The right one is from my trained model.)
I used the CelebA and WebFace datasets, then trained with your code and settings. After the first epoch, the results were as follows:
Rendered image:
Source image:
But at the second epoch, the rendered image became extremely poor.
It seems that a large part of the face has been filtered out.
I set up the environment needed for training according to the README, but encountered a core dump during rendering. Has anyone run into this problem?
Thank you!
class Metrics():
    def __init__(self):
        self.iteration_time = MovingAverage(inertia=0.9)
        self.now = time.time()

    def update(self, prediction=None, ground_truth=None):
        self.iteration_time.update(time.time() - self.now)
        self.now = time.time()

    def get_data_dict(self):
        return {"objective": self.objective.get(),
                "iteration_time": self.iteration_time.get()}
The return statement references self.objective, which is never set in __init__. Is this an error or a programming trick?
Hi Wu!
Your work is really good! But I wonder how I can get a masked-face reconstruction result.
If I train a model on this dataset (https://github.com/cabani/MaskedFace-Net), will it work?
Thanks a lot!
Hello,
Thank you for sharing your amazing work. We are students trying to replicate this on the car dataset. Unfortunately, it seems like there are some permission issues to download the data.
curl -o syncar.zip "https://www.robots.ox.ac.uk/~vgg/research/unsup3d/data/syncar.zip"
does not work because of a 403 error (Permission denied).
Would it be possible to grant access to it, as the other downloads seem to work fine?
As the title says.
Hi, the input face size is 64×64, which is quite small. Have you tried a higher resolution for better image quality?
Hi @elliottwu, sorry to bother you again, but I have two questions about the image size setting:
1. Will increasing the input image_size improve the reconstruction quality? I trained the unsup3d model on another dataset but did not get satisfactory reconstruction results, so I wonder whether a larger image_size would fix the problem.
2. I tried increasing the input image_size by setting image_size in the data loader to 128 (twice the original 64), but I encountered the following error:
RuntimeError: The size of tensor a (128) must match the size of tensor b (4224) at non-singleton dimension 0
After checking, I found the two tensors are canon_normal and canon_light_d.view(-1,1,1,3). In the forward pass an element-wise multiplication is applied to them, but their first dimensions do not match:
torch.Size([128, 128, 128, 3])
torch.Size([4224, 1, 1, 3])
Have you encountered this kind of error, and how did you solve it? Thank you very much, and looking forward to your response.
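For what it's worth, the failure follows from the usual broadcasting rules: an element-wise multiply only broadcasts when each paired dimension matches or is 1, so the leading 4224 vs. 128 clash is fatal. A minimal NumPy reproduction (shapes copied from the error above; the variable names are illustrative):

```python
import numpy as np

normal = np.zeros((128, 128, 128, 3))   # like canon_normal: (B, H, W, 3)
light_ok = np.zeros((128, 1, 1, 3))     # leading dim matches B -> broadcasts
light_bad = np.zeros((4224, 1, 1, 3))   # leading dim 4224 != 128 -> fails

print((normal * light_ok).shape)        # (128, 128, 128, 3)
try:
    normal * light_bad
except ValueError as e:
    print("broadcast error:", e)
```

This suggests that some tensor upstream of the .view(-1,1,1,3) changed size along with image_size, so the flatten no longer yields the batch dimension; I have not traced where in the code that happens.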
Hi! I would love to try this out and see what I can adapt it into. When do you target releasing the code?
Hi, wu!
Congratulations on unsup3d winning the CVPR 2020 Best Paper Award! Inspired by your repo, I replaced the neural renderer with the PyTorch3D point-cloud renderer.
My repo is: https://github.com/tomguluson92/unsup3D_pytorch3d
But I found it is inferior to your repo, even though, as far as I know, the SoftRas-style renderer inside PyTorch3D is a more powerful differentiable renderer. Would you have time to try PyTorch3D and find out what causes the difference?
Thanks a lot!
Thanks for your great work, and congratulations! Since the unsup3d model is trained on RGB input, what if I have an RGBD human face dataset captured by commodity RGBD cameras? How should I add supervision on the depth part to make full use of the depth input?
Actually, I have tried some weakly-supervised methods such as "Accurate 3D Face Reconstruction with Weakly-Supervised Learning: From Single Image to Image Set", which estimates the BFM parameters directly and uses a differentiable renderer (my choice is PyTorch3D) for end-to-end weakly-supervised training. I tried an L1/L2 loss between the rendered z-buffer and the real depth map, but then the depth loss may conflict with the other loss components (RGB loss, landmark loss).
Any suggestions on this?
Thanks
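For reference, one simple starting point (my own sketch, not from either paper) is an L1 depth loss masked to valid sensor pixels, which at least keeps missing depth readings from fighting the other loss terms:

```python
import numpy as np

def masked_depth_l1(pred, gt, valid):
    # L1 depth loss restricted to pixels where the sensor depth is valid;
    # 'valid' is a 0/1 mask of the same shape as the depth maps.
    diff = np.abs(pred - gt) * valid
    return diff.sum() / max(valid.sum(), 1.0)

pred = np.array([[1.0, 2.0], [3.0, 4.0]])
gt = np.array([[1.5, 2.0], [0.0, 4.0]])  # 0.0 marks an invalid reading
valid = (gt > 0).astype(float)
print(masked_depth_l1(pred, gt, valid))  # 0.5 / 3, averaged over 3 valid pixels
```

Aligning pred to gt with a per-image scale and shift before the L1, as monocular depth methods often do, may further reduce the conflict with the photometric loss, since the predicted depth's absolute scale is ambiguous.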
Can this algorithm generate 3D models with different numbers of vertices and faces, and how do I configure that?
As you know, 3D face models are used in different scenarios. Thanks.
Congratulations!
I am very interested in your project. When I ran the .sh file to download the pretrained models, I always hit network problems. It would be very kind of you to send all the pretrained models to my email. Best wishes, and have a nice day! Here is my address: [email protected].
Hi Elliott,
Really nice work!
Do you plan to release the car dataset you use in the paper for further research? Also, how did you choose the light parameters when rendering the images? Thank you very much.
@elliottwu
I'm using a Windows environment, RTX 3080
when installing neural_renderer_pytorch, an error is displayed:
error: command 'C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v9.2\bin\nvcc.exe' failed with exit status 2
----------------------------------------
ERROR: Command errored out with exit status 1: 'D:\Anaconda3\envs\unsup\python.exe' -u -c '... C:\Users\ADMINI~1\AppData\Local\Temp\pip-install-s57_gk9g\neural-renderer-pytorch_baa3a7b911c743b6a25d2f949292e84f\setup.py ...' install --record 'C:\Users\ADMINI~1\AppData\Local\Temp\pip-record-zg4vslmq\install-record.txt' --single-version-externally-managed --compile --install-headers 'D:\Anaconda3\envs\unsup\Include\neural-renderer-pytorch' Check the logs for full command output.
Is the version of the graphics driver too high?
Congrats on your best paper award and thank you for your generous open source.
I have now been through the training process and obtained a model trained for 70 epochs. After running the test code on the test set, I got a series of directories containing the images used for 3D reconstruction. However, I did not find any file reporting the evaluation metrics, such as the scale-invariant depth error (SIDE) or mean angle deviation (MAD) mentioned in the paper.
So I wonder how to output these evaluation metrics, or where I can find them, and whether any ground-truth data is needed for this evaluation.
Looking forward to your reply and help. Many thanks.
How to customize a higher resolution model?
Hi, thanks for your work, very impressive!
I have a query about the output format of the model. To my knowledge, the output (reconstruction) and the canonical view are both 2D images, but with depth values that could be used to reconstruct a 3D volume. Is that right? Or is the canonical view (reconstruction image) already a 3D volume?
Again thank you for your work and code, looking forward to your reply :>.
Congratulations on winning the best paper award!
The 3D reconstruction results are really impressive given that only a single-view image is used!
I'm new to 3D CV tasks. When I run demo.py, I'm confused by "view_after" and "yaw_rotations".
Since yaw_rotations already rotates the view, why do we need view_after here?
Hi authors,
Thanks for your great work.
I did an experiment:
Any suggestions or explanations for the lower loss but wrong output?
Thanks in advance.