face-vid2vid's People

Contributors

mingyuliutw, tcwang0509

face-vid2vid's Issues

Some questions about the paper

Your project is very impressive! I am trying to reimplement your method myself, but a few points confuse me. Could you give me some suggestions?
In short, I am not sure about the output shapes of some modules. Following your description, and given an input tensor of shape 1x3x256x256, I get fs with shape 1x32x16x64x64, and the output of the last UpBlock3D of L is 1x32x512x256x256 (which makes computing Jc,k very expensive).

1. Appearance feature extractor F: this is simple, but I want to confirm that the output (called fs) has shape 1x32x16x64x64 for an input of shape 1x3x256x256. The warped fs is fed into the motion field estimator M, which has 5 DownBlock3D blocks. With D = 16, only 4 downblocks are needed to bring D down to 1, so why are 5 downblocks needed?
2. The occlusion path in the motion field estimator M: why is there a Reshape C137*D16->C2192? The output of the last UpBlock3D has 32 channels, so what is D? 137x16/32 = 68.5, and I think D should be 16, the same as for fs.
3. The mask path in the motion field estimator M: there is a 7x7x7-Conv-21, but k is 20, so why is C 21? Is global pooling needed? Is the mask a 20-d vector that is simply multiplied into every pixel of Wk?
4. I want to confirm how 3D blocks such as UpBlock3D operate: do they double D the same way they double H and W?
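
To make the arithmetic behind questions 1 and 2 concrete, here is a minimal sketch in plain Python. It assumes that every DownBlock3D halves D (floor division, never going below 1), which is exactly the point in doubt; if the blocks only downsample H and W, D simply stays at 16. The helper names `depth_after_downblocks` and `reshape_3d_to_2d_channels` are hypothetical and do not come from the paper or any released code.

```python
def depth_after_downblocks(depth, num_blocks):
    """Return D after each block, ASSUMING every block halves D (min 1).

    This halving rule is an assumption for illustration only.
    """
    sizes = []
    for _ in range(num_blocks):
        depth = max(depth // 2, 1)
        sizes.append(depth)
    return sizes


def reshape_3d_to_2d_channels(channels, depth):
    """Folding the depth axis into the channel axis gives C * D channels."""
    return channels * depth


# Question 1: under the halving assumption, D = 16 reaches 1 after 4 blocks,
# so a 5th block has no depth left to reduce.
print(depth_after_downblocks(16, 5))

# Question 2: the reshape C137*D16 -> C2192 is consistent as a product.
print(reshape_3d_to_2d_channels(137, 16))

# If only the 32 UpBlock3D channels were folded, D would not be an integer.
print(2192 / 32)
```

Under these assumptions the reshape only works out if 137 channels (not 32) enter it with D = 16, so the open question is where the extra channels come from.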

Source Code

Hi,
Nice work, and congratulations on your paper. Do you plan to release the code in the near future?

Open source code

Congratulations on your paper being accepted as a CVPR Oral! Do you have any plans to release the source code soon?

Some technical questions

Hi, congratulations on your paper being accepted as a CVPR 2021 oral. I look forward to the open-source release. It is great work. While reading your paper and browsing your project page (face-vid2vid), I came up with some questions I would like to ask.

(1) In your paper and on your website, the background of the face videos is static and only the face moves. Can the current technique handle scenes where both the face and the background are in motion, possibly large motion?

(2) In addition, for the video reconstruction task, the faces in your demo videos appear at different angles and with relatively dramatic motion, yet your algorithm still performs well and generates the video successfully. That is impressive. How do you handle faces with large motion?

(3) I would also like to ask whether any tricks are applied to address temporal inconsistency in the resulting video. The demo comparisons with motion transfer methods show that your algorithm has better temporal consistency than fs-vid2vid and FOMM. Flickering artifacts are hard to perceive in videos generated by face-vid2vid. In contrast, the videos generated by fs-vid2vid and FOMM commonly show obvious flickering artifacts and poorer face rendering.

I sincerely hope you can answer these questions. Thank you very much.
