Comments (5)
Hi,
there is even a recent arXiv paper that succeeded in reproducing the PVDM values: https://arxiv.org/abs/2402.13729v1
If you have difficulty reproducing the values, you should ask the authors before making the results public.
from latte.
Just a quick check:
- Did you use EMA weights?
- Did you use 2,048 samples for evaluation?
- Did you use 400 steps (DDPM sampler) for generating samples?
With this setup, I have received many emails confirming that the results can be reproduced with the checkpoints.
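For readers unfamiliar with the first item: evaluating with EMA weights means sampling from an exponentially averaged copy of the parameters rather than from the raw training weights. A minimal sketch of the idea in plain Python follows. The function name `ema_update`, the toy two-weight "model", and the decay value are all assumptions for illustration; none of them are taken from the PVDM or Latte code.

```python
def ema_update(ema_params, params, decay=0.999):
    """Return the updated average: decay * ema + (1 - decay) * current."""
    return [decay * e + (1.0 - decay) * p for e, p in zip(ema_params, params)]


# Hypothetical toy "model" with two scalar weights, for illustration only.
weights = [0.0, 0.0]
ema_weights = list(weights)

for step in range(1, 101):
    # Stand-in for an optimizer step: every weight grows by 0.1.
    weights = [w + 0.1 for w in weights]
    # The averaged copy trails behind the raw weights.
    ema_weights = ema_update(ema_weights, weights, decay=0.9)

print("raw:", weights[0], "ema:", ema_weights[0])
```

The two copies differ at every step, so loading the raw weights instead of the averaged ones at evaluation time is one plausible way for scores to drift from reported numbers.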
Hi, I am the first author of PVDM, and I just noticed that the FVD values given for PVDM are much worse than the values I reported in the paper. Could you tell me why such differences exist?
Many people have tried (and succeeded) to reproduce the values, so this seems strange to me.
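As background on why the evaluation setup matters: FVD is a Fréchet distance between Gaussians fitted to features of real and generated videos, so it depends on which samples, and how many, go into the estimate. A minimal sketch under stated assumptions follows. The feature extractor (typically an I3D network) is not shown, the arrays are random placeholders, `FEATURE_DIM` is hypothetical, and the function name is illustrative rather than taken from any evaluation code discussed here.

```python
import numpy as np


def frechet_distance(feats_a, feats_b):
    """Fréchet distance between Gaussians fitted to two (N, D) feature sets."""
    mu_a, mu_b = feats_a.mean(axis=0), feats_b.mean(axis=0)
    cov_a = np.cov(feats_a, rowvar=False)
    cov_b = np.cov(feats_b, rowvar=False)
    # Tr((cov_a cov_b)^(1/2)) is the sum of the square roots of the
    # eigenvalues of cov_a @ cov_b; clip tiny negatives from round-off.
    eigvals = np.linalg.eigvals(cov_a @ cov_b)
    tr_sqrt = np.sqrt(np.clip(eigvals.real, 0.0, None)).sum()
    diff = mu_a - mu_b
    return float(diff @ diff + np.trace(cov_a) + np.trace(cov_b) - 2.0 * tr_sqrt)


NUM_SAMPLES = 2048  # sample count mentioned in this thread
FEATURE_DIM = 16    # hypothetical; real video features are much larger

rng = np.random.default_rng(0)
real = rng.normal(size=(NUM_SAMPLES, FEATURE_DIM))
fake = rng.normal(loc=0.5, size=(NUM_SAMPLES, FEATURE_DIM))

same = frechet_distance(real, real)     # identical sets
shifted = frechet_distance(real, fake)  # mean-shifted placeholder set
print(same, shifted)
```

Because the means and covariances are estimated from a finite sample, two runs are only comparable when they use the same number of samples and the same feature pipeline.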
Hello, thanks for your interest in our work. For the UCF101 and Skytimelapse datasets, we followed the paper and used the provided pre-trained checkpoints for evaluation. However, we were unable to reproduce the reported results. It would be very helpful if you could provide any details, or complete evaluation code, showing how to obtain the results.
Can you please reply to this issue?
Thank you very much for sharing the info and sorry for missing your question. We will thoroughly review the results with the information you've provided. We would greatly appreciate any additional details you could offer regarding the reimplementation of your work. Many thanks again.