
consistency_models's Issues

[Discussion] How to understand Consistency Training (CT) in isolation

Thanks for the brilliant work! I am reading this legendary paper and have a question I would like to discuss here.

The paper starts by introducing a new method for distilling knowledge from a trained score-based model. From Eq. (7) one can see that the function $\boldsymbol f_\theta$ is trained to map two different points to the same output. These two points inherently lie on the same ODE trajectory, as $\hat{\mathbf x}_{t_n}^{\phi}$ is obtained from $\mathbf x_{t_{n+1}}$ by one step of the score-based ODE solver. In this way, the function learns to map all points on an ODE trajectory to the same point, so it can generate data in one direct step from the initial point (Gaussian noise). So far this makes perfect sense to me.
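To check my reading of Eq. (7), here is a rough PyTorch sketch of the pair being matched. The names `denoiser_phi`, `f_theta`, and `f_theta_minus` are my own placeholders for the pretrained score model (written as a denoiser) and the online/EMA consistency networks, and I assume the EDM-style probability-flow ODE $\mathrm{d}\mathbf x/\mathrm{d}t = (\mathbf x - D_\phi(\mathbf x, t))/t$ with a plain squared $\ell_2$ distance standing in for the paper's general metric $d(\cdot,\cdot)$:

```python
import torch

def cd_loss(x, t_n, t_np1, denoiser_phi, f_theta, f_theta_minus):
    """Sketch of the consistency-distillation pair from Eq. (7)."""
    z = torch.randn_like(x)
    x_tnp1 = x + t_np1 * z  # a sample on the forward process at t_{n+1}

    with torch.no_grad():
        # One Euler step of the PF ODE dx/dt = (x - D_phi(x, t)) / t,
        # from t_{n+1} down to t_n: this is \hat{x}_{t_n}^phi in the paper.
        d = (x_tnp1 - denoiser_phi(x_tnp1, t_np1)) / t_np1
        x_tn_hat = x_tnp1 + (t_n - t_np1) * d
        target = f_theta_minus(x_tn_hat, t_n)  # stop-gradient EMA target

    # Both points lie (approximately) on the same ODE trajectory,
    # so f_theta is pushed to give them the same output.
    return torch.mean((f_theta(x_tnp1, t_np1) - target) ** 2)
```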

The paper then introduces "training consistency models in isolation", where the training objective stays almost the same except that the two points become $\mathbf x + t_{n+1}\mathbf z$ and $\mathbf x + t_n\mathbf z$. These two points obviously cannot lie on the same ODE trajectory; otherwise the trajectory would have to be a straight line, since both points use the same $\mathbf z$. Eq. (9) states the relationship between the loss values of Consistency Distillation and Consistency Training, but this comparison holds only in expectation over the data $\mathbf x$. If my understanding is correct, then when we train a generative model with Eq. (10), the function $\boldsymbol f$ is not being taught the same thing as in consistency distillation, i.e. to map all points on an ODE trajectory to its starting point. If so, what is Eq. (10) really doing, and is Fig. 2 still valid in this sense?
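For contrast, my reading of Eq. (10) is roughly the following (same placeholder names as above; note that no score model appears and the same $\mathbf z$ is reused at both noise levels):

```python
import torch

def ct_loss(x, t_n, t_np1, f_theta, f_theta_minus):
    """Sketch of the consistency-training objective from Eq. (10)."""
    z = torch.randn_like(x)  # the SAME z is reused at both noise levels
    with torch.no_grad():
        target = f_theta_minus(x + t_n * z, t_n)  # stop-gradient EMA target
    return torch.mean((f_theta(x + t_np1 * z, t_np1) - target) ** 2)
```

As written, $\mathbf x + t_n\mathbf z$ and $\mathbf x + t_{n+1}\mathbf z$ are just two forward-process samples sharing one noise draw, which is exactly the pairing my question is about.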

I know this is not directly related to this implementation, but I'm looking forward to any hints!

Is this model reversible? (Image to Noise)

Hello! Your work is truly remarkable. I've had the opportunity to experiment with standard DDPM models using DDIM sampling, and despite being relatively slow, they exhibit a fascinating reversibility property: it allows seamless transitions from Gaussian noise to images, as well as the reverse mapping, which recreates the exact same input noise.

I would be grateful if you could provide some insights into whether this reversibility property is preserved in Consistency models. The examples provided do not explicitly demonstrate this aspect, and I am eager to learn more about it. Thank you for your time and expertise.
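For reference, the deterministic sampler I have in mind is essentially Euler integration of the probability-flow ODE, which can be run in either direction. Here `denoiser` is a placeholder for the pretrained denoising network, and I again assume the EDM-style ODE $\mathrm{d}\mathbf x/\mathrm{d}t = (\mathbf x - D(\mathbf x, t))/t$:

```python
def pf_ode_euler(x, ts, denoiser):
    """Euler integration of dx/dt = (x - D(x, t)) / t along the schedule `ts`.

    Running `ts` from a small t_eps up to t_max inverts an image to noise;
    running the reversed schedule maps that noise back, recovering the input
    up to discretization error. This is the reversibility I mean above.
    """
    for t_cur, t_next in zip(ts[:-1], ts[1:]):
        d = (x - denoiser(x, t_cur)) / t_cur  # PF-ODE drift at t_cur
        x = x + (t_next - t_cur) * d          # Euler step to t_next
    return x
```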
