I have read several papers about diffusion models in the context of deep learning, especially this one.
As explained in the paper, by learning the score function ∇_x log p_t(x), the probability flow ODE trajectory can uniquely map any data point from the data distribution (X(0), say an image) to a point in a multivariate Gaussian distribution of the same dimension (the prior, X(T)), as shown in Figure 2 in the paper (attached here as well).
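To pin down what I mean: for a forward SDE of the form $dx = f(x,t)\,dt + g(t)\,dw$, the deterministic trajectory I am referring to is the probability flow ODE, which shares the same marginals $p_t(x)$ as the SDE:

$$ \frac{dx}{dt} = f(x,t) - \frac{1}{2}\, g(t)^2\, \nabla_x \log p_t(x) $$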
I have several questions in that regard:
- We know that we can draw a random sample from the Gaussian distribution and the reverse SDE will generate a sample from the data distribution, in this case a realistic image. Is that also true for the reverse ODE?
- Can we say the ODE trajectory is a bijective mapping from the data distribution to the Gaussian distribution?
- Can we say that in such a Gaussian distribution every element (dimension) is independent of the others? The motivation for this question is that I want to find a mapping, or a new representation of my data, in which every element is independent of the others (some sort of disentangled representation, though interpretability doesn't matter). Could such a mapping replace the ICA algorithm?
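To make the first two questions concrete, here is a toy sketch I put together (the 1-D Gaussian setup and all names are my own illustration, not from the paper): if the data is Gaussian, the marginals p_t of a simple VP-type SDE are Gaussian too, so the exact score is available in closed form and the probability flow ODE can be integrated forward (data → latent) and then backward (latent → data) along the same trajectory.

```python
# Toy sanity check (my own illustration): 1-D VP-type SDE dx = -x dt + sqrt(2) dw
# with Gaussian data x(0) ~ N(M0, S0^2). The marginals p_t stay Gaussian, so the
# exact score is known and the probability flow ODE can be run in both directions.
import math

M0, S0 = 1.5, 0.5  # data distribution: x(0) ~ N(M0, S0^2)

def mean_var(t):
    # closed-form marginal p_t = N(m_t, v_t) for dx = -x dt + sqrt(2) dw
    e2 = math.exp(-2.0 * t)
    return M0 * math.exp(-t), S0 ** 2 * e2 + 1.0 - e2

def pf_drift(x, t):
    # probability flow ODE: dx/dt = f(x,t) - (1/2) g(t)^2 * score,
    # with f = -x, g^2 = 2, and score = -(x - m_t) / v_t for a Gaussian p_t
    m, v = mean_var(t)
    return -x + (x - m) / v

def integrate(x, t0, t1, n=8000):
    # plain Euler steps; passing t1 < t0 runs the same ODE backwards
    dt = (t1 - t0) / n
    t = t0
    for _ in range(n):
        x += dt * pf_drift(x, t)
        t += dt
    return x

x0 = 2.0
xT = integrate(x0, 0.0, 4.0)      # data point -> latent (v_T ~ 1, m_T ~ 0)
x_back = integrate(xT, 4.0, 0.0)  # deterministic reverse along the same ODE
print(f"x(0)={x0}, x(T)={xT:.4f}, recovered x(0)={x_back:.4f}")
```

When I run this, the round trip recovers x(0) up to discretization error, which is the behavior I would expect if the answer to the bijectivity question is yes — but I would like confirmation that this holds in general, not just in this Gaussian toy case.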