Small fixes to the "Diffusion autoencoders" post #186

Merged (3 commits) on Oct 10, 2024
12 changes: 6 additions & 6 deletions collections/_posts/2024-10-07-Diffusion_Autoencoders.md
@@ -130,7 +130,7 @@ $$ \mathbf{x}_{t+1} = \sqrt{\alpha_{t+1}} f_\theta(\mathbf{x}_t, t, z_{\text{sem
Due to the conditioning of the decoder on $$ z_{\text{sem}} $$, diffusion autoencoders no longer function as generative models.
To address this, the authors introduced a mechanism for sampling $$ z_{\text{sem}} \in \mathbb{R}^{d} $$ from the latent distribution.

-They choosed to fit another DDIM (called latent DDIM):
+They chose to fit another DDIM (called latent DDIM):
$$ p_{\omega}(z_{\text{sem}, t-1} | z_{\text{sem}, t}) $$

to the latent distribution of $$ z_{\text{sem}} = \text{Enc}_{\phi}(x_0), \quad x_0 \sim p(x_0) $$
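
To make the two-stage sampling concrete: the latent DDIM first generates a semantic code, and the image DDIM then decodes it conditioned on that code. Below is a minimal NumPy sketch of this mechanism; `latent_eps_omega`, `eps_theta`, the 512-dimensional code size, and the linear noise schedule are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

# Placeholder noise-prediction networks -- stand-ins for the trained
# models, not the authors' code.
def latent_eps_omega(z_t, t):
    """Noise prediction of the latent DDIM p_omega (hypothetical)."""
    return np.zeros_like(z_t)

def eps_theta(x_t, t, z_sem):
    """Noise prediction of the z_sem-conditioned image decoder (hypothetical)."""
    return np.zeros_like(x_t)

def ddim_sample(eps_fn, x_T, alpha_bar, **cond):
    """Deterministic DDIM reverse process (eta = 0): x_T -> x_0."""
    x_t = x_T
    for t in range(len(alpha_bar) - 1, 0, -1):
        a_t, a_prev = alpha_bar[t], alpha_bar[t - 1]
        eps = eps_fn(x_t, t, **cond)
        # Predicted clean sample f(x_t, t), then the deterministic step to t-1.
        x0_pred = (x_t - np.sqrt(1.0 - a_t) * eps) / np.sqrt(a_t)
        x_t = np.sqrt(a_prev) * x0_pred + np.sqrt(1.0 - a_prev) * eps
    return x_t

# Assumed linear beta schedule, accumulated into alpha-bar.
alpha_bar = np.cumprod(1.0 - np.linspace(1e-4, 0.02, 1000))

# Stage 1: sample z_sem from the latent DDIM.
z_sem = ddim_sample(latent_eps_omega, np.random.randn(512), alpha_bar)
# Stage 2: decode with the z_sem-conditioned image DDIM.
x_0 = ddim_sample(eps_theta, np.random.randn(3, 64, 64), alpha_bar, z_sem=z_sem)
```

With eta = 0 both reverse processes are deterministic given their initial noise, matching the DDIM formulation of [1].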
@@ -180,7 +180,7 @@ These directions are found thanks to a linear classifier.
<div style="text-align:center"><img src="/collections/images/DiffusionAutoencoders/Fig5.jpg" width=1500></div>


-## Qualitative results
+## Quantitative results

Evaluation of the reconstruction quality:

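For context on the attribute directions mentioned in the hunk above: one way to obtain such a direction is to fit a linear classifier on encoded $$ z_{\text{sem}} $$ codes and move along its weight vector. A hedged sketch with synthetic stand-in data (the scikit-learn classifier and all names are assumptions, not the paper's exact pipeline):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Synthetic stand-ins: z_sem codes from the encoder plus binary attribute
# labels (e.g. "smiling"); in practice these would come from a labelled set.
z_codes = np.random.randn(1000, 512)
labels = np.random.randint(0, 2, size=1000)

clf = LogisticRegression(max_iter=1000).fit(z_codes, labels)
direction = clf.coef_[0] / np.linalg.norm(clf.coef_[0])  # unit attribute direction

# Shift a code along the direction; decoding the shifted code with the
# conditioned image DDIM (e.g. ddim_sample above) renders the edited image.
z_edited = z_codes[0] + 2.0 * direction
```
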
@@ -192,14 +192,14 @@ They also evaluate the effects of varying the dimension of $$ z_{\text{sem}} $$:
<div style="text-align:center"><img src="/collections/images/DiffusionAutoencoders/Tab2.jpg" width=1500></div>


-# Conlusion
+# Conclusion

-In conclusion, this paper demonstrates the potential of leveraging DPMs for representation learning, aiming to extract meaningful and decodable representations of input images through an autoencoding framework.
+In conclusion, this paper demonstrates the potential of leveraging DPMs for representation learning, aiming to extract meaningful and decodable representations of input images through an autoencoder framework.


# References

-[1] [Song, J., Meng, C., Ermon, S. (2020). Denoising diffusion implicit models. arXiv](https://arxiv.org/pdf/2010.02502)
+[1] [Song, J., Meng, C., Ermon, S. (ICLR 2021). Denoising diffusion implicit models.](https://arxiv.org/pdf/2010.02502)

-[2] [Dhariwal, P., & Nichol, A. (2021). Diffusion models beat gans on image synthesis. Advances in neural information processing systems](https://proceedings.neurips.cc/paper_files/paper/2021/file/49ad23d1ec9fa4bd8d77d02681df5cfa-Paper.pdf)
+[2] [Dhariwal, P., & Nichol, A. (NeurIPS 2021). Diffusion models beat GANs on image synthesis.](https://proceedings.neurips.cc/paper_files/paper/2021/file/49ad23d1ec9fa4bd8d77d02681df5cfa-Paper.pdf)
