Fix small nits in HW2 #18

Status: Open. Wants to merge 4 commits into base: master.
4 changes: 2 additions & 2 deletions homeworks/hw2/hw2.ipynb
@@ -465,15 +465,15 @@
"\n",
"Implement a hierarchical VAE that follows the following structure.\n",
"* $z1$ is a 2x2x12 latent vector where p(z1) is the unit Gaussian.\n",
-    " * Learn the approximate posterior $q_\\theta(z|x) = N(z; \\mu_\\theta(x), \\Sigma_\\theta(x))$, where $\\mu_\\theta(x)$ is the mean vector, and $\\Sigma_\\theta(x)$ is a diagonal covariance matrix. I.e., same as a normal VAE, but use a matrix latent rather than a vector. Each dimension is independent.\n",
+    " * Learn the approximate posterior $q_\\theta(z1|x) = N(z1; \\mu_\\theta(x), \\Sigma_\\theta(x))$, where $\\mu_\\theta(x)$ is the mean vector, and $\\Sigma_\\theta(x)$ is a diagonal covariance matrix. I.e., same as a normal VAE, but use a matrix latent rather than a vector. Each dimension is independent.\n",
"* $z2$ is a 2x2x12 latent vector.\n",
" * $p_\\theta(z2|z1)$ is learned, and implemented as a neural network that parameterizes mean (and log std, optionally).\n",
" * $q_\\theta(z2|z1,x)$ is also learned. Implement this as a Residual Normal [see NVAE] over the prior $p_\\theta(z2|z1)$.\n",
"* The decoder should be a function of $z2$ only.\n",
"\n",
"Some helpful hints:\n",
"* Two KL losses should be calculated. The first should match $q_\\theta(z|x)$ to the unit Gaussian. The second should match $q_\\theta(z2|z1,x)$ and $p_\\theta(z2|z1)$, and be taken with respect to $q$.\n",
-    "* When calculating the second KL term, utilize the analytic form for the residual normal. When $q_\\theta(z2|z1,x) = N(z2; \\mu_\\theta(z1) + \\Delta \\mu_\\theta(z1,x), \\Sigma_\\theta(z1)) * \\Delta \\Sigma_\\theta(z1,x))$, use the following form: `kl_z2 = -z2_residual_logstd - 0.5 + (torch.exp(2 * z2_residual_logstd) + z2_residual_mu ** 2) * 0.5`\n",
+    "* When calculating the second KL term, utilize the analytic form for the residual normal. When $q_\\theta(z2|z1,x) = N(z2; \\mu_\\theta(z1) + \\Delta \\mu_\\theta(z1,x), \\Sigma_\\theta(z1) * \\Delta \\Sigma_\\theta(z1,x))$, use the following form: `kl_z2 = -z2_residual_logstd - 0.5 + (torch.exp(2 * z2_residual_logstd) + z2_residual_mu ** 2) * 0.5`\n",
"* When calculating KL, remember to sum over the dimensions of the latent variable before taking the mean over batch.\n",
"* For the prior $p_\\theta(z2|z1)$, fix standard deviation to be 1. Learn only the mean. This will help with stability in training.\n",
"\n",
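As a sanity check on the analytic form quoted in the hint above (not part of the diff itself), the residual-normal KL can be sketched in NumPy. The helper name `residual_kl` is hypothetical; the homework uses torch, but the arithmetic is identical:

```python
import numpy as np

def residual_kl(z2_residual_mu, z2_residual_logstd):
    """Per-dimension analytic KL of the NVAE-style residual Normal
    q = N(mu_p + d_mu, sigma_p * d_sigma) against the prior p,
    summed over latent dims and averaged over the batch."""
    kl = (-z2_residual_logstd - 0.5
          + (np.exp(2.0 * z2_residual_logstd) + z2_residual_mu ** 2) * 0.5)
    # sum over the 2x2x12 latent dimensions, then mean over the batch
    return kl.reshape(kl.shape[0], -1).sum(axis=1).mean()

# When the residual mean and log-std are zero, q equals the prior,
# so the KL term vanishes.
batch = np.zeros((4, 2, 2, 12))
print(residual_kl(batch, batch))  # -> 0.0
```

Note that the formula is zero exactly when the residual parameters are zero, which is what makes this parameterization stable: the posterior starts out matching the learned prior.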
2 changes: 1 addition & 1 deletion homeworks/hw2/hw2_latex/main.tex
@@ -318,7 +318,7 @@

\newpage

- Final VQ-VAE Test Loss: \textcolor{red}{FILL}, PixelCNN Prior Test Los: \textcolor{red}{FILL} (Dataset 2)
+ Final VQ-VAE Test Loss: \textcolor{red}{FILL}, Transformer Prior Test Los: \textcolor{red}{FILL} (Dataset 2)
\begin{figure}[H]
\centering
\begin{subfigure}[b]{0.475\textwidth}