Lecture 20, March 26th, 2015: The Variational Autoencoder

In this lecture we will discuss Variational Autoencoders.

Please study the following material in preparation for class:

Other relevant material:

7 Replies to “Lecture 20, March 26th, 2015: The Variational Autoencoder”

  1. Can we go over the semi-supervised VAE M2 model in some detail? It seems you pay a penalty (section 3.3) that grows with the number of classes. Could you get around this in practice if you wanted samples from every class?

    It would also be good to talk about how sigma is parameterized for numerical stability: exp(log_sigma) vs. softplus vs. others. It is hard to see what the practical difference is between applying a softplus and parameterizing log_sigma directly (see the sketch below).

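    For concreteness, here is a minimal NumPy sketch (not from either paper; the variable names are illustrative) of the two ways of producing a positive sigma from an unconstrained encoder output:

        import numpy as np

        def softplus(a):
            # numerically stable log(1 + exp(a))
            return np.logaddexp(0.0, a)

        # unconstrained "scale" output of the encoder for q(z|x);
        # in a real model this would come from a neural network
        raw = np.array([-5.0, -1.0, 0.0, 1.0, 5.0])

        # Option 1: treat the output as log_sigma and exponentiate.
        # Always positive, and the KL term can use log_sigma directly,
        # but large positive outputs blow up quickly (exp(5) ~ 148).
        sigma_exp = np.exp(raw)

        # Option 2: pass the output through a softplus.
        # Also positive, but it grows only linearly for large inputs,
        # which is sometimes preferred for stability.
        sigma_softplus = softplus(raw)

        print(sigma_exp)
        print(sigma_softplus)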

  2. In the VAE paper, they compared the VAE to Monte Carlo EM (a variant from 1987!) in figure 3. There, Monte Carlo EM clearly seems to perform better than the VAE for a small number of training examples, and (judging from the rising blue lines in the right plot) perhaps also in the case of many training examples. However, it is well known that online EM usually works even better than batch EM on large datasets… So, has the VAE ever been compared to other online variational EM algorithms?

    Aaron, I guess you plan to do this anyway, but it would be nice to discuss the shortcomings of VAEs as well. The VAE paper doesn’t really touch on that.


  3. The DeepMind paper (Rezende et al.) seems to obtain worse samples on MNIST than Kingma et al. in “Auto-Encoding Variational Bayes”. I can’t quite tell whether that is because they use the binarized version of MNIST, or because each layer of the deep latent Gaussian model includes Gaussian noise. Could it be the latter, given that their NORB samples are also quite blurry?


  4. Section 2.1 of Kingma & Welling’s paper mentions that they “do not make the common simplifying assumptions about the marginal or posterior”. Which assumptions do they mean?


  5. Could you give us some insight into how to choose g(.) when q(z|x) doesn’t fit any of the approaches listed in section 2.4? Also, in practice, how is q(z|x) chosen for a case less obvious than a Gaussian? (See the sketch below.)

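    For the Gaussian case the comment mentions, section 2.4’s recipe is the usual reparameterization trick; here is a minimal NumPy sketch (illustrative names, not code from the paper):

        import numpy as np

        rng = np.random.default_rng(0)

        # Illustrative encoder outputs for one data point x;
        # in a real model mu(x) and log_sigma(x) come from a network.
        mu = np.array([0.5, -1.0])
        log_sigma = np.array([-0.2, 0.3])

        # Diagonal Gaussian q(z|x) = N(mu, diag(sigma^2)):
        # sample parameter-free noise and push it through the
        # deterministic, differentiable map g(eps, x) = mu + sigma * eps,
        # so that gradients can flow back into mu and log_sigma.
        eps = rng.standard_normal(mu.shape)
        z = mu + np.exp(log_sigma) * eps

        # For a non-Gaussian q(z|x), section 2.4 of the paper suggests
        # e.g. an inverse-CDF transform of uniform noise, other
        # location-scale families, or compositions of such maps;
        # the pattern is the same: z = g(eps, x) with a fixed p(eps).
        print(z)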
