In this lecture we will discuss Deep Boltzmann Machines.
Please study the following material in preparation for the class:
- Deep Boltzmann Machines, by Ruslan Salakhutdinov and Geoffrey Hinton
- Lecture slides (now linked)
IFT6266 – H2015 Representation Learning
A mostly deep learning course
I don’t quite understand the training procedure presented in the paper. When pre-training each layer of the DBM, twice as many parameters are computed as appear in the final model. How are those duplicated parameters recombined to produce a single copy of each layer? Does the Gibbs sampler just discard the modified layer and keep the other one as a sample?
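As I read the pre-training section, the doubled connections are tied copies of the same weight matrix, so there aren’t actually twice as many free parameters: the doubling only compensates for the missing top-down (or bottom-up) input while each RBM is trained in isolation. When the RBMs are composed into the DBM, each layer keeps its single matrix and the doubling simply disappears, so nothing has to be discarded by the Gibbs sampler. A minimal NumPy sketch of my reading (sizes and random weights are illustrative stand-ins, not the paper’s code):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(0)
n_vis, n_h1, n_h2 = 784, 500, 1000  # illustrative sizes only

# Stand-ins for greedily pretrained weights (random here; tied during training).
W1 = rng.normal(0.0, 0.01, (n_vis, n_h1))  # bottom RBM, trained with doubled bottom-up input
W2 = rng.normal(0.0, 0.01, (n_h1, n_h2))   # top RBM, trained with doubled top-down input

v  = rng.integers(0, 2, n_vis).astype(float)
h2 = rng.integers(0, 2, n_h2).astype(float)

# During pretraining, the bottom RBM doubles its bottom-up input to
# make up for the top-down signal it does not yet receive:
p_h1_pretrain = sigmoid(2.0 * (v @ W1))

# In the composed DBM the doubling disappears: h1 gets a single
# bottom-up term plus a genuine top-down term, using the same matrices.
p_h1_dbm = sigmoid(v @ W1 + W2 @ h2)
```

So the “extra” copy of each matrix never existed as a separate parameter; only the tied duplicates are dropped when the modules are stitched together.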
I am also confused about this point. Maybe you can go over it in detail, Aaron?
People who talk about this paper often mention that the actual implementation uses an array of hacks/tricks that cannot be inferred from the paper itself (Ian Goodfellow mentioned this in his PhD defense).
It’d be nice if someone could go over these tricks.
Kelvin
In Figure 4 of the paper they show samples obtained from the 2-hidden-layer and 3-hidden-layer DBMs. Should we see a difference in the quality of the digits? (They look pretty good in both models.)
If so, what is the advantage/interpretation? Do we only need more than 2 layers for more complex representations, while on MNIST the 2-layer model is already a good estimator of the distribution?
The energy function of the Deep Boltzmann Machine is a challenging part for me. How did people come up with that energy function?
My intuition is that it follows the Boltzmann Machine’s energy function (Eq. 1 in the paper), but with the weights corresponding to unconnected nodes set to zero.
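To make the zero-filling concrete, here is a tiny NumPy check (my own sketch, biases omitted) that the two-hidden-layer DBM energy equals a general Boltzmann machine energy over the full state when all within-layer and skip-layer weights are set to zero:

```python
import numpy as np

rng = np.random.default_rng(1)
n_v, n_h1, n_h2 = 4, 3, 2  # tiny sizes for illustration

W1 = rng.normal(size=(n_v, n_h1))   # visible <-> first hidden layer
W2 = rng.normal(size=(n_h1, n_h2))  # first <-> second hidden layer

v  = rng.integers(0, 2, n_v).astype(float)
h1 = rng.integers(0, 2, n_h1).astype(float)
h2 = rng.integers(0, 2, n_h2).astype(float)

# DBM energy (biases omitted): E(v, h1, h2) = -v^T W1 h1 - h1^T W2 h2
E_dbm = -(v @ W1 @ h1) - (h1 @ W2 @ h2)

# General Boltzmann machine energy -0.5 * x^T L x over the full state
# x = [v, h1, h2], with L zero everywhere except the DBM's connections.
x = np.concatenate([v, h1, h2])
n = len(x)
L = np.zeros((n, n))
L[:n_v, n_v:n_v + n_h1] = W1         # v -- h1 block
L[n_v:n_v + n_h1, n_v + n_h1:] = W2  # h1 -- h2 block
L = L + L.T                          # connections are symmetric

E_bm = -0.5 * (x @ L @ x)
assert np.isclose(E_dbm, E_bm)  # the two energies agree
```

The factor of 1/2 in the general form cancels because each connection appears twice in the symmetrized matrix, which is why the layered energy lists each weight term only once.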