6 Replies to “Lecture 19, March 23rd, 2015: Deep Boltzmann Machines”

  1. I don’t quite understand the training procedure presented in the paper. When pre-training each layer of the DBM, twice as many parameters are computed as in the final model. How are those twice as many parameters re-combined to produce only one copy of the layer? Does the Gibbs sampler just discard the modified layer and preserve the other one as a sample? (One reading of this step is sketched below.)

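    For reference, a minimal NumPy sketch of one reading of that pre-training step. The layer sizes are made up, and tying the two copies of the first-layer weights is an assumption on my part, not something quoted from the paper: under that reading the doubled bottom-up input is only used while the bottom RBM is trained in isolation, so nothing has to be merged afterwards; the composed DBM keeps a single copy of W1 and replaces the doubling with genuine top-down input.

    ```python
    import numpy as np

    rng = np.random.default_rng(0)

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    # Hypothetical sizes for a 784-500-1000 DBM on MNIST.
    n_vis, n_h1, n_h2 = 784, 500, 1000
    W1 = 0.01 * rng.standard_normal((n_vis, n_h1))   # visible <-> h1 weights
    W2 = 0.01 * rng.standard_normal((n_h1, n_h2))    # h1 <-> h2 weights
    v = (rng.random(n_vis) < 0.5).astype(float)      # a fake binary image

    # Pre-training of the bottom RBM: the visible layer is effectively
    # duplicated, so h1 receives 2 * W1^T v bottom-up.  If the two copies are
    # tied to the same matrix W1 (assumed here), there is no second matrix to
    # recombine later -- the doubling only stands in for the top-down input
    # that h1 will receive once the full DBM is assembled.
    p_h1_pretrain = sigmoid(2.0 * (W1.T @ v))

    # In the composed DBM, h1 gets a single copy of the bottom-up input plus
    # real top-down input from h2 (one mean-field step shown):
    h2_mf = sigmoid(W2.T @ p_h1_pretrain)        # initial guess for h2
    p_h1_dbm = sigmoid(W1.T @ v + W2 @ h2_mf)    # bottom-up + top-down
    ```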

  2. People who talk about this paper often mention that the actual implementation relies on an array of hacks/tricks that cannot be inferred from the paper alone. (Ian Goodfellow mentioned this in his PhD defense.)

    It’d be nice if someone could go over these tricks.

    Kelvin


  3. In Figure 4 of the paper they show samples obtained from the 2-hidden-layer and the 3-hidden-layer DBMs. Should we see a difference in the quality of the digits? They look pretty good in both models…
    If that is right, what is the advantage/interpretation? Do we only need more than 2 layers for more complex representations, while for MNIST the 2-layer model is already a good estimator of the distribution?


    1. My intuition is that it follows the Boltzmann Machine’s energy function (Eq. 1 in the paper), but with the weights corresponding to nodes that are not connected set to zero (a small numeric check of this follows below).

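      That intuition can be checked numerically. The sketch below (toy sizes, biases dropped for brevity, all names hypothetical) builds the full symmetric Boltzmann machine weight matrix with zeros in every block that has no connection (within-layer and visible–h2), and verifies that -1/2 x' J x reproduces the two-hidden-layer DBM energy -v' W1 h1 - h1' W2 h2.

      ```python
      import numpy as np

      rng = np.random.default_rng(0)

      # Toy layer sizes, just for the check.
      n_v, n_h1, n_h2 = 6, 4, 3
      W1 = rng.standard_normal((n_v, n_h1))    # visible <-> h1
      W2 = rng.standard_normal((n_h1, n_h2))   # h1 <-> h2

      v  = (rng.random(n_v)  < 0.5).astype(float)
      h1 = (rng.random(n_h1) < 0.5).astype(float)
      h2 = (rng.random(n_h2) < 0.5).astype(float)

      # Two-hidden-layer DBM energy (biases omitted): E = -v' W1 h1 - h1' W2 h2
      E_dbm = -v @ W1 @ h1 - h1 @ W2 @ h2

      # General Boltzmann machine energy over the stacked state x = [v, h1, h2]:
      # E = -1/2 x' J x, with J symmetric and every block between unconnected
      # units (v-v, h1-h1, h2-h2, v-h2) left at zero.
      n = n_v + n_h1 + n_h2
      J = np.zeros((n, n))
      J[:n_v, n_v:n_v + n_h1] = W1                 # v  - h1 block
      J[n_v:n_v + n_h1, n_v + n_h1:] = W2          # h1 - h2 block
      J = J + J.T                                  # symmetrize
      x = np.concatenate([v, h1, h2])
      E_bm = -0.5 * x @ J @ x

      print(np.isclose(E_dbm, E_bm))  # True: the DBM is a BM with zeroed blocks
      ```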
