In this lecture we will continue our discussion of unsupervised learning methods. We will study **sparse coding** and the **manifold** interpretation of autoencoders.

**Please study the following material in preparation for the class:**

- Lecture 8 (8.1 to 8.9) of Hugo Larochelle’s course on Neural Networks.
- Chapter 13 of the Deep Learning Textbook.

**Other relevant material:**

- “Sparse coding with an overcomplete basis set: A strategy employed by V1?” Bruno A. Olshausen, David J. Field

What is the connection between ZCA and sparse coding?

Can you give an intuition for why the projected gradient descent algorithm works? It seems that even in the convex case the algorithm may push D towards shrinking rather than rotating, and get stuck.
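One way to see the role of the projection step is that it keeps D on the constraint set (unit-norm columns), so the scale degeneracy between the dictionary and the codes is removed: D literally cannot shrink. A minimal numpy sketch, assuming a squared-error reconstruction term with the codes held fixed (all names here — `X`, `H`, `D`, `lr` — are illustrative, not from the lecture):

```python
import numpy as np

# Projected gradient descent on the dictionary D in sparse coding.
# The projection renormalizes each column of D to unit norm, which
# removes the trivial "shrink D, inflate the codes" direction.
rng = np.random.default_rng(0)
n_features, n_atoms, n_samples = 8, 16, 100
X = rng.standard_normal((n_features, n_samples))  # data, one column per example
H = rng.standard_normal((n_atoms, n_samples))     # sparse codes, held fixed here
D = rng.standard_normal((n_features, n_atoms))

lr = 0.01
for _ in range(200):
    # gradient of 0.5 * ||X - D H||_F^2 with respect to D
    grad = -(X - D @ H) @ H.T
    D -= lr * grad
    # projection step: rescale every column back onto the unit sphere
    D /= np.linalg.norm(D, axis=0, keepdims=True)

# after the projection, every column of D has unit norm
print(np.allclose(np.linalg.norm(D, axis=0), 1.0))  # True
```

So the gradient step can still rotate the atoms freely; only the radial (shrinking) component is discarded by the projection.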

I am also interested in this question!

How does one apply a sparse coding model at test time, or when using the code in a classification algorithm? It seems to be a costly operation to infer the code every time, especially if we would like to use random patches of the image as the input to the classifier and cannot precompute sparse codes.
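The cost concern is real: unlike an autoencoder, sparse coding has no feedforward encoder, so each new input requires solving a small optimization problem. A standard solver for that per-input problem is ISTA (iterative shrinkage-thresholding). A hedged sketch, with illustrative names (`D`, `lam`, `n_iter` are not from the lecture notes):

```python
import numpy as np

def ista(D, x, lam=0.1, n_iter=100):
    """Infer a sparse code by minimizing 0.5*||x - D h||^2 + lam*||h||_1."""
    # step size 1/L, where L is the Lipschitz constant of the smooth term
    L = np.linalg.norm(D, ord=2) ** 2
    h = np.zeros(D.shape[1])
    for _ in range(n_iter):
        g = D.T @ (D @ h - x)                                  # gradient step
        h = h - g / L
        h = np.sign(h) * np.maximum(np.abs(h) - lam / L, 0.0)  # soft-threshold
    return h

rng = np.random.default_rng(0)
D = rng.standard_normal((16, 32))
D /= np.linalg.norm(D, axis=0)        # unit-norm atoms
h_true = np.zeros(32)
h_true[[3, 10]] = [1.0, -2.0]         # a genuinely sparse signal
x = D @ h_true
h = ista(D, x, lam=0.01, n_iter=500)
print(float(np.linalg.norm(D @ h - x)))  # residual should be small
```

Each test-time inference is a loop like this per patch, which is why it is expensive compared to an autoencoder's single matrix multiply; one common workaround in the literature is to train a fast feedforward encoder to approximate the inferred codes.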

Using Hugo Larochelle’s notation, I get what h(x(t)) is in sparse coding, but I don’t understand what h(t) represents. Can anyone explain?

It’s probably a notation error …

I think h(x(t)) is the h(t) that minimizes the sparsity equation!
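That reading seems right to me: the two notations name the same object, with h(t) the optimization variable and h(x(t)) its minimizer. As a hedged sketch in Larochelle's symbols (D the dictionary, λ the sparsity weight):

```latex
h(x^{(t)}) \;=\; \operatorname*{arg\,min}_{h^{(t)}}
  \;\tfrac{1}{2}\,\bigl\| x^{(t)} - D\,h^{(t)} \bigr\|_2^2
  \;+\; \lambda\,\bigl\| h^{(t)} \bigr\|_1
```

So h(x(t)) emphasizes that the code is a (implicitly defined) function of the input, while h(t) is the free variable being optimized for example t.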

I have some difficulty understanding the actual benefit of using sparse coding compared to a regularized autoencoder for images; all these techniques seem to identify some kind of edge detector. Could you comment on this?

In section 13.1 of the book at the end of page 269 it is mentioned that

“1. Learning a representation h of training examples x such that x can be approximately recovered from h through a decoder. Note that this needs not be true for any x, only for those that are probable under the data generating distribution. “

What does “probable under the data generating distribution” mean? Is it that a single autoencoder can’t learn the representations of different data coming from submanifolds that are “far apart”? What happens if, for instance, two images are visually alike but come from two very distant manifolds? Is this possible?
