Clarifications for Jan. 22’s lecture

Theano port

As requested by a student today, the solution to last week’s assignment was updated to include the Theano port that was made in class.

Regarding MethodCost

The MethodCost subclass of Cost has a constructor which accepts a string representing the method to call on the model to get a cost expression.

A coupling between the Cost subclass and the Model subclass was mentioned today: the Cost subclass is responsible for knowing which method of the Model subclass to call in order to produce a cost expression from the inputs. This method can either

  1. produce a prediction, in which case the Cost subclass needs to combine this prediction with the targets to form the cost expression (this was the case for the MLP implementation presented today), or
  2. produce the cost expression directly from the inputs and the targets, in which case the Cost subclass only acts as an intermediary.
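The two options above can be sketched in plain Python. This is only an illustration under stated assumptions: `ToyModel`, `SquaredErrorCost`, `DelegatingCost`, `predict` and `cost_from_inputs` are hypothetical names, not Pylearn2 classes or methods, and plain floats stand in for symbolic Theano expressions.

```python
class ToyModel:
    """Hypothetical stand-in for a Model subclass (not Pylearn2's API)."""

    def __init__(self, weight):
        self.weight = weight

    def predict(self, x):
        # Used by option 1: the model only produces a prediction.
        return self.weight * x

    def cost_from_inputs(self, x, y):
        # Used by option 2: the model produces the cost itself.
        return (self.predict(x) - y) ** 2


class SquaredErrorCost:
    """Option 1: the cost combines the model's prediction with the target."""

    def expr(self, model, x, y):
        return (model.predict(x) - y) ** 2


class DelegatingCost:
    """Option 2: the cost is only an intermediary and defers to the model."""

    def expr(self, model, x, y):
        return model.cost_from_inputs(x, y)


model = ToyModel(weight=2.0)
cost_a = SquaredErrorCost().expr(model, 3.0, 5.0)
cost_b = DelegatingCost().expr(model, 3.0, 5.0)
```

In this toy setup both costs evaluate to the same number; what differs is where the knowledge of how to turn a prediction into a cost lives.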

This second option is how MethodCost is implemented: when its expr method is called, it looks up on the Model subclass the method whose name was passed to its constructor, calls that method with the inputs and the targets, and returns the result as the cost expression.
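The name-based dispatch can be sketched with Python's built-in `getattr`. This is a simplified sketch, not Pylearn2's actual source: `SketchMethodCost` and `TinyModel` are hypothetical, the real `expr` signature may differ, and lists of floats replace symbolic variables.

```python
class SketchMethodCost:
    """Simplified, hypothetical imitation of the idea behind MethodCost."""

    def __init__(self, method):
        # Name of the model method to call, stored as a string.
        self.method = method

    def expr(self, model, inputs, targets):
        # Look the method up by name on the model, then delegate to it.
        fn = getattr(model, self.method)
        return fn(inputs, targets)


class TinyModel:
    """Hypothetical model exposing a cost_from_X-style method."""

    def cost_from_X(self, inputs, targets):
        # Mean squared error over plain Python lists.
        squared = [(x - t) ** 2 for x, t in zip(inputs, targets)]
        return sum(squared) / len(squared)


cost = SketchMethodCost(method="cost_from_X")
value = cost.expr(TinyModel(), [1.0, 2.0], [1.0, 4.0])
```

Because the method is resolved only when `expr` is called, a misspelled method name in this sketch surfaces as an `AttributeError` at that point rather than at construction time.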

In the case of the MLP class (the one in Pylearn2, not the small implementation presented today), the cost_from_X method takes inputs and targets and produces a cost expression based on its output layer, which is why the cost was specified as !obj:pylearn2.costs.cost.MethodCost {method: 'cost_from_X'} in the example YAML file.


One Reply to “Clarifications for Jan. 22’s lecture”

  1. Hi,

    I’m posting my question under this lecture because we talked this morning about
    Convolutional Neural Nets in Pylearn2. I’ve been working on it and I have some issues with the kernel matrices.

    In Hugo Larochelle’s tutorials, he presents the following example:
    The input is an 83*83 image, and in layer 1 there are 64 feature maps with 75*75 hidden units per feature map.
    Let Kij be the kernel matrix connecting the i th input channel with the j th feature map and let Xi be the matrix of the i th receptive field. We do the convolution between the matrices Kij and Xi.
    My question is:
    The matrix Kij is the same in the j th feature map, so why do we distinguish between the matrices K2j and K3j, for example? Why is there an index i? I am missing something ^^

    Thanks for your help, Pierre.

