January 14, 2015 (updated January 15, 2015) by dumoulinv

Lecture 3, Jan. 15, 2015

What will be covered:

Solution to the numpy + MNIST + MLP assignment
Git primer
Theano primer
Porting numpy + MNIST + MLP to Theano

Slides can be found here. You can bring your laptop to follow along if you’d like to.

If you haven’t already, it would be a very good idea to create a Github account for the lecture.

Update

A student pointed out that the batch version of the outer product used in the solution could be implemented using a dot product:

V_grad = (H[:, :, None] * (Y - T)[:, None, :]).mean(axis=0)

becomes

V_grad = numpy.dot(H.T, Y - T) / H.shape[0]

for instance. Not only is it simpler to write and understand, but it also greatly speeds up computation. The assignment solution has been updated accordingly.
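A quick way to convince yourself that the two expressions for V_grad agree is to compare them on random data. This is only a minimal sketch: the batch size, layer sizes and random inputs below are assumptions chosen for illustration, not values from the assignment.

```python
import numpy

# Illustrative sizes (assumptions, not the assignment's actual dimensions).
rng = numpy.random.RandomState(0)
N, n_hidden, n_classes = 8, 5, 3

H = rng.rand(N, n_hidden)                                  # hidden activations
Y = rng.rand(N, n_classes)                                 # stand-in for network outputs
T = numpy.eye(n_classes)[rng.randint(n_classes, size=N)]   # one-hot targets

# Batch of outer products, averaged over the batch axis.
V_grad_outer = (H[:, :, None] * (Y - T)[:, None, :]).mean(axis=0)

# Same quantity as a single matrix product.
V_grad_dot = numpy.dot(H.T, Y - T) / H.shape[0]

same = numpy.allclose(V_grad_outer, V_grad_dot)
```

The first version builds an intermediate array of shape (N, n_hidden, n_classes) before averaging, while the second never materializes it, which is one plausible reason for the speed difference.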

Thanks for the intervention!

6 Replies to “Lecture 3, Jan. 15, 2015”

kastnerkyle says:

January 15, 2015 at 5:10 pm

Elaborating a little more on my comment from class:

One workflow (of many possible with git) is to treat one GitHub master branch as the “golden copy” of the code. This usually makes collaboration easier, at the expense of having to coordinate changes a little more (which is usually a good thing!). This philosophy is described in the sklearn documentation: http://scikit-learn.org/stable/developers/index.html#contributing

For scikit-learn, this means that https://github.com/scikit-learn/scikit-learn is the golden copy. My own version, at https://github.com/kastnerkyle/scikit-learn , will have my changes that I wish to save, but don’t expect to typically share with other people. The local version on my own machine has scratch and temporary work that I would be OK with completely losing in the case of computer issues.

If you *never* work directly in the master branch of your code (in any repo!), it is always possible to pull new changes from another master into your local master. It is then much easier to move those changes into a feature you are developing. This requires a few things:

All work is put into the main GitHub master branch via pull request! GitHub makes this pretty easy, and there are plenty of examples online.

Work is done in branches by starting from the master branch on your own machine, then doing
‘git checkout -b my_special_feature’

When the work is done, you push this work in your new ‘my_special_feature’ branch to your personal GitHub repo with
‘git push origin my_special_feature’

Now you can go to GitHub and create a pull request (PR) that can be merged into the “golden copy” of the code. If you recently pushed it, there will probably be a big green button to create a pull request – nice and simple.

Someone with the correct permissions will click a button (if there are no issues) that will then merge your changes from ‘my_special_feature’ into the primary repository!

Once your changes are merged, your other collaborators can get your changes by pulling the master branch from the golden repo with something like this, if they are in their own master branch:
‘git pull upstream master’

Other hints:
You pretty much always want ‘git pull --rebase’. The reasons for this are a little complicated, but if you are using the workflow mentioned above it can save a lot of headaches.

Keep your changes as small as possible. It is very difficult to merge 100 commits in 15 different files if there are conflicts, but 1 or 2 commits in 1 file is pretty manageable. Try to make changes in small bites.

You almost never want ‘git push -f’. Using ‘git push -f’ on work that is shared between you and anyone else can create massive headaches for your teammates – not good!

I would be glad to give a demo or help anyone who has trouble – git is a very important tool, and understanding it can be a huge productivity boost as well as a major stress reliever, as Vincent mentioned in class. Using it correctly means *never* worrying about losing code, ever. Working on massive projects becomes much easier, and many companies also use git for their internal version control. git started because the Linux kernel needed it (which is probably the largest collaborative code base ever) and they use git every day to collaborate and make changes for code that is critical to computing worldwide. Kind of awesome 🙂

adbrebs says:

January 18, 2015 at 12:18 am

It is also possible to get a 20% speedup by using temporary variables:
temp1 = (Y - T) / H.shape[0]
V_grad = numpy.dot(H.T, temp1)
d_grad = temp1.sum(axis=0)

temp2 = numpy.dot(temp1, V.T) * H * (1 - H) * (H.shape[0] / X.shape[0])
W_grad = numpy.dot(X.T, temp2)
b_grad = temp2.sum(axis=0)
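The expressions above can be sanity-checked against finite differences. The sketch below assumes a sigmoid hidden layer, a softmax output and a mean cross-entropy loss (which is what the H * (1 - H) and Y - T terms suggest), with X of shape (N, n_in), W of shape (n_in, n_hidden) and V of shape (n_hidden, n_classes). The helper names forward, loss, gradients and numeric_grad, as well as the sizes, are hypothetical. The H.shape[0] / X.shape[0] factor is left out because H and X have the same number of rows here, so it equals 1.

```python
import numpy

def forward(X, W, b, V, d):
    # Assumed model: sigmoid hidden layer followed by a softmax output layer.
    H = 1.0 / (1.0 + numpy.exp(-(numpy.dot(X, W) + b)))
    A = numpy.dot(H, V) + d
    E = numpy.exp(A - A.max(axis=1, keepdims=True))  # shift for numerical stability
    return H, E / E.sum(axis=1, keepdims=True)

def loss(X, T, W, b, V, d):
    # Assumed loss: cross-entropy averaged over the batch.
    Y = forward(X, W, b, V, d)[1]
    return -(T * numpy.log(Y)).sum(axis=1).mean()

def gradients(X, T, W, b, V, d):
    H, Y = forward(X, W, b, V, d)
    temp1 = (Y - T) / H.shape[0]                 # reused by V_grad and d_grad
    V_grad = numpy.dot(H.T, temp1)
    d_grad = temp1.sum(axis=0)
    temp2 = numpy.dot(temp1, V.T) * H * (1 - H)  # reused by W_grad and b_grad
    W_grad = numpy.dot(X.T, temp2)
    b_grad = temp2.sum(axis=0)
    return W_grad, b_grad, V_grad, d_grad

def numeric_grad(param, index, f, eps=1e-6):
    # Central finite difference of f() with respect to param[index].
    old = param[index]
    param[index] = old + eps
    up = f()
    param[index] = old - eps
    down = f()
    param[index] = old  # restore the original value
    return (up - down) / (2 * eps)

rng = numpy.random.RandomState(0)
X = rng.randn(6, 4)                        # illustrative sizes only
T = numpy.eye(3)[rng.randint(3, size=6)]   # one-hot targets
W, b = 0.1 * rng.randn(4, 5), numpy.zeros(5)
V, d = 0.1 * rng.randn(5, 3), numpy.zeros(3)

W_grad, b_grad, V_grad, d_grad = gradients(X, T, W, b, V, d)
f = lambda: loss(X, T, W, b, V, d)
W_err = abs(W_grad[1, 2] - numeric_grad(W, (1, 2), f))
V_err = abs(V_grad[3, 0] - numeric_grad(V, (3, 0), f))
```

With these conventions each gradient has the same shape as the parameter it belongs to, which is a cheap way to catch a missing transpose.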

Small typo in the last slide of the solution pdf. Given the definition of the weight matrices, the gradients with respect to the weights should be transposed.

1. dumoulinv says:
  
  January 19, 2015 at 12:13 am
  
  Are you referring to the last page of the PDF found here: https://raw.githubusercontent.com/vdumoulin/ift6266h15/master/assignments/01/solution.pdf?
  
  You’re right, derivatives with respect to V and W should be transposed. I updated the file. Thanks for noticing!
  
2. dumoulinv says:
  
  January 19, 2015 at 12:27 am
  
  You’re also right concerning the speedup obtainable by using temporary variables.
  
  This is one of the things for which Theano really shines: since it’s working in terms of a computational graph, it’s able to detect that some intermediary results are used more than once and automatically merge the corresponding nodes in the graph.
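To give a feel for what merging nodes means, here is a toy sketch in plain Python. It is not Theano's API or its actual optimizer: the Graph class, the node method and the operation names are all hypothetical. The only point is that two requests for the same operation on the same inputs end up sharing one node.

```python
class Graph:
    """Toy expression graph that reuses identical nodes (hypothetical, not Theano)."""

    def __init__(self):
        self.nodes = {}  # maps (op, input ids...) to a node id

    def node(self, op, *inputs):
        key = (op,) + inputs
        if key not in self.nodes:
            # First time this exact computation is requested: create a node.
            self.nodes[key] = len(self.nodes)
        return self.nodes[key]


g = Graph()
Y, T, H = g.node("input:Y"), g.node("input:T"), g.node("input:H")

# The same subexpression Y - T is requested twice, once per gradient.
diff_for_V = g.node("sub", Y, T)
diff_for_d = g.node("sub", Y, T)

V_grad = g.node("dot_transposed", H, diff_for_V)
d_grad = g.node("sum_axis0", diff_for_d)

shared = diff_for_V == diff_for_d
n_nodes = len(g.nodes)
```

Both gradients end up pointing at a single subtraction node, so the difference would only be computed once, which is the effect the hand-written temporary variables achieve.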
  
jz says:

January 19, 2015 at 7:32 pm

I’m not sure I agree with the suggestion to rebase rather than merge. It might make it look like the history is cleaner to the uninitiated, but if you’re cooperating with someone from the same repo, consider the following scenario:

You push commit A into the repo; they pull A from the repo and start working on it; they commit C; you push B into the repo; they push the C commit into the repo; you attempt to push D into the repo.

If they rebase, the history looks like A – B before the operation and looks like A – C – B’ after. Your push would attempt to apply changes to B on top of B’ rather than making changes to B and merging those changes with those engendered by C. This can be hell to fix, especially in longer cases.

Merging pull requests actually performs a merge rather than a rebase, which is why working in separate repos and making pull requests would work even if you were to rebase while working alone. Still, I would consider using rebase to be a bad habit at best.

Also, I would suggest using `git add -i` to mark changes as it allows the user to review the modifications since the last commit and selectively accept or deny changes they made.
