An important part of the learning experience associated with this course (and 40% of the grade) comes from experimenting with the algorithms presented in class. This page describes what is expected from the students. Feel free to ask questions below.
- Get hands-on experience with some of the algorithms presented in the course.
- Practice writing an experimental journal (e.g., on a blog dedicated to your experiments for this course) describing your ideas, experimental plans, experimental results, and discussions of potential conclusions (i.e., the material that eventually ends up in papers).
- Practice using collaborative tools for writing code, with a repository dedicated to your experimental work (e.g., on GitHub).
- Practice the collaborative competition typically enjoyed by scientists:
  - Each student's work (in the code repository and on the blog) is available to the others to build upon, speeding up the group's overall rate of progress.
  - Each student is encouraged to re-use the ideas, results, tricks, and code of other students but MUST properly cite and acknowledge these inputs (uncredited copying will be severely penalized).
  - Each student competes to obtain good results on common benchmarks, but can take advantage of the others' good ideas, hence the collaborative competition.
- Maintain a blog detailing your ideas, experiments, and progress. Some periods of the semester may be more active than others, but an average of one blog post per week is the minimum required.
- Make your code publicly available through a GitHub repository.
Part of the grade will come from being the first to do something useful and publicizing it on your blog (possibly posting announcements here with links to the blog). The more useful a contribution is to advancing everyone's progress, the more points it will earn. This should provide an incentive to quickly do things that may otherwise look boring but could be useful to others.
This year’s class project will revolve around the Dogs vs. Cats Kaggle challenge, which was held from September 25, 2013 to February 1st, 2014.
The task is straightforward: given an image, determine if it contains a cat or a dog.
There are two stages to the project.
In the first stage, you are asked to build a classifier that achieves at least 80% accuracy on the test set. This constitutes the minimum requirement for the class project.
As a reference, here’s a paper detailing what was considered state-of-the-art before the Kaggle contest was held. At the time, state-of-the-art was slightly above 80%.
Unlike most of the top entries in the contest, you will not be allowed to use external data to train a feature extractor (e.g., training a network on ImageNet and using the trained network to extract features for the Dogs vs. Cats challenge is not allowed).
Two reasons justify this decision:
- Allowing it could put a heavy strain on lab resources, as many people training large models on ImageNet at the same time would likely require lots of GPUs.
- The project aims to familiarize students with deep learning techniques, and using an existing feature extractor such as OverFeat goes against this goal. Many of the insights gained by attempting to train a model like OverFeat are lost if it is used only as a black-box feature extractor.
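To make the first stage concrete, here is a minimal sketch of a train-and-evaluate loop. It uses logistic regression on synthetic stand-in data rather than the actual Kaggle images, so the sizes, labels, and the very simple model below are all illustrative assumptions; a real entry would train a convolutional network on the downloaded Dogs vs. Cats data.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for the Dogs vs. Cats data: 200 "images" flattened
# to 32x32 grayscale vectors. You would load and resize the real Kaggle
# images yourself; nothing here is course-provided.
n, d = 200, 32 * 32
X = rng.normal(size=(n, d))
true_w = rng.normal(size=d)
y = (X @ true_w > 0).astype(float)   # 0 = cat, 1 = dog (arbitrary encoding)

def sigmoid(z):
    # Clip to avoid overflow in exp for large-magnitude logits.
    return 1.0 / (1.0 + np.exp(-np.clip(z, -30.0, 30.0)))

# Logistic regression by batch gradient descent -- a baseline far below
# the 80% target on real images, but it shows the train/evaluate loop.
w = np.zeros(d)
b = 0.0
lr = 0.1
for epoch in range(200):
    p = sigmoid(X @ w + b)          # predicted P(dog)
    grad_w = X.T @ (p - y) / n      # gradient of mean log-loss
    grad_b = np.mean(p - y)
    w -= lr * grad_w
    b -= lr * grad_b

pred = (sigmoid(X @ w + b) > 0.5).astype(float)
train_acc = np.mean(pred == y)
print(f"training accuracy: {train_acc:.2f}")
```

For the real task you would report held-out test accuracy rather than training accuracy, since the 80% requirement is on the test set.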
In the second stage, you are asked to make an improvement over existing models, with respect to either memory footprint or computation time.
You’ll have to work together to establish baselines and try new ideas to improve upon those baselines.
Good results will obviously be rewarded but are not strictly required. Novel ideas, especially if they’re plausibly justified, will also be rewarded if an honest attempt is made at trying them out.
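When comparing models on memory footprint or computation time, a useful first step is simply counting parameters and multiply-accumulates per layer. The sketch below does this for square-kernel convolutional layers; the layer sizes in the comparison are made up for illustration, not a prescribed architecture.

```python
def conv2d_cost(c_in, c_out, k, h_out, w_out):
    """Parameter count and multiply-accumulate (MAC) count of a
    square-kernel conv layer, given its output spatial size."""
    params = c_out * (c_in * k * k + 1)          # weights + biases
    macs = c_out * c_in * k * k * h_out * w_out  # one MAC per weight per output pixel
    return params, macs

# Compare one 5x5 conv against a stack of two 3x3 convs with the same
# channel counts and output size -- a classic compute/memory trade-off.
p5, m5 = conv2d_cost(64, 64, 5, 32, 32)      # 102464 params, 104857600 MACs
p3a, m3a = conv2d_cost(64, 64, 3, 32, 32)
p3b, m3b = conv2d_cost(64, 64, 3, 32, 32)
print("5x5:      ", p5, "params,", m5, "MACs")
print("two 3x3s: ", p3a + p3b, "params,", m3a + m3b, "MACs")
```

Numbers like these make a baseline comparison concrete before any timing experiments: here the stacked 3x3 layers use fewer parameters and fewer MACs than the single 5x5 layer while covering the same receptive field.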
Blogs and repos
Please start by creating your blog and your code repository. A list of pointers to these will be maintained below.
Some examples from previous years
- Laurent Dinh
- Vincent Dumoulin