Introduction to Computer Vision 2018/2019
Mathieu Aubry, Karteek Alahari, Ivan Laptev, and Josef Sivic
Class time: Thursday 9:00 - 12:00
Teaching Assistant: Robin Champenois, bonjour [at] robin-champenois [dot] fr. Robin is your main contact for anything related to the programming assignments and final projects.
There will be four or five programming assignments, representing 60% of the grade. The supporting materials for the programming assignments will be in Python. You have to send your assignments to Robin by the deadline.
The final project will represent 40% of the grade. Each project is based on a paper, and a list of suggested papers is available here. Feel free to ask for papers on a topic that you are interested in, or to propose a paper yourself (in this case, it has to be validated before November 8th).
You are expected to understand and present the paper, but also to offer some added value, such as experiments of your own, new interesting tests with available code, or a comparison with other relevant works. The form of this added value will depend on the paper. You will have to present your project (10 minutes + questions) and hand in a summary (2 pages max) of the essential points, which should be readable (and useful) for the other students in the class.
You can discuss the assignments and final projects with other students in the class. Discussions are encouraged and are an essential component of the academic environment. However, each student has to work out their assignment alone (including any coding, experiments or derivations) and submit their own report. The assignments and final projects will be checked for originality. Any uncredited reuse of material (text, code, results) will be considered plagiarism and will result in zero points for the assignment / final project. If plagiarism is detected, the student will be reported to ENS.
Topics and reading materials
Introduction, overview, image formation, digital photography
Low-level computer vision, image correspondences and grouping
Linear and non-linear image filtering: Fourier and convolution, bilateral filter, non-local means
Resources: relevant book chapters on Fourier and linear image filtering (chap. 2 and 3); detailed presentation of the bilateral filter
Edges (Canny), Segmentation (K-means, GMM, Mean shift), Points (Harris Corners, blob detection)
Instance recognition, Feature detectors and descriptors, SIFT, Visual search
Markov Random Fields: optimization methods (graph-cuts, belief propagation, TRW-S), applications to stereo and segmentation
Assignment 1 Due (Canny edges)
Human color perception, color in computer vision
Human 3D vision/perception
Projective geometry, camera matrix
Refs: Forsyth and Ponce "Geometric camera model" chapter
Szeliski chapter 2 "Image formation" + printed introduction
Camera calibration, stereo vision
Ref: Linear algebra for Vision (from Fei-Fei Li): slides, pdf
Assignment 2 Due (Mean Shift clustering)
Project choice due
Introduction to category-level recognition / Introduction to CNNs
Assignment 3 Due (camera calibration)
Document on Stochastic Gradient Descent (Guillaume Obozinski)
Analyzing CNNs, CNNs for object detection and semantic segmentation
Optical flow: optical flow equation, Lucas-Kanade, Horn and Schunck, SIFT-flow, large displacement and deep optical flow
Using synthetic data/3D shape analysis with CNNs
Low-level video analysis: tracking, human segmentation and pose
Assignment 4 Due (MNIST recognition with NN)
High-level video analysis, action recognition
Overview of other/advanced topics
Intro to Computer Graphics/rendering
Project reports due
Final projects presentation
Final projects presentation
Optional Assignment Due (Optical flow)
D.A. Forsyth and J. Ponce, "Computer Vision: A Modern Approach", Prentice-Hall, 2nd edition, 2011
J. Ponce, M. Hebert, C. Schmid and A. Zisserman "Toward Category-Level Object Recognition", Lecture Notes in Computer Science 4170, Springer-Verlag, 2007
O. Faugeras, Q.T. Luong, and T. Papadopoulo, "Geometry of Multiple Images", MIT Press, 2001.
R. Hartley and A. Zisserman, "Multiple View Geometry in Computer Vision", Cambridge University Press, 2004.
J. Koenderink, "Solid Shape", MIT Press, 1990
R. Szeliski, "Computer Vision: Algorithms and Applications", 2010. Online book.
Good and relevant lectures by other people (many slides are taken from them)
James Hays https://www.cc.gatech.edu/~hays/compvision/
Svetlana Lazebnik http://slazebni.cs.illinois.edu/spring18/
Derek Hoiem https://courses.engr.illinois.edu/cs543/sp2017/