Introduction to Computer Vision 2019/2020

Mathieu Aubry, Karteek Alahari, Ivan Laptev, and Josef Sivic


Course Information

Class time: Tuesdays 9:00 - 12:00

Room: UV

News

Teaching Assistant: Robin Champenois, bonjour [at] robin-champenois [point] fr. Robin is your main contact for anything related to the programming assignments and final projects.

Pages from previous years: 2017-2018, 2018-2019

Evaluation
Assignments

There will be four programming assignments representing 60% of the grade. The supporting materials for the programming assignments will be in Python. Send your assignments to Robin by the deadline.

Final project

The final project will represent 40% of the grade. There can be two types of projects:

Projects are done in groups of 2 or 3. Each group will present its project (about 10 minutes per group member, plus questions) and hand in a summary (2 pages max) of the essential points, written so that it is readable and useful for the other students in the class.

Collaboration policy

You can discuss the assignments and final projects with other students in the class. Discussions are encouraged and are an essential component of the academic environment. However, each student has to work out their assignment alone (including any coding, experiments or derivations) and submit their own report. If you borrowed code from somebody else, or wrote code with somebody else, state it clearly in the report. Any uncredited reuse of material (text, code, results) will be considered plagiarism and will result in zero points for the assignment. Students answer questions in very different ways, so plagiarism is obvious when assignments are corrected. If plagiarism is detected, the student will be reported to ENS.

Course schedule (subject to change):

Lecture

Date

Instructor

Topic and reading materials.

Slides

Syllabus

1

Sept. 24th

MA

Introduction, overview, projections and camera matrix

Refs: Forsyth and Ponce, "Geometric camera model" chapter

Szeliski chapter 2 "Image formation"

Linear algebra for vision (from Fei-Fei Li): slides, pdf

PDF (updated)

PDF

2

Oct. 1st

MA

Digital photography, color, human 3D vision/perception

Edges (Canny)

Linear and non-linear image filtering: Fourier and convolution, bilateral filter

Resources: relevant book chapters on Fourier and linear image filtering (chap. 2 and 3); detailed presentation of the bilateral filter

UPDATED: TP on Canny edges (html, ipynb, lena.jpg, tools.jpg)

WARNING: beware of image visualization in Jupyter notebooks (especially for your edges), which can be misleading. To be sure, save the images.

PDF 

PDF
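Regarding the warning above about notebook visualization: inline display typically rescales an image to fit the figure, so one-pixel-wide edges can look broken or thicker than they are. The sketch below is one minimal way to save an edge map with exactly one file pixel per image pixel, using only the Python standard library. The names `save_pgm` and `load_pgm`, the output file name and the toy edge map are illustrative assumptions and are not part of the TP materials.

```python
import os
import tempfile


def save_pgm(path, image):
    """Write a 2D grid of integers in 0..255 as a binary PGM (P5) file.

    One file pixel is written per image pixel, with no resampling or colormap.
    `image` can be a list of rows or a 2D NumPy uint8 array; a float edge map
    in [0, 1] would have to be scaled to 0..255 first.
    """
    rows = [bytes(int(v) for v in row) for row in image]
    height = len(rows)
    width = len(rows[0]) if rows else 0
    if any(len(row) != width for row in rows):
        raise ValueError("all rows must have the same length")
    with open(path, "wb") as f:
        f.write(f"P5\n{width} {height}\n255\n".encode("ascii"))
        f.write(b"".join(rows))


def load_pgm(path):
    """Read back a file written by save_pgm (not a general PGM parser)."""
    with open(path, "rb") as f:
        magic, size, maxval, pixels = f.read().split(b"\n", 3)
    if magic != b"P5" or maxval != b"255":
        raise ValueError("unsupported PGM file")
    width, height = (int(n) for n in size.split())
    return [list(pixels[r * width:(r + 1) * width]) for r in range(height)]


# Toy binary edge map: a one-pixel-wide diagonal edge (255 = edge, 0 = background).
edges = [[255 if r == c else 0 for c in range(5)] for r in range(5)]

out_path = os.path.join(tempfile.gettempdir(), "edges_check.pgm")
save_pgm(out_path, edges)
restored = load_pgm(out_path)
```

Opening the saved file in an image viewer at 100% zoom shows the actual pixels. In practice, the save function of whichever image library you already use does the same job; the point is only to inspect the file rather than the notebook rendering.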

3

Oct. 8th

MA

Bilateral filter, non-local means

Segmentation (K-means, GMM, Mean shift)

Optical flow: optical flow equation, Lucas-Kanade, Horn and Schunck.

TP2 on mean-shift segmentation (html, ipynb)

PDF 

PDF

4

Oct. 15th

MA

Optical flow: summary and large displacement optical flow.

Projective geometry, Stereo vision, camera calibration.

printed introduction

Assignment 1 Due (Canny edges)

WARNING: beware of image visualization in Jupyter notebooks (especially for your edges), which can be misleading. To be sure, save the images.

PDF 

PDF

5

Oct. 22nd

KA

Markov Random Fields: optimization methods (graph-cuts, belief propagation, TRW-S), applications to stereo and segmentation

PDF

6

Nov. 5th

MA

Camera calibration, multi-view reconstruction.

TP3 on camera calibration (html, ipynb), due November 19th

Assignment 2 Due (Mean Shift clustering)

PDF

PDF

7

Nov. 12th

JS

Classical feature detectors (Harris Corners, blob detection) and descriptors, local features

Instance recognition, Visual search

PDF

8

Nov. 19th

MA

Introduction to category-level recognition / Introduction to CNNs

Project choice finalized

Assignment 3 Due (camera calibration)

PDF

9

Nov. 26th

MA

Training CNNs, analyzing CNNs

TP4 on NN (html, ipynb)

Document on Stochastic Gradient Descent (Guillaume Obozinski)

PDF

10

Dec. 3rd

MA

CNNs for object detection and semantic segmentation, 3D shape analysis with NNs

PDF

11

Dec. 10th

MA

Intro to Computer Graphics, deep 3D shape generation, classical shape analysis

Assignment 4 Due (MNIST recognition with NN)

PDF

12

Dec. 17th

KA

Low-level video analysis: tracking, human segmentation and pose

13

Jan. 7th

IL

High-level video analysis, action recognition

14

Jan. 14th

MA

Project presentations

15

Jan. 21st

MA

Project presentations

Vincent Lepetit: 6D pose estimation and augmented reality

Relevant literature:

[1] D.A. Forsyth and J. Ponce, "Computer Vision: A Modern Approach", Prentice-Hall, 2nd edition, 2011.

[2] J. Ponce, M. Hebert, C. Schmid, and A. Zisserman, "Toward Category-Level Object Recognition", Lecture Notes in Computer Science 4170, Springer-Verlag, 2007.

[3] O. Faugeras, Q.T. Luong, and T. Papadopoulo, "Geometry of Multiple Images", MIT Press, 2001.

[4] R. Hartley and A. Zisserman, "Multiple View Geometry in Computer Vision", Cambridge University Press, 2004.

[5] J. Koenderink, "Solid Shape", MIT Press, 1990.

[6] R. Szeliski, "Computer Vision: Algorithms and Applications", 2010. Online book.

[7] I. Goodfellow, Y. Bengio, and A. Courville, "Deep Learning", MIT Press, 2016. Online book.

Good and relevant lectures by other people (many slides are taken from them)

James Hays https://www.cc.gatech.edu/~hays/compvision/

Svetlana Lazebnik http://slazebni.cs.illinois.edu/spring18/ 

Derek Hoiem https://courses.engr.illinois.edu/cs543/sp2017/