Computer Vision
Graduate course, Institute of Artificial Intelligence Innovation, NYCU, 2024
Course Overview and Objectives:
This course aims to give students a deep understanding of the fundamental concepts and techniques of computer vision, including image formation, feature extraction, 3D reconstruction, image segmentation, object recognition, deep learning, object detection, object tracking, and facial recognition. It also aims to equip students to implement and apply the algorithms, models, and frameworks of computer vision, so that they can address computer vision problems in domains such as autonomous driving, smart homes, and medical image analysis. Through this course, students will come to understand the limitations and challenges of current computer vision applications and explore future development directions. On completing this course, students will have acquired the following abilities:
- Fundamental knowledge and skills in computer vision and image processing.
- Awareness of emerging technologies and trends, and the skills to analyze them.
- Capability to conduct independent research and development, along with teamwork and project management abilities.
- Innovative thinking and problem-solving skills, with the capacity to apply learned knowledge to promote technological innovation and societal progress.
Prerequisites
- Proficiency in Python
  - All class assignments will be in Python. If you have a lot of programming experience in a different language (e.g., C++, MATLAB, or JavaScript), you will probably be fine.
- College Calculus, Linear Algebra
  - You should be comfortable taking derivatives and working with matrix-vector operations and notation.
- Basic Probability and Statistics
  - You should know the basics of probability, Gaussian distributions, mean, standard deviation, etc.
Grading
- Assignments (50%): Programming assignments, homework, etc.
- Final Project (40%): Midterm project proposal, midterm report, final project presentation, and final project report. Each group is limited to a maximum of 3 members. Students are required to select a computer vision-related topic, develop a proposal, and carry out an implementation project. Evaluation criteria include understanding of the problem, innovation, practicality of the solution, and the completeness and performance of the technical implementation. Additional credit will be awarded for incorporating a demo and technical implementation.
- Class Participation (10%): 1 point deducted for each absence.
Office Hours
- Monday 11:00 am-12:00 pm
- Room: Engineering Building-6 (374)
Progress
Week | Date | Progress, Content, Topics | Slides | Homework | Extra Info |
---|---|---|---|---|---|
1 | 2/20 | Introduction | Lec0, Lec1 | ||
2 | 2/27 | Image Formation | Lec2 | HW1: Image Sensing Pipeline | |
3 | 3/5 | Image Features | Lec3 | HW2: Harris Corner Detection | |
4 | 3/12 | Camera Calibration | Lec4 | HW3: Camera Calibration | Group Form Due |
5 | 3/19 | Stereo Vision | Lec5 | HW4: Homography Transformation | |
6 | 3/26 | Image Segmentation | Lec6 | HW5: Otsu Thresholding and Gaussian Adaptive Thresholding | |
7 | 4/2 | Image Classification | Lec7 | HW6: NN/CNN for ImageNet | |
8 | 4/9 | Regularization and Optimization | Lec8 | ||
9 | 4/16 | Project Proposal at Midterm | | | Project Proposal Due |
10 | 4/23 | Recurrent Neural Networks/LSTM | Lec9 | HW7: RNN/LSTM for IMDB | |
11 | 4/30 | Attention and Transformers | Lec10 | HW8: GCNet/SENet for ImageNet | |
12 | 5/7 | Generative AI | Lec11 | HW9: Diffusion Model | |
13 | 5/14 | VLM and LMM | Lec12 | HW10: LLM Application | |
14 | 5/21 | 3D Vision | Lec13 | ||
15 | 5/28 | CV Application in Industry | Lec14 | ||
16 | 6/4 | Final Project Presentation | | | Final Report Due |