Computer Vision

Graduate course, Institute of Artificial Intelligence Innovation, NYCU, 2024

Course Overview and Objectives:

This course aims to give students a deep understanding of the fundamental concepts and techniques of computer vision, including image formation, feature extraction, 3D reconstruction, image segmentation, object recognition, deep learning, object detection, object tracking, and facial recognition. It also aims to equip students to implement and apply the algorithms, models, and frameworks of computer vision so that they can address problems in domains such as autonomous driving, smart homes, and medical image analysis. Through this course, students will come to understand the limitations and challenges of current computer vision applications and explore directions for future development. On completing the course, students will have acquired the following abilities:

  • Fundamental knowledge and skills in computer vision and image processing.
  • Sensitivity and analytical skills toward emerging technologies and trends.
  • The ability to conduct independent research and development, along with teamwork and project management skills.
  • Innovative thinking and problem-solving skills, with the capacity to apply learned knowledge to promote technological innovation and societal progress.

Prerequisites

  • Proficiency in Python
    • All class assignments will be in Python. If you have substantial programming experience in a different language (e.g. C++, MATLAB, or JavaScript), you will probably be fine.
  • College Calculus, Linear Algebra
    • You should be comfortable taking derivatives and working with matrix-vector operations and notation.
  • Basic Probability and Statistics
    • You should know the basics of probability, Gaussian distributions, mean, standard deviation, etc.

Grading

  • Assignments (50%): Including programming assignments, homework, etc.
  • Final Project (40%): Including midterm project proposal, midterm report, final project presentation, and the final project report. Each group is restricted to a maximum of 3 members. Students are required to select a computer vision-related topic, develop a proposal, and undertake an implementation project. The evaluation criteria encompass a thorough comprehension of the problem, innovation, practicality of the solution, completeness, and performance of the technical implementation. Additional credit will be awarded for incorporating a demo and technical implementation.
  • Class Participation (10%): 1 point is deducted for each absence.

Office Hours

  • Monday 11:00 am-12:00 pm
  • Room: Engineering Building-6 (374)

Progress

Week | Date | Topics | Slides | Homework | Extra Info
---- | ---- | ------ | ------ | -------- | ----------
1 | 2/20 | Introduction | Lec0, Lec1 | |
2 | 2/27 | Image Formation | Lec2 | HW1: Image Sensing Pipeline |
3 | 3/5 | Image Features | Lec3 | HW2: Harris Corner Detection |
4 | 3/12 | Camera Calibration | Lec4 | HW3: Camera Calibration | Group Form Due
5 | 3/19 | Stereo Vision | Lec5 | HW4: Homography Transformation |
6 | 3/26 | Image Segmentation | Lec6 | HW5: Otsu Thresholding and Gaussian Adaptive Thresholding |
7 | 4/2 | Image Classification | Lec7 | HW6: NN/CNN for ImageNet |
8 | 4/9 | Regularization and Optimization | Lec8 | |
9 | 4/16 | Project Proposal at Midterm | | | Project Proposal Due
10 | 4/23 | Recurrent Neural Networks/LSTM | Lec9 | HW7: RNN/LSTM for IMDB |
11 | 4/30 | Attention and Transformers | Lec10 | HW8: GCNet/SENet for ImageNet |
12 | 5/7 | Generative AI | Lec11 | HW9: Diffusion Model |
13 | 5/14 | VLM and LMM | Lec12 | HW10: LLM Application |
14 | 5/21 | 3D Vision | Lec13 | |
15 | 5/28 | CV Application in Industry | Lec14 | |
16 | 6/4 | Final Project Presentation | | | Final Report Due

Resource

Textbook