Practical Course: Applied Foundation Models in Computer Vision (10 ECTS)
News
The preliminary meeting will take place on Feb 7th, 2024 at 2pm via the following link:
https://tum-conf.zoom-x.de/j/63369197429?pwd=aVFseGlBRkd5Sloxc0YyUWlOSlp5dz09
The course itself will be conducted in person.
Organizational
Organizers:
Email: afm24@cvai.cit.tum.de
Slides for preliminary meeting: Will be uploaded soon.
Number of participants: 12-18
Compute: We aim to provide every team with access to 12GB Titan GPUs on our internal SLURM cluster. This should be enough, since the course does not require extensive retraining of models.
Application
Send an email with your current MSc transcript to afm24@cvai.cit.tum.de by Feb 14th.
The subject of the email has to be: [AFM] Your Name
Course Description
In recent years, foundation models, i.e., models that are trained on broad datasets and can be used for different applications, have transformed computer vision and natural language processing. In this practical course, we will first get an overview of different foundation models via student presentations and then explore the applications of such models.
We envision the practical part of this course as similar to a hackathon, where the goal is to build an interesting application using a given foundation model. Example projects for this course include the following.
- Diffusion Models: Adapt large pre-trained diffusion models like Stable Diffusion to a specific use case using ControlNets.
- Depth Prediction: Build a simple VR application using general-purpose depth prediction networks and a camera tracker.
- etc.
Course Layout
Students will work in teams of three on a given topic, which will be assigned through preference-based matching at the start of the semester.
There will be three block sessions in which attendance is mandatory.
- Kick-Off Session (Date: TBD): Instructors will present the different topics and explain the course organization.
- First Presentation Session (Date: TBD): Student teams will present the theoretical ideas behind the foundation model they are working on. For this, students will have to review recent publications.
- Second Presentation Session (Date: TBD): Student teams will present their practical work at the end of the semester.
Besides these mandatory meetings, teams will arrange regular supervision sessions with their respective supervisor (rough guideline: a 30-minute meeting every two weeks). In addition, supervisors will be available via chat.
Grading
The final grade will be determined through a weighted average of both presentations, the project, and the final report.
Prerequisites
- Introduction to Deep Learning or Machine Learning or Computer Vision 3
- Practical experience with: PyTorch, Hugging Face