Practical Course: Vision-based Navigation IN2106 (6h SWS / 10 ECTS)
SS 2024, TU München
Lecturers: Sergei Solonets, Daniil Synitsin
Please direct questions to visnav-ss24@vision.in.tum.de
News
No session on 08.05.
The pre-meeting will be held on Zoom on 07.02 at 14:00: https://tum-conf.zoom-x.de/j/66945147179?pwd=UE5teFZOSEJKZXBTb3VYWlRzZVdqZz09
Pre-meeting slides (SS24) have been uploaded.
Pre-meeting slides with potential projects (SS24) have been uploaded.
The course will take place in person.
Time & Date
- Lecture & exercises (assignment phase): Wednesdays 2pm to 4pm. Tutor sessions 4pm to 6pm.
- Individual weekly meetings (project phase): Fixed 30min time slot for each project/group, preferably Wednesdays between 2pm and 6pm. Arranged after the assignment phase.
- Project presentations: TBA
- Project report due: TBA
- Computer labs may be available for exercise and project work.
Prerequisites
To participate in the course you need to fulfill the following requirements:
- Good knowledge of the C/C++ language is essential
- Good knowledge of basic mathematics such as linear algebra, calculus, and numerics is required
- Participation in at least one of the following lectures of the TUM Computer Vision Group:
  - Computer Vision I: Variational Methods
  - Computer Vision II: Multiple View Geometry
- Similar lectures may also be accepted; please contact us.
Pre-meeting and Registration
Places are assigned through the TUM matching system. Please see http://docmatching.in.tum.de/ for the general procedure and for important dates (the matching registration deadline is 14.02.2024).
TUMOnline course entry: Vision-based Navigation (IN2106)
A pre-meeting with more information about the course content and procedure will be held online on 07.02.2024 at 2pm. Attendance at the pre-meeting is not required for participation in the course, but registration through the matching system is.
You are required to send information about your prior experience before the matching deadline (14.02.2024) so that we can verify the prerequisites. Please consult the pre-meeting slides for instructions on what information to send.
Course Description
Vision-based localization, mapping, and navigation have recently seen tremendous progress in computer vision and robotics research. Such methods already have a strong impact on applications in fields such as robotics and augmented reality.
In this course, students will develop and implement algorithms for visual navigation and 3D reconstruction, relevant for applications such as autonomous navigation of wheeled robots and quadrocopters, tracking of handheld devices, or 3D reconstruction. The investigated algorithms may include visual odometry, structure from motion, simultaneous localization and mapping with monocular, stereo, or RGB-D cameras, and (semi-)dense 3D reconstruction.
Number of participants: max. 12
Course Layout
- Lecture & Exercise: up to 2 hours of lecture per week, followed by a 2-hour tutored Q&A and exercise session, Wednesdays from 4pm to 6pm. There are 5 lecture and exercise sessions. Each week, the exercise for the following week is announced; each student has to hand it in online individually within 2 weeks. Attendance at lectures and tutor sessions is voluntary but highly encouraged.
- Project: After the initial 5 weeks, students form groups of 1-2. Each group will be assigned a project. Students work independently and consult the tutors in a weekly meeting to discuss project progress and next steps. Attendance at the meetings with tutors is mandatory.
- Presentation and demo: Each group will be assigned a time slot on one of the last days of the semester to present their results, followed by a Q&A session. The presentation should be 12 minutes long, plus 3 minutes for questions. It should comprise 5-10 slides that explain the project goals and results to fellow students, and may include a short live demo or video.
- Project Report: Each group writes a report on their project work (10-12 pages, single column, single-spaced lines, 11pt font size; the title page, table of contents, and references do not count toward the page limit). The report should summarize the project goals, what was implemented, and what results were obtained.
Grading
The final grade will be determined by both the programming assignments and the project.
For grading the programming assignments, we consider completeness and timely submission (code quality matters less, as long as the code works). Note that you have to complete all exercise sheets to pass the course, even if you miss a submission deadline.
In the project phase, the main focus is on your implementation, presentation, and report, but we will also consider how you approach the problem, how you engage with your tutors, and how you manage your time.
Tutor Sessions
We will start with general announcements, then hold a joint Q&A session on the latest lecture as well as the current and the previous exercise sheet. How useful the session is for everyone depends on you asking questions.
Afterward, you are encouraged to work on the exercises. The call will remain open and you can talk to us about any issues that come up. While you can also work on the exercises in your own time, we encourage you to make use of the tutor sessions as much as possible, as most questions that might otherwise hold you up can usually be resolved quickly during the tutor session.
In case you cannot attend, you may also send us questions (on the latest lecture as well as the current and the previous exercise sheet) to visnav-ss24@vision.in.tum.de and we will try to address them in the Q&A session. Please send us the questions at least one day in advance, if possible.
Projects
After the lecture and assignment phase is completed, students work in groups of 1-2 people on a more open-ended project. We will present some example projects, but you may also suggest your own.
Literature
A good introduction to many aspects of computer vision relevant for the practical project is the following course, which has recordings on YouTube:
- Computer Vision II: Multiple View Geometry, https://vision.in.tum.de/teaching/online/mvg
The following book also covers many aspects. You should focus on Parts II and III, and on selected background from Part I as needed:
- Timothy D. Barfoot, "State Estimation for Robotics", July 2017, Cambridge University Press
  - Free PDF available: http://asrl.utias.utoronto.ca/~tdb/bib/barfoot_ser17.pdf
Less relevant, but still helpful:
- Autonomous Navigation for Flying Robots (Online lectures and EdX course), https://vision.in.tum.de/teaching/online/visnavfly
- Computer Vision I: Variational Methods, https://vision.in.tum.de/teaching/online/cvvm
Selected publications:
- Edward Rosten et al., "Faster and better: a machine learning approach to corner detection" (https://arxiv.org/pdf/0810.2434.pdf)
- Michael Calonder et al., "BRIEF: Binary Robust Independent Elementary Features" (https://infoscience.epfl.ch/record/149242/files/top_1.pdf)
- Ethan Rublee et al., "ORB: an efficient alternative to SIFT or SURF" (http://www.willowgarage.com/sites/default/files/orb_final.pdf)
- Raúl Mur-Artal et al., "ORB-SLAM: A Versatile and Accurate Monocular SLAM System" (http://webdiis.unizar.es/~raulmur/MurMontielTardosTRO15.pdf)
- Ethan Eade, "Lie Groups for 2D and 3D Transformations" (http://ethaneade.com/lie.pdf); compare also Chapter 7 in the Barfoot book mentioned above