Computer Vision Group
TUM School of Computation, Information and Technology
Technical University of Munich

Informatik IX
Computer Vision Group

Boltzmannstrasse 3
85748 Garching info@vision.in.tum.de


News

03.07.2024

We have seven papers accepted to ECCV 2024. Check our publication page for more details.

09.06.2024
GCPR / VMV 2024

We are organizing GCPR / VMV 2024 this fall.

04.03.2024

We have twelve papers accepted to CVPR 2024. Check our publication page for more details.

18.07.2023

We have four papers accepted to ICCV 2023. Check out our publication page for more details.

02.03.2023

CVPR 2023

We have six papers accepted to CVPR 2023. Check out our publication page for more details.

More


===== Deep Virtual Stereo Odometry: Leveraging Deep Depth Prediction for Monocular Direct Sparse Odometry =====
  
**Contact:** [[members:yangn|Nan Yang]], [[members:wangr|Rui Wang]], [[https://www.is.mpg.de/person/jstueckler|Jörg Stückler]], [[members:cremers|Prof. Daniel Cremers]]
  
<html><center><iframe width="640" height="360"
  
==== Abstract ====
Monocular visual odometry approaches that purely rely on geometric cues are prone to scale drift and require sufficient motion parallax in successive frames for motion estimation and 3D reconstruction. In this paper, we propose to leverage deep monocular depth prediction to overcome limitations of geometry-based monocular visual odometry. To this end, we incorporate deep depth predictions into [[:research:vslam:dso|DSO]] as direct virtual stereo measurements. For depth prediction, we design a novel deep network that refines predicted depth from a single image in a two-stage process. We train our network in a semi-supervised way on photoconsistency in stereo images and on consistency with accurate sparse depth reconstructions from [[:research:vslam:stereo-dso|Stereo DSO]]. Our depth predictions outperform state-of-the-art approaches for monocular depth on the KITTI benchmark. Moreover, our Deep Virtual Stereo Odometry clearly exceeds previous monocular and deep-learning-based methods in accuracy. It even achieves performance comparable to state-of-the-art stereo methods, while relying only on a single camera.
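To make the virtual stereo idea concrete: the depth predicted for a pixel can be converted into a disparity for a hypothetical stereo partner at a fixed baseline, and the photometric difference between the pixel and its virtual correspondence serves as a direct measurement. The sketch below is purely illustrative, not the DVSO implementation; the function names are assumptions, and it uses nearest-neighbour lookup where a real system would interpolate sub-pixel intensities.

```python
import numpy as np

def virtual_disparity(depth, focal, baseline):
    """Convert predicted depth z into virtual stereo disparity d = f * b / z."""
    return focal * baseline / np.maximum(depth, 1e-6)

def photometric_residual(left, right, u, v, disp):
    """Photometric error between a left-image pixel and its virtual right match.
    Nearest-neighbour lookup; real systems interpolate sub-pixel values."""
    u_r = int(round(u - disp))
    if u_r < 0 or u_r >= right.shape[1]:
        return None  # virtual correspondence falls outside the image
    return float(left[v, u]) - float(right[v, u_r])

# toy example: a 1x8 "image" pair shifted by exactly 2 pixels
left = np.array([[0, 0, 10, 20, 30, 0, 0, 0]], dtype=np.float32)
right = np.array([[10, 20, 30, 0, 0, 0, 0, 0]], dtype=np.float32)
disp = virtual_disparity(depth=25.0, focal=100.0, baseline=0.5)  # 100 * 0.5 / 25 = 2.0
r = photometric_residual(left, right, u=2, v=0, disp=disp)       # 10 - 10 = 0.0
```

In the actual system such residuals enter the direct sparse bundle adjustment alongside the temporal photometric terms, anchoring the metric scale.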
  
{{:research:vslam:dvso:teaser_pic.png?640|}}
  
{{:research:vslam:dvso:system_overview_long_new.png?640|}}

==== Results ====
We quantitatively evaluate our StackNet against other state-of-the-art monocular depth prediction methods on the publicly available KITTI dataset. For DVSO, we evaluate its tracking accuracy on the KITTI odometry benchmark against other state-of-the-art monocular as well as stereo visual odometry systems. In the [[:research:vslam:dvso|supplementary material]], we also show the generalization ability of StackNet and DVSO.

=== Monocular Depth Estimation ===

{{:research:vslam:dvso:depth_table.png?640|}}

{{:research:vslam:dvso:depth_comparison.png?640|}}
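The depth table above reports the metrics commonly used on the KITTI Eigen split. As a hedged sketch, these metrics are typically computed as follows (standard formulas from the monocular depth literature, not code released with this work):

```python
import numpy as np

def depth_metrics(gt, pred):
    """Standard monocular-depth metrics as used on KITTI (Eigen split)."""
    gt, pred = np.asarray(gt, float), np.asarray(pred, float)
    thresh = np.maximum(gt / pred, pred / gt)  # per-pixel ratio for accuracy thresholds
    return {
        "abs_rel":  np.mean(np.abs(gt - pred) / gt),            # absolute relative error
        "sq_rel":   np.mean((gt - pred) ** 2 / gt),             # squared relative error
        "rmse":     np.sqrt(np.mean((gt - pred) ** 2)),         # root mean squared error
        "rmse_log": np.sqrt(np.mean((np.log(gt) - np.log(pred)) ** 2)),
        "a1":       np.mean(thresh < 1.25),                     # accuracy under 1.25
    }

# toy ground-truth / prediction pair (depths in metres)
m = depth_metrics([10.0, 20.0, 40.0], [10.0, 22.0, 38.0])
# abs_rel = mean(0, 0.1, 0.05) = 0.05; all ratios < 1.25, so a1 = 1.0
```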

=== Monocular Visual Odometry ===

{{:research:vslam:dvso:vo_table.png?640|}}

{{:research:vslam:dvso:vo_error.png?640|}}

{{:research:vslam:dvso:traj_01.png?640|}}
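The odometry plots above use KITTI's relative error metrics, which average translational and rotational errors over sub-sequences of 100 m to 800 m. As a simplified illustration only (assumed function name, no trajectory alignment, not the official KITTI evaluation), a basic per-pose translational RMSE can be computed like this:

```python
import numpy as np

def translational_rmse(gt_xyz, est_xyz):
    """RMSE of per-pose position error between two trajectories that are
    already expressed in the same world frame (no alignment performed)."""
    gt = np.asarray(gt_xyz, dtype=float)
    est = np.asarray(est_xyz, dtype=float)
    return float(np.sqrt(np.mean(np.sum((gt - est) ** 2, axis=1))))

# toy trajectories: the estimate drifts 0.1 m per frame along z
gt  = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [2.0, 0.0, 0.0]]
est = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.1], [2.0, 0.0, 0.2]]
err = translational_rmse(gt, est)  # sqrt((0 + 0.01 + 0.04) / 3)
```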

==== Download ====
Trajectories of DVSO on KITTI 00-10:
Depth estimations of StackNet on the test set of the Eigen split:

==== Publications ====
<bibtex>
<keywords>dvso|directsparseodometry|stereodso</keywords>
</bibtex>
  
