MAC-VO: Metrics-Aware Covariance for Learning-based Stereo Visual Odometry
Carnegie Mellon University
* Equal Contribution.
Abstract
We propose MAC-VO, a novel learning-based stereo VO that leverages learned metrics-aware matching uncertainty for two purposes: selecting keypoints and weighting the residuals in pose graph optimization. Compared to traditional geometric methods that prioritize texture-rich features such as edges, our keypoint selector employs the learned uncertainty to filter out low-quality features based on global inconsistency. In contrast to learning-based algorithms that rely on a scale-agnostic weight matrix, we design a metrics-aware spatial covariance model to capture the spatial information during keypoint registration. Integrating this covariance model into pose graph optimization enhances the robustness and reliability of pose estimation, particularly in challenging environments with varying illumination, feature density, and motion patterns. On public benchmark datasets, MAC-VO outperforms existing VO algorithms, and even some SLAM algorithms, in challenging environments. The covariance-aware framework also provides valuable information about the reliability of the estimated poses, which can benefit decision-making for autonomous systems.
MAC-VO Dense Mapping Mode
MAC-VO supports a "dense mapping" mode. Leveraging the uncertainty prediction network, we can conveniently select reliable depth estimates for dense mapping without bundle adjustment or multi-frame optimization. The following videos show dense mapping results on EuRoC, VBR, TartanAir, and TartanAir v2. No post-processing is applied.
Zed X Fire Academy 2
VBR Diag Train 0
VBR Spagna Test 0 (Dynamic Scene)
VBR Spagna Test 0 (2) (Dynamic Scene)
VBR Colosseo Train 0 (Extreme Exposure)
TartanAir v2 - Abandon School 1
TartanAir - Abandon Factory 0
EuRoC V102
TartanAir v2 Test (Easy Subset) 3
Zed X Fire Academy 1
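The idea behind the dense mapping mode, keeping only depth estimates whose predicted uncertainty is low, can be illustrated with a minimal sketch. This is not the MAC-VO implementation: the function name `select_reliable_points`, the relative threshold `max_rel_std`, and the pinhole intrinsics `fx`, `fy`, `cx`, `cy` are all hypothetical and chosen only for illustration. It assumes a per-pixel depth map (in meters) and a per-pixel depth variance produced by some uncertainty-aware network.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]


def select_reliable_points(
    depth: Sequence[Sequence[float]],
    depth_var: Sequence[Sequence[float]],
    fx: float,
    fy: float,
    cx: float,
    cy: float,
    max_rel_std: float = 0.05,  # hypothetical threshold: keep sigma <= 5% of depth
) -> List[Point]:
    """Back-project pixels whose depth standard deviation is small relative to depth.

    Assumes a pinhole camera model; rows index v (image y), columns index u (image x).
    """
    points: List[Point] = []
    for v, (d_row, var_row) in enumerate(zip(depth, depth_var)):
        for u, (d, var) in enumerate(zip(d_row, var_row)):
            # Skip invalid measurements (non-positive depth or negative variance).
            if d <= 0.0 or var < 0.0:
                continue
            # Reject pixels whose predicted uncertainty is too large for their depth.
            if math.sqrt(var) > max_rel_std * d:
                continue
            # Pinhole back-projection of pixel (u, v) at depth d.
            points.append(((u - cx) * d / fx, (v - cy) * d / fy, d))
    return points
```

A relative threshold is used here because depth noise from stereo typically grows with distance. An absolute threshold would be an equally reasonable assumption for short-range scenes.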
Intro Video
Methods
System Pipeline

Metrics-Aware Spatial Covariance
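As a rough illustration of what a metric-scale 3D covariance for a keypoint can look like, the sketch below applies standard first-order uncertainty propagation through pinhole back-projection. It is a generic textbook construction, not necessarily the covariance model used by MAC-VO. The function name `backproject_covariance`, the intrinsics, and the assumption that pixel and depth errors are independent are all introduced here for illustration only.

```python
from typing import List

Matrix = List[List[float]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain-Python matrix product for small matrices."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def transpose(m: Matrix) -> Matrix:
    return [list(row) for row in zip(*m)]


def backproject_covariance(
    u: float, v: float, depth: float,
    var_u: float, var_v: float, var_depth: float,
    fx: float, fy: float, cx: float, cy: float,
) -> Matrix:
    """First-order 3D covariance of the point back-projected from (u, v, depth).

    The point is p = [(u - cx) * depth / fx, (v - cy) * depth / fy, depth].
    Assumes (hypothetically) independent errors in u, v, and depth.
    """
    # Jacobian of p with respect to (u, v, depth).
    jac = [
        [depth / fx, 0.0, (u - cx) / fx],
        [0.0, depth / fy, (v - cy) / fy],
        [0.0, 0.0, 1.0],
    ]
    # Diagonal measurement covariance in (pixel, pixel, meter) units.
    meas = [
        [var_u, 0.0, 0.0],
        [0.0, var_v, 0.0],
        [0.0, 0.0, var_depth],
    ]
    # Sigma_p = J * Sigma_meas * J^T, expressed in metric units.
    return matmul(matmul(jac, meas), transpose(jac))
```

Because the Jacobian scales pixel uncertainty by `depth / fx`, the resulting covariance is expressed in meters and grows with distance, in contrast to a scale-agnostic weight. Off-center pixels also pick up off-diagonal terms that couple lateral and depth error. The inverse of such a matrix could serve as the weight of a 3D residual in an optimizer.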
