IMotionModel

The IMotionModel interface is responsible for providing initial pose estimates for new frames in the visual odometry pipeline. It can utilize various sources of information including flow, depth, and previous pose estimates.

Interface

from abc import ABC, abstractmethod
from typing import Generic

import pypose as pp
import torch

class IMotionModel(ABC, Generic[T_Data], ConfigTestableSubclass):
    @abstractmethod
    def predict(self, frame: T_Data, flow: torch.Tensor | None, depth: torch.Tensor | None) -> pp.LieTensor: ...

    @abstractmethod
    def update(self, pose: pp.LieTensor) -> None: ...

Methods to Implement

  • predict(frame, flow, depth) -> pp.LieTensor

    • Core method for pose prediction
    • Parameters:
      • frame: Current sensor frame
      • flow: Optional optical flow estimate
      • depth: Optional depth estimate
    • Returns the SE3 pose in world coordinates as a pypose.LieTensor
  • update(pose) -> None

    • Updates internal state with optimized pose
    • Called after pose graph optimization
    • Used to refine future predictions
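The predict/update contract can be illustrated with a minimal static model, analogous to StaticMotionModel. This is a sketch, not the MAC-VO implementation: poses are simplified to plain 7-vectors instead of pypose.LieTensor tensors, and the class name StaticSketchMotionModel is invented for illustration.

```python
import numpy as np

class StaticSketchMotionModel:
    """Illustrative motion model: always predicts the last known pose.

    Poses are simplified to 7-vectors [tx, ty, tz, qx, qy, qz, qw];
    the real interface works with pypose.LieTensor instead.
    """

    def __init__(self):
        # Start at the identity SE3: zero translation, identity quaternion.
        self.last_pose = np.array([0., 0., 0., 0., 0., 0., 1.])

    def predict(self, frame, flow=None, depth=None):
        # Static assumption: the camera has not moved since the last
        # frame, so flow and depth are ignored.
        return self.last_pose.copy()

    def update(self, pose):
        # Store the optimized pose so future predictions start from it.
        self.last_pose = np.asarray(pose, dtype=np.float64)
```

Note how update feeds the optimized pose back into the model: the next predict call starts from the refined estimate rather than the raw prediction.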

Implementations

Base Models

  • StaticMotionModel

    • Assumes static camera position
    • Returns previous frame's pose
    • No configuration required
  • GTMotionwithNoise

    • Uses ground truth motion with optional noise
    • Requires ground truth poses in input frames
    • Configuration:
      • noise_std: Standard deviation of noise (0.0 for no noise)
  • ReadPoseFile

    • Uses external pose file for motion estimation
    • Calculates relative motion from poses
    • Applies motion to current trajectory
    • Configuration:
      • pose_file: Path to pose file (supports .npy, .pt, .pth, .txt)
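The relative-motion step that ReadPoseFile performs can be sketched with plain NumPy quaternion algebra. This is an illustration, not the actual implementation: it assumes poses are [tx, ty, tz, qx, qy, qz, qw] 7-vectors, and the helper names relative_motion and apply_motion are invented for this sketch.

```python
import numpy as np

def quat_mul(q1, q2):
    # Hamilton product for [qx, qy, qz, qw] quaternions.
    x1, y1, z1, w1 = q1
    x2, y2, z2, w2 = q2
    return np.array([
        w1*x2 + x1*w2 + y1*z2 - z1*y2,
        w1*y2 - x1*z2 + y1*w2 + z1*x2,
        w1*z2 + x1*y2 - y1*x2 + z1*w2,
        w1*w2 - x1*x2 - y1*y2 - z1*z2,
    ])

def quat_rotate(q, v):
    # Rotate vector v by unit quaternion q: q * (v, 0) * q^-1.
    qv = np.array([v[0], v[1], v[2], 0.])
    q_conj = np.array([-q[0], -q[1], -q[2], q[3]])
    return quat_mul(quat_mul(q, qv), q_conj)[:3]

def relative_motion(pose_a, pose_b):
    # Relative SE3 motion T_a^-1 * T_b between consecutive file poses.
    ta, qa = pose_a[:3], pose_a[3:]
    tb, qb = pose_b[:3], pose_b[3:]
    qa_inv = np.array([-qa[0], -qa[1], -qa[2], qa[3]])
    t_rel = quat_rotate(qa_inv, tb - ta)
    q_rel = quat_mul(qa_inv, qb)
    return np.concatenate([t_rel, q_rel])

def apply_motion(pose, motion):
    # Compose the relative motion onto the current trajectory pose:
    # T_new = T_pose * T_motion.
    t, q = pose[:3], pose[3:]
    tm, qm = motion[:3], motion[3:]
    t_new = t + quat_rotate(q, tm)
    q_new = quat_mul(q, qm)
    return np.concatenate([t_new, q_new])
```

In practice a pypose-based implementation would express this more compactly as `pose_a.Inv() @ pose_b`, letting the library handle the quaternion algebra.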

Advanced Models

  • TartanMotionNet
    • Main motion model used in MAC-VO
    • Uses TartanVO's motion estimation network
    • Requires flow and depth for prediction
    • Configuration:
      • weight: Path to model weights
      • device: Target device ("cuda" or "cpu")

Usage in MAC-VO

The IMotionModel interface is used in two main contexts:

  1. Initialization

    # Initial pose estimation
    est_pose = motion_model.predict(frame0, None, depth0.depth)
  2. Frame-to-Frame Tracking

    # Update with optimized pose
    motion_model.update(optimized_pose)

    # Predict next frame's pose
    est_pose = motion_model.predict(frame1, flow, depth)

Motion Estimation Process

  1. Base models use simple strategies:

    • Static assumption
    • Ground truth with noise
    • External pose file
  2. Advanced models consider:

    • Optical flow estimation
    • Depth estimation
    • Previous pose estimates
    • Learned motion patterns
  3. All models:

    • Return poses in world coordinates
    • Can be updated with optimized poses
    • Support both stereo and stereo-inertial frames

File Format Support

The ReadPoseFile implementation supports multiple pose file formats:

  • .npy: NumPy array files
  • .pt/.pth: PyTorch tensor files
  • .txt: Text files with space-separated values

Expected format:

  • N×7 matrix, where N is the number of frames
  • Each row holds SE3 pose parameters: translation followed by a unit quaternion (tx, ty, tz, qx, qy, qz, qw)
  • Compatible with the pypose.SE3 parameterization
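Dispatching on the file extension can be sketched as follows. load_pose_file is a hypothetical helper, not the actual ReadPoseFile code; it assumes the N×7 layout described above.

```python
from pathlib import Path

import numpy as np

def load_pose_file(path):
    """Load an N x 7 pose matrix, dispatching on the file extension.

    Illustrative only: the real ReadPoseFile implementation may differ.
    """
    suffix = Path(path).suffix
    if suffix == ".npy":
        poses = np.load(path)
    elif suffix in (".pt", ".pth"):
        import torch  # imported lazily so NumPy-only files still load
        poses = torch.load(path).numpy()
    elif suffix == ".txt":
        poses = np.loadtxt(path)
    else:
        raise ValueError(f"Unsupported pose file format: {suffix}")

    # Normalize to a 2D float array and validate the expected shape.
    poses = np.atleast_2d(np.asarray(poses, dtype=np.float64))
    if poses.shape[1] != 7:
        raise ValueError(f"Expected an N x 7 pose matrix, got {poses.shape}")
    return poses
```

The shape check at the end catches a common failure mode early: a pose file stored in a different convention (e.g. 4×4 homogeneous matrices) would otherwise silently produce nonsense trajectories.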