IMotionModel

The IMotionModel interface is responsible for providing initial pose estimates for new frames in the visual odometry pipeline. It can utilize various sources of information including flow, depth, and previous pose estimates.

Interface

from abc import ABC, abstractmethod
from typing import Generic

import pypose as pp
import torch

class IMotionModel(ABC, Generic[T_Data], ConfigTestableSubclass):
    @abstractmethod
    def predict(self, frame: T_Data, flow: torch.Tensor | None, depth: torch.Tensor | None) -> pp.LieTensor: ...

    @abstractmethod
    def update(self, pose: pp.LieTensor) -> None: ...

Methods to Implement

  • predict(frame, flow, depth) -> pp.LieTensor

    • Core method for pose prediction
    • Parameters:
      • frame: Current sensor frame
      • flow: Optional optical flow estimate
      • depth: Optional depth estimate
    • Returns the SE3 pose in world coordinates as a pypose.LieTensor
  • update(pose) -> None

    • Updates internal state with optimized pose
    • Called after pose graph optimization
    • Used to refine future predictions
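The predict/update contract can be illustrated with a minimal static model, analogous to StaticMotionModel. This is a sketch, not the MAC-VO implementation: poses are simplified to plain 7-vectors instead of pypose.LieTensor tensors, and the class name StaticSketchMotionModel is invented for illustration.

```python
import numpy as np

class StaticSketchMotionModel:
    """Illustrative motion model: always predicts the last known pose.

    Poses are simplified to 7-vectors [tx, ty, tz, qx, qy, qz, qw];
    the real interface works with pypose.LieTensor instead.
    """

    def __init__(self):
        # Start at the identity SE3: zero translation, identity quaternion.
        self.last_pose = np.array([0., 0., 0., 0., 0., 0., 1.])

    def predict(self, frame, flow=None, depth=None):
        # Static assumption: the camera has not moved since the last
        # frame, so flow and depth are ignored.
        return self.last_pose.copy()

    def update(self, pose):
        # Store the optimized pose so future predictions start from it.
        self.last_pose = np.asarray(pose, dtype=np.float64)
```

Note how update feeds the optimized pose back into the model: the next predict call starts from the refined estimate rather than the raw prediction.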

Implementations

Base Models

  • StaticMotionModel

    • Assumes static camera position
    • Returns previous frame's pose
    • No configuration required
  • GTMotionwithNoise

    • Uses ground truth motion with optional noise
    • Requires ground truth poses in input frames
    • Configuration:
      • noise_std: Standard deviation of noise (0.0 for no noise)
  • ReadPoseFile

    • Uses external pose file for motion estimation
    • Calculates relative motion from poses
    • Applies motion to current trajectory
    • Configuration:
      • pose_file: Path to pose file (supports .npy, .pt, .pth, .txt)
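The relative-motion step that ReadPoseFile performs can be sketched with plain NumPy quaternion algebra. This is an illustration, not the actual implementation: it assumes poses are [tx, ty, tz, qx, qy, qz, qw] 7-vectors, and the helper names relative_motion and apply_motion are invented for this sketch.

```python
import numpy as np

def quat_mul(q1, q2):
    # Hamilton product for [qx, qy, qz, qw] quaternions.
    x1, y1, z1, w1 = q1
    x2, y2, z2, w2 = q2
    return np.array([
        w1*x2 + x1*w2 + y1*z2 - z1*y2,
        w1*y2 - x1*z2 + y1*w2 + z1*x2,
        w1*z2 + x1*y2 - y1*x2 + z1*w2,
        w1*w2 - x1*x2 - y1*y2 - z1*z2,
    ])

def quat_rotate(q, v):
    # Rotate vector v by unit quaternion q: q * (v, 0) * q^-1.
    qv = np.array([v[0], v[1], v[2], 0.])
    q_conj = np.array([-q[0], -q[1], -q[2], q[3]])
    return quat_mul(quat_mul(q, qv), q_conj)[:3]

def relative_motion(pose_a, pose_b):
    # Relative SE3 motion T_a^-1 * T_b between consecutive file poses.
    ta, qa = pose_a[:3], pose_a[3:]
    tb, qb = pose_b[:3], pose_b[3:]
    qa_inv = np.array([-qa[0], -qa[1], -qa[2], qa[3]])
    t_rel = quat_rotate(qa_inv, tb - ta)
    q_rel = quat_mul(qa_inv, qb)
    return np.concatenate([t_rel, q_rel])

def apply_motion(pose, motion):
    # Compose the relative motion onto the current trajectory pose:
    # T_new = T_pose * T_motion.
    t, q = pose[:3], pose[3:]
    tm, qm = motion[:3], motion[3:]
    t_new = t + quat_rotate(q, tm)
    q_new = quat_mul(q, qm)
    return np.concatenate([t_new, q_new])
```

In practice a pypose-based implementation would express this more compactly as `pose_a.Inv() @ pose_b`, letting the library handle the quaternion algebra.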

Advanced Models

  • TartanMotionNet
    • Main motion model used in MAC-VO
    • Uses TartanVO's motion estimation network
    • Requires flow and depth for prediction
    • Configuration:
      • weight: Path to model weights
      • device: Target device ("cuda" or "cpu")

Usage in MAC-VO

The IMotionModel interface is used in two main contexts:

  1. Initialization

    # Initial pose estimation
    est_pose = motion_model.predict(frame0, None, depth0.depth)
  2. Frame-to-Frame Tracking

    # Update with optimized pose
    motion_model.update(optimized_pose)

    # Predict next frame's pose
    est_pose = motion_model.predict(frame1, flow, depth)

Motion Estimation Process

  1. Base models use simple strategies:

    • Static assumption
    • Ground truth with noise
    • External pose file
  2. Advanced models consider:

    • Optical flow estimation
    • Depth estimation
    • Previous pose estimates
    • Learned motion patterns
  3. All models:

    • Return poses in world coordinates
    • Can be updated with optimized poses
    • Support both stereo and stereo-inertial frames

File Format Support

The ReadPoseFile implementation supports multiple pose file formats:

  • .npy: NumPy array files
  • .pt/.pth: PyTorch tensor files
  • .txt: Text files with space-separated values

Expected format:

  • N×7 matrix, where N is the number of frames
  • Each row holds SE3 pose parameters: translation followed by a unit quaternion (tx, ty, tz, qx, qy, qz, qw)
  • Compatible with the pypose.SE3 parameterization
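Dispatching on the file extension can be sketched as follows. load_pose_file is a hypothetical helper, not the actual ReadPoseFile code; it assumes the N×7 layout described above.

```python
from pathlib import Path

import numpy as np

def load_pose_file(path):
    """Load an N x 7 pose matrix, dispatching on the file extension.

    Illustrative only: the real ReadPoseFile implementation may differ.
    """
    suffix = Path(path).suffix
    if suffix == ".npy":
        poses = np.load(path)
    elif suffix in (".pt", ".pth"):
        import torch  # imported lazily so NumPy-only files still load
        poses = torch.load(path).numpy()
    elif suffix == ".txt":
        poses = np.loadtxt(path)
    else:
        raise ValueError(f"Unsupported pose file format: {suffix}")

    # Normalize to a 2D float array and validate the expected shape.
    poses = np.atleast_2d(np.asarray(poses, dtype=np.float64))
    if poses.shape[1] != 7:
        raise ValueError(f"Expected an N x 7 pose matrix, got {poses.shape}")
    return poses
```

The shape check at the end catches a common failure mode early: a pose file stored in a different convention (e.g. 4×4 homogeneous matrices) would otherwise silently produce nonsense trajectories.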