IKeypointSelector

The IKeypointSelector interface selects keypoints from stereo frames for tracking and mapping. Implementations can use depth, flow, and uncertainty estimates to decide which points to keep.

Interface

class IKeypointSelector(ABC, ConfigTestableSubclass):
    @abstractmethod
    def select_point(
        self,
        frame: StereoData,
        numPoint: int,
        depth0_est: IStereoDepth.Output,
        depth1_est: IStereoDepth.Output,
        match_est: IMatcher.Output | None,
    ) -> torch.Tensor: ...

Output Format

  • Returns a torch.Tensor of shape (N, 2) containing selected keypoint coordinates
  • Coordinates are in (u, v) format where:
    • u: horizontal coordinate (x)
    • v: vertical coordinate (y)
  • To access image values at keypoint locations: image[kp[..., 1], kp[..., 0]]
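The (u, v) convention is easy to get backwards, so a small worked example may help. This is a minimal sketch on a made-up 4x6 tensor; the image and keypoint values are illustrative only and do not come from the codebase.

```python
import torch

# Hypothetical 4x6 image (H=4 rows, W=6 columns); each pixel stores its flat index.
H, W = 4, 6
image = torch.arange(H * W).reshape(H, W)

# Two keypoints in (u, v) order: u is the column (x), v is the row (y).
kp = torch.tensor([
    [5, 0],  # u=5, v=0: top-right corner
    [1, 3],  # u=1, v=3: bottom row, second column
])

# PyTorch indexes as image[row, col], so v (index 1) goes first and u (index 0) second.
values = image[kp[..., 1], kp[..., 0]]
print(values)  # tensor([ 5, 19])
```

Swapping the two index arguments would read image[5, 0], which is out of bounds for this 4-row image, so the mistake often shows up as an IndexError on non-square inputs.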

Methods to Implement

  • select_point(...) -> torch.Tensor
    • Core method for keypoint selection
    • Parameters:
      • frame: Current stereo frame
      • numPoint: Target number of keypoints (may not be strictly followed)
      • depth0_est: Depth estimation for frame 0
      • depth1_est: Depth estimation for frame 1
      • match_est: Optional flow estimation between frames
    • Returns tensor of selected keypoint coordinates

Implementations

Base Selectors

  • RandomSelector

    • Uniformly random selection within valid image region
    • Configuration:
      • mask_width: Border width to exclude
      • device: Target device ("cuda" or "cpu")
  • GradientSelector

    • Selects points with high image gradient
    • Uses Sobel filter for gradient computation
    • Configuration:
      • mask_width: Border width to exclude
      • grad_std: Gradient threshold multiplier
  • SparseGradientSelector

    • Similar to GradientSelector but ensures spatial distribution
    • Applies non-maximum suppression (NMS) to enforce sparsity
    • Configuration:
      • mask_width: Border width to exclude
      • grad_std: Gradient threshold multiplier
      • nms_size: Size of NMS kernel (must be odd)
  • GridSelector

    • Deterministic uniform grid-based selection
    • Used for benchmarking and reproducible results
    • Configuration:
      • mask_width: Border width to exclude
      • device: Target device ("cuda" or "cpu")
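To make the gradient-based strategy concrete, the following is a rough sketch of Sobel filtering followed by NMS. It is not the MAC-VO implementation: the function name `sparse_gradient_select` is hypothetical, and interpreting grad_std as "keep pixels whose gradient magnitude exceeds mean + grad_std * std" is an assumption made for illustration.

```python
import torch
import torch.nn.functional as F


def sparse_gradient_select(gray, num_point, mask_width=2, grad_std=1.0, nms_size=3):
    """gray: (H, W) float tensor. Returns an (N, 2) long tensor of (u, v) coordinates."""
    assert nms_size % 2 == 1, "NMS kernel size must be odd"
    H, W = gray.shape

    # Sobel kernels for horizontal and vertical gradients, stacked as two output channels.
    sobel_x = torch.tensor([[-1.0, 0.0, 1.0], [-2.0, 0.0, 2.0], [-1.0, 0.0, 1.0]]).to(gray)
    kernel = torch.stack([sobel_x, sobel_x.t()])[:, None]       # (2, 1, 3, 3)
    grad = F.conv2d(gray[None, None], kernel, padding=1)        # (1, 2, H, W)
    mag = grad.pow(2).sum(dim=1, keepdim=True).sqrt()           # (1, 1, H, W)

    # Assumed threshold rule: mean + grad_std * std of the gradient magnitude.
    thresh = mag.mean() + grad_std * mag.std()

    # NMS: a pixel survives only if it equals the maximum of its neighbourhood.
    local_max = F.max_pool2d(mag, nms_size, stride=1, padding=nms_size // 2)
    keep = ((mag == local_max) & (mag > thresh))[0, 0]          # (H, W)

    # Exclude a border of mask_width pixels on every side.
    border = torch.zeros_like(keep)
    border[mask_width:H - mask_width, mask_width:W - mask_width] = True
    keep = keep & border

    vu = torch.nonzero(keep)                                    # (N, 2) as (row, col) = (v, u)
    if vu.shape[0] > num_point:
        vu = vu[torch.randperm(vu.shape[0])[:num_point]]        # random subsample to the budget
    return vu.flip(-1)                                          # reorder to (u, v)


# Usage: a vertical step edge between columns 7 and 8 of a 16x16 image.
gray = torch.zeros(16, 16)
gray[:, 8:] = 1.0
kp = sparse_gradient_select(gray, num_point=10)
print(kp.shape)  # torch.Size([10, 2])
```

Pixels with tied magnitudes all pass this NMS test, so on perfectly flat edges the output is less sparse than on real images, where exact ties are rare.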

Advanced Selectors

  • CovAwareSelector

    • Main keypoint selector used in MAC-VO
    • Selects points based on depth, depth uncertainty, and flow uncertainty
    • Implements selection strategy from MAC-VO paper Section III.B
    • Configuration:
      • device: Target device ("cuda" or "cpu")
      • mask_width: Border width to exclude
      • max_depth: Maximum valid depth ("auto" or positive float)
      • kernel_size: NMS kernel size (must be odd)
      • max_depth_cov: Maximum depth uncertainty
      • max_match_cov: Maximum flow uncertainty
  • CovAwareSelector_NoDepth

    • Modified version of CovAwareSelector without depth constraints
    • Uses only flow uncertainty for selection
    • Falls back to GridSelector if no flow uncertainty available
    • Configuration:
      • device: Target device ("cuda" or "cpu")
      • mask_width: Border width to exclude
      • kernel_size: NMS kernel size (must be odd)
      • max_match_cov: Maximum flow uncertainty
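The configuration options above suggest a filter-then-suppress pipeline. The sketch below illustrates that general idea only; it does not reproduce the strategy from the MAC-VO paper. The function name `cov_aware_select`, the use of single-channel (H, W) uncertainty maps, and the score "negative sum of depth and flow uncertainty" are all assumptions, and the "auto" option of max_depth is not modelled.

```python
import torch
import torch.nn.functional as F


def cov_aware_select(depth, depth_cov, match_cov, num_point, mask_width=2,
                     max_depth=50.0, kernel_size=3,
                     max_depth_cov=1.0, max_match_cov=1.0):
    """All inputs are (H, W) float tensors. Returns (N, 2) long tensor of (u, v)."""
    assert kernel_size % 2 == 1, "NMS kernel size must be odd"
    H, W = depth.shape

    # Start from the border mask, then apply the depth and uncertainty thresholds.
    valid = torch.zeros(H, W, dtype=torch.bool, device=depth.device)
    valid[mask_width:H - mask_width, mask_width:W - mask_width] = True
    valid &= (depth > 0) & (depth < max_depth)
    valid &= depth_cov < max_depth_cov
    valid &= match_cov < max_match_cov

    # Assumed quality score: lower combined uncertainty is better.
    score = -(depth_cov + match_cov)
    score = score.masked_fill(~valid, float("-inf"))

    # NMS: keep valid pixels that are the best in their neighbourhood.
    local_best = F.max_pool2d(score[None, None], kernel_size, stride=1,
                              padding=kernel_size // 2)[0, 0]
    keep = valid & (score == local_best)

    vu = torch.nonzero(keep)                       # (N, 2) as (v, u), row-major order
    if vu.shape[0] > num_point:
        # score[keep] follows the same row-major order as torch.nonzero(keep).
        vu = vu[torch.topk(score[keep], num_point).indices]
    return vu.flip(-1)                             # reorder to (u, v)


# Usage on synthetic 8x8 maps with three deliberately bad pixels.
depth = torch.full((8, 8), 10.0)
depth_cov = torch.full((8, 8), 0.1)
match_cov = torch.full((8, 8), 0.5)
depth[2, 2] = 100.0      # beyond max_depth
depth_cov[3, 5] = 5.0    # depth too uncertain
match_cov[5, 5] = 5.0    # flow too uncertain
kp = cov_aware_select(depth, depth_cov, match_cov, num_point=100)
print(kp.shape)  # torch.Size([13, 2])
```

Here the 4x4 interior left by mask_width=2 holds 16 candidates, three of which fail a threshold, so 13 keypoints come back even though 100 were requested.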

Meta Selectors

  • SelectorCompose
    • Combines multiple selectors with weighted distribution
    • Configuration:
      • selector_args: List of selector configurations
      • weight: List of weights for each selector
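One plausible reading of "weighted distribution" is that each selector receives a share of the keypoint budget proportional to its weight. The sketch below assumes exactly that; the real SelectorCompose may distribute points differently. The simplified selector signature `(image, num_point) -> (N, 2)` and the helpers `random_select`, `diagonal_select`, and `compose_select` are hypothetical stand-ins, not part of the codebase.

```python
import torch


def random_select(image, num_point):
    """Uniformly random (u, v) coordinates inside the image."""
    H, W = image.shape
    u = torch.randint(0, W, (num_point,))
    v = torch.randint(0, H, (num_point,))
    return torch.stack([u, v], dim=-1)


def diagonal_select(image, num_point):
    """Toy deterministic selector: points along the main diagonal (u == v)."""
    H, W = image.shape
    idx = torch.arange(num_point) % min(H, W)
    return torch.stack([idx, idx], dim=-1)


def compose_select(image, num_point, selectors, weights):
    """Split num_point across selectors in proportion to weights, then concatenate."""
    total = float(sum(weights))
    parts = []
    for select, w in zip(selectors, weights):
        budget = int(num_point * w / total)   # rounding down may leave a few points unused
        parts.append(select(image, budget))
    return torch.cat(parts, dim=0)


# Usage: 3:1 split of a 40-point budget gives 30 random and 10 diagonal points.
image = torch.zeros(32, 32)
kp = compose_select(image, 40, [random_select, diagonal_select], [3, 1])
print(kp.shape)  # torch.Size([40, 2])
```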

Usage in MAC-VO

The IKeypointSelector interface is used in two main contexts:

  1. Tracking: Selecting points for frame-to-frame tracking

    kp0_uv = keypoint_selector.select_point(frame0.stereo, num_points, depth0, depth1, match01)
  2. Mapping: Selecting points for map building

    map_points = map_selector.select_point(frame.stereo, num_points, depth_est, depth_est, None)

Selection Process

  1. Base selectors use simple strategies (random, gradient, grid)
  2. Advanced selectors consider:
    • Image borders (mask_width)
    • Depth constraints (max_depth)
    • Uncertainty thresholds (max_depth_cov, max_match_cov)
    • Spatial distribution (NMS)
  3. Meta selectors combine multiple strategies
info

The numPoint argument is a hint that a selector may not follow strictly. The number of keypoints returned varies with the selection strategy and the input conditions.
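A grid-based selector shows why the requested count can only be approximated: a regular grid has rows x cols points, which rarely equals numPoint exactly. The following is a minimal sketch, assuming a hypothetical `grid_select` helper that picks a grid roughly matching the image aspect ratio; it is not the GridSelector from the codebase.

```python
import math
import torch


def grid_select(height, width, num_point, mask_width=0):
    """Return an (N, 2) long tensor of (u, v) grid points with N close to num_point."""
    usable_h = height - 2 * mask_width
    usable_w = width - 2 * mask_width

    # Choose a column count that keeps grid cells roughly square, then fit the rows.
    cols = max(1, int(math.sqrt(num_point * usable_w / usable_h)))
    rows = max(1, num_point // cols)

    # Evenly spaced pixel coordinates inside the unmasked region.
    u = torch.linspace(mask_width, width - 1 - mask_width, cols).round().long()
    v = torch.linspace(mask_width, height - 1 - mask_width, rows).round().long()

    vv, uu = torch.meshgrid(v, u, indexing="ij")
    return torch.stack([uu.flatten(), vv.flatten()], dim=-1)


# Usage: asking for 100 points on a 48x64 image yields an 11x9 grid, i.e. 99 points.
kp = grid_select(height=48, width=64, num_point=100)
print(kp.shape)  # torch.Size([99, 2])
```

Callers should therefore size downstream buffers from the returned tensor's shape rather than from the requested count.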

warning

Keypoints in this codebase are always stored in (u, v) format, that is, (column, row). PyTorch indexes image tensors in (row, column) order, so the two coordinates must be swapped when indexing. Use image[kp[..., 1], kp[..., 0]] to read the image values at all keypoint locations.