Testing SAM2 on motion-blurred footage

Nov 28, 2024 · Segmentation · SAM2

I've been testing SAM2's performance on production VFX plates with heavy motion blur: fast-moving vehicles and action sequences shot at shutter angles of 180 degrees or wider. The results reveal edge cases that matter for production deployment.

The Problem

Motion blur is everywhere in live-action VFX. When a subject moves quickly relative to the camera, pixels smear across multiple positions in a single frame. This makes segmentation challenging: the model needs to understand that the blurred pixels belong to the moving subject, not the background.

SAM2 handles moderate motion blur well (up to roughly a 90-degree shutter equivalent), but on plates with extreme blur I noticed the temporal consistency degrade. The specifics are in the results below.

Test Setup

I tested on three sequences:

  1. Car chase plate: Vehicle moving at 60 mph, shot at 24fps with 180-degree shutter
  2. Action sequence: Actor performing fast martial arts move, shot at 24fps with 270-degree shutter (intentionally exaggerated)
  3. Controlled synthetic test: CG render of a sphere moving at varying speeds, with adjustable motion blur amount (a minimal generator is sketched after this list)
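
For the synthetic test, motion blur can be faked by temporal supersampling: render the subject at several sub-frame times within the shutter-open interval and average the exposures. A minimal NumPy sketch of the idea (the image size, speeds, and function names here are illustrative, not my actual render setup):

```python
import numpy as np

def render_sphere_frame(center_x, center_y=64.0, radius=12.0, size=128):
    """Hard-edged disc silhouette as a float mask (2D stand-in for the sphere)."""
    ys, xs = np.mgrid[0:size, 0:size]
    return ((xs - center_x) ** 2 + (ys - center_y) ** 2 <= radius ** 2).astype(np.float32)

def render_blurred_frame(frame_idx, velocity=8.0, shutter_deg=270.0, samples=32):
    """Approximate optical motion blur by averaging sub-frame exposures.

    shutter_deg=360 means the shutter is open for the whole frame interval,
    180 for half of it, and so on; velocity is in pixels per frame.
    """
    open_frac = shutter_deg / 360.0
    acc = np.zeros((128, 128), np.float32)
    for s in range(samples):
        # evenly spaced sub-frame times across the shutter-open window
        t = frame_idx + (s / (samples - 1)) * open_frac
        acc += render_sphere_frame(center_x=10.0 + velocity * t)
    return acc / samples  # averaged exposure produces the blur trail
```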

For each test, I recorded SAM2's per-frame masks and measured IoU between adjacent frames as a temporal-consistency metric.

Results & Observations

Finding 1: SAM2's temporal consistency drops significantly above 180-degree shutter angles. On the 270-degree martial arts sequence, IoU between adjacent frames dropped from 0.92 (at 90 degrees) to 0.74 (at 270 degrees).
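
For reference, the consistency number quoted here is plain mask IoU between adjacent frames, averaged over the shot. A minimal version (the function name and array layout are my own):

```python
import numpy as np

def adjacent_frame_iou(masks):
    """Mean IoU between each pair of adjacent binary masks.

    masks: array of shape (num_frames, H, W), boolean or 0/1.
    Note: some drop is expected from subject motion alone, so compare
    shutter settings on the same plate rather than absolute values.
    """
    masks = np.asarray(masks, dtype=bool)
    ious = []
    for prev, curr in zip(masks[:-1], masks[1:]):
        inter = np.logical_and(prev, curr).sum()
        union = np.logical_or(prev, curr).sum()
        ious.append(inter / union if union else 1.0)
    return float(np.mean(ious))
```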

Finding 2: The model tends to exclude motion blur trails from the mask. This is actually desirable in many compositing scenarios (you can reintroduce blur post-segmentation with a MotionBlur node), but artists need to be aware of this behavior.
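
For anyone reproducing that reintroduction step outside of comp software: a rough stand-in for a MotionBlur node is convolving the mask with a directional line kernel. This OpenCV sketch uses a single global blur direction, which is only an approximation; a real vector-based MotionBlur node blurs along per-pixel motion vectors:

```python
import cv2
import numpy as np

def directional_blur(mask, length=15, angle_deg=0.0):
    """Smear a binary mask along one motion direction with a line kernel.

    length ~ blur trail in pixels; angle_deg ~ motion direction.
    Crude stand-in for vector-based motion blur in comp.
    """
    kernel = np.zeros((length, length), np.float32)
    kernel[length // 2, :] = 1.0  # horizontal line kernel
    center = ((length - 1) / 2.0, (length - 1) / 2.0)
    rot = cv2.getRotationMatrix2D(center, angle_deg, 1.0)
    kernel = cv2.warpAffine(kernel, rot, (length, length))
    kernel /= kernel.sum()  # normalize so mask values stay in [0, 1]
    return cv2.filter2D(mask.astype(np.float32), -1, kernel)
```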

Finding 3: Synthetic motion blur (CG-rendered) performs better than optical motion blur (from camera shutter). This suggests the training data may include more synthetic examples.

Workarounds

I'm testing two approaches to improve results on heavily blurred footage:

  1. Frame interpolation preprocessing: Use RIFE or similar optical flow models to generate intermediate frames, effectively reducing the perceived motion blur. Segment on the interpolated frames, then downsample the masks back to the original frame rate. Early tests show promise—IoU improves to 0.88 on the 270-degree test.
  2. Temporal smoothing post-processing: Apply a rolling median filter to mask pixels across 3-5 frames (sketched below). This reduces jitter but can lag behind fast direction changes. Works well for simple motion paths (vehicles, linear action) but struggles with complex choreography.
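
A minimal sketch of that rolling median, assuming the masks are stacked into a (frames, H, W) array; SciPy filters along the time axis only:

```python
import numpy as np
from scipy.ndimage import median_filter

def temporal_median(masks, window=5):
    """Rolling median across frames, per pixel.

    masks: (num_frames, H, W) binary array. A size-(window, 1, 1)
    footprint filters along time only, leaving spatial detail untouched.
    An odd window (3 or 5) keeps binary input binary.
    """
    return median_filter(masks.astype(np.uint8),
                         size=(window, 1, 1), mode='nearest')
```

Larger windows suppress more jitter but lag harder on direction changes, which is exactly the failure mode noted above on complex choreography.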

Next Steps

I'm planning to continue testing both workarounds on additional plates before settling on a production recommendation.

Overall, SAM2 is still incredibly useful for roto work, but artists should be aware that extreme motion blur remains an edge case. For those sequences, expect to add more interactive prompts and plan for manual cleanup.