Testing SAM2 on motion-blurred footage

Nov 28, 2024 · Segmentation · SAM2

I've been testing SAM2's performance on production VFX plates with heavy motion blur: fast-moving vehicles and action sequences shot at shutter angles of 180 degrees or wider. The results reveal edge cases that matter for production deployment.

The Problem

Motion blur is everywhere in live-action VFX. When a subject moves quickly relative to the camera, pixels smear across multiple positions in a single frame. This makes segmentation challenging: the model needs to understand that the blurred pixels belong to the moving subject, not the background.

SAM2 handles moderate motion blur well (up to roughly a 90-degree shutter equivalent), but on plates with extreme blur I noticed the temporal consistency degrade. The specifics are in the results below.

Test Setup

I tested on three sequences:

  1. Car chase plate: Vehicle moving at 60 mph, shot at 24fps with 180-degree shutter
  2. Action sequence: Actor performing fast martial arts move, shot at 24fps with 270-degree shutter (intentionally exaggerated)
  3. Controlled synthetic test: CG render of a sphere moving at varying speeds, with adjustable motion blur amount (a minimal generator is sketched after this list)
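
For the synthetic test, motion blur can be faked by temporal supersampling: render the subject at several sub-frame times within the shutter-open interval and average the exposures. A minimal NumPy sketch of the idea (the image size, speeds, and function names here are illustrative, not my actual render setup):

```python
import numpy as np

def render_sphere_frame(center_x, center_y=64.0, radius=12.0, size=128):
    """Hard-edged disc silhouette as a float mask (2D stand-in for the sphere)."""
    ys, xs = np.mgrid[0:size, 0:size]
    return ((xs - center_x) ** 2 + (ys - center_y) ** 2 <= radius ** 2).astype(np.float32)

def render_blurred_frame(frame_idx, velocity=8.0, shutter_deg=270.0, samples=32):
    """Approximate optical motion blur by averaging sub-frame exposures.

    shutter_deg=360 means the shutter is open for the whole frame interval,
    180 for half of it, and so on; velocity is in pixels per frame.
    """
    open_frac = shutter_deg / 360.0
    acc = np.zeros((128, 128), np.float32)
    for s in range(samples):
        # evenly spaced sub-frame times across the shutter-open window
        t = frame_idx + (s / (samples - 1)) * open_frac
        acc += render_sphere_frame(center_x=10.0 + velocity * t)
    return acc / samples  # averaged exposure produces the blur trail
```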

For each test, I recorded SAM2's per-frame masks and measured IoU between adjacent frames as a temporal-consistency metric.

Results & Observations

Finding 1: SAM2's temporal consistency drops significantly above 180-degree shutter angles. On the 270-degree martial arts sequence, IoU between adjacent frames dropped from 0.92 (at 90 degrees) to 0.74 (at 270 degrees).
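
For reference, the consistency number quoted here is plain mask IoU between adjacent frames, averaged over the shot. A minimal version (the function name and array layout are my own):

```python
import numpy as np

def adjacent_frame_iou(masks):
    """Mean IoU between each pair of adjacent binary masks.

    masks: array of shape (num_frames, H, W), boolean or 0/1.
    Note: some drop is expected from subject motion alone, so compare
    shutter settings on the same plate rather than absolute values.
    """
    masks = np.asarray(masks, dtype=bool)
    ious = []
    for prev, curr in zip(masks[:-1], masks[1:]):
        inter = np.logical_and(prev, curr).sum()
        union = np.logical_or(prev, curr).sum()
        ious.append(inter / union if union else 1.0)
    return float(np.mean(ious))
```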

Finding 2: The model tends to exclude motion blur trails from the mask. This is actually desirable in many compositing scenarios (you can reintroduce blur post-segmentation with a MotionBlur node), but artists need to be aware of this behavior.
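
For anyone reproducing that reintroduction step outside of comp software: a rough stand-in for a MotionBlur node is convolving the mask with a directional line kernel. This OpenCV sketch uses a single global blur direction, which is only an approximation; a real vector-based MotionBlur node blurs along per-pixel motion vectors:

```python
import cv2
import numpy as np

def directional_blur(mask, length=15, angle_deg=0.0):
    """Smear a binary mask along one motion direction with a line kernel.

    length ~ blur trail in pixels; angle_deg ~ motion direction.
    Crude stand-in for vector-based motion blur in comp.
    """
    kernel = np.zeros((length, length), np.float32)
    kernel[length // 2, :] = 1.0  # horizontal line kernel
    center = ((length - 1) / 2.0, (length - 1) / 2.0)
    rot = cv2.getRotationMatrix2D(center, angle_deg, 1.0)
    kernel = cv2.warpAffine(kernel, rot, (length, length))
    kernel /= kernel.sum()  # normalize so mask values stay in [0, 1]
    return cv2.filter2D(mask.astype(np.float32), -1, kernel)
```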

Finding 3: Synthetic motion blur (CG-rendered) performs better than optical motion blur (from camera shutter). This suggests the training data may include more synthetic examples.

Workarounds

I'm testing two approaches to improve results on heavily blurred footage:

  1. Frame interpolation preprocessing: Use RIFE or similar optical flow models to generate intermediate frames, effectively reducing the perceived motion blur. Segment on the interpolated frames, then downsample the masks back to the original frame rate. Early tests show promise—IoU improves to 0.88 on the 270-degree test.
  2. Temporal smoothing post-processing: Apply a rolling median filter to mask pixels across 3-5 frames (sketched below). This reduces jitter but can lag behind fast direction changes. Works well for simple motion paths (vehicles, linear action) but struggles with complex choreography.
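
A minimal sketch of that rolling median, assuming the masks are stacked into a (frames, H, W) array; SciPy filters along the time axis only:

```python
import numpy as np
from scipy.ndimage import median_filter

def temporal_median(masks, window=5):
    """Rolling median across frames, per pixel.

    masks: (num_frames, H, W) binary array. A size-(window, 1, 1)
    footprint filters along time only, leaving spatial detail untouched.
    An odd window (3 or 5) keeps binary input binary.
    """
    return median_filter(masks.astype(np.uint8),
                         size=(window, 1, 1), mode='nearest')
```

Larger windows suppress more jitter but lag harder on direction changes, which is exactly the failure mode noted above on complex choreography.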

Next Steps

I'm planning to continue testing both workarounds on additional plates before settling on a production recommendation.

Overall, SAM2 is still incredibly useful for roto work, but artists should be aware that extreme motion blur remains an edge case. For those sequences, expect to add more interactive prompts and plan for manual cleanup.