Netflix just dropped their first public model on Hugging Face: VOID
- #AI-video-editing
- #object-removal
- #video-inpainting
- VOID removes objects from videos along with all interactions they induce on the scene, including physical interactions like objects falling when a person is removed.
- It is built on CogVideoX-Fun-V1.5-5b-InP, fine-tuned for video inpainting with interaction-aware quadmask conditioning. Inference requires checkpoints such as void_pass1.safetensors, which handles base inpainting.
- It can be run from the provided notebook or the CLI. Inputs are a video, a quadmask (generated by a pipeline using SAM2 + Gemini), and a text prompt describing the scene after removal.
- Training used paired counterfactual videos from HUMOTO (human-object interactions in Blender) and Kubric (object-only interactions) and ran on 8x A100 80GB GPUs.
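
To make the quadmask input more concrete, here is a minimal sketch of how a four-level mask frame could be assembled from two binary masks: one for the object to remove and one for the region its interactions affect. The gray levels, the overlap class, and the function name `build_quadmask` are assumptions for illustration only; the summary above does not specify the encoding, and the actual pipeline derives these regions with SAM2 + Gemini rather than from hand-written masks.

```python
# Hypothetical gray levels for the four quadmask classes. These are
# illustrative assumptions, not values confirmed by the summary above.
REMOVE = 0        # primary object to remove
OVERLAP = 63      # pixels in both the object and the affected region
AFFECTED = 127    # region changed by the object's interactions
KEEP = 255        # background to leave untouched


def build_quadmask(object_mask, affected_mask):
    """Combine two same-sized binary masks (rows of 0/1) into one four-level frame."""
    if len(object_mask) != len(affected_mask):
        raise ValueError("masks must have the same height")
    quadmask = []
    for obj_row, aff_row in zip(object_mask, affected_mask):
        if len(obj_row) != len(aff_row):
            raise ValueError("masks must have the same width")
        row = []
        for obj, aff in zip(obj_row, aff_row):
            if obj and aff:
                row.append(OVERLAP)
            elif obj:
                row.append(REMOVE)
            elif aff:
                row.append(AFFECTED)
            else:
                row.append(KEEP)
        quadmask.append(row)
    return quadmask


# Toy 2x4 frame: a person (object) on the left, a held item (affected)
# that partly overlaps the person.
object_mask = [[1, 1, 0, 0],
               [1, 1, 0, 0]]
affected_mask = [[0, 1, 1, 0],
                 [0, 0, 1, 0]]

frame = build_quadmask(object_mask, affected_mask)
for row in frame:
    print(row)
```

In a real run there would be one such frame per video frame, stored at the video's resolution. Check the model card for the exact encoding before preparing your own masks.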