This is a V1, inference-only Space. It runs SCAIL-2 from four prepared inputs: a reference image, a reference mask, a driving/rendered video, and a driving mask video. SCAIL-Pose/SAM3 preprocessing is intentionally not included in this version.
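Because preprocessing is not part of this version, it can help to check that all four prepared inputs exist before starting a run. The following is a minimal sketch, not the Space's actual code: the `PreparedInputs` class and its field names are hypothetical and only mirror the four inputs listed above.

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class PreparedInputs:
    """The four prepared inputs (hypothetical field names, for illustration)."""

    reference_image: Path
    reference_mask: Path
    driving_video: Path
    driving_mask_video: Path

    def missing(self) -> list[str]:
        """Return the names of inputs whose files do not exist, in field order."""
        return [
            name
            for name, path in vars(self).items()
            if not Path(path).is_file()
        ]
```

Calling `missing()` before generation gives a list of field names to report back to the user; an empty list means every input file was found.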
On ZeroGPU, each generation should be treated as a cold start: the GPU can be released between calls, so the 14B stack may need to be loaded again before sampling.
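The cold-start assumption can be sketched as a small wrapper that reloads the model whenever its cached handle is gone and reports how long the load took. This is an illustrative sketch only: `ColdStartRunner`, `load_model`, and `sample` are hypothetical names, and `release()` merely stands in for the platform reclaiming the GPU between calls. It does not use any ZeroGPU API.

```python
import time
from typing import Any, Callable, Optional


class ColdStartRunner:
    """Reloads the model on demand and records whether a call was a cold start."""

    def __init__(self, load_model: Callable[[], Any]) -> None:
        self._load_model = load_model  # hypothetical loader for the model stack
        self._model: Optional[Any] = None
        self.load_count = 0

    def release(self) -> None:
        # Stand-in for the GPU being released between calls.
        self._model = None

    def generate(self, sample: Callable[[Any], Any]) -> dict:
        cold = self._model is None
        start = time.perf_counter()
        if cold:
            # Cold start: the model must be loaded before sampling can begin.
            self._model = self._load_model()
            self.load_count += 1
        load_seconds = time.perf_counter() - start
        result = sample(self._model)
        return {"result": result, "cold_start": cold, "load_seconds": load_seconds}
```

Reporting `cold_start` and `load_seconds` separately from sampling time makes it easier to tell whether a slow generation was caused by reloading the model or by sampling itself.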