demobook

ComfyUI: Generate temporal face swap with WAN Video

Demo summary

The user combines the reference identity, pose data, and masks into the WAN Video sampler to generate a temporally consistent video sequence.

Step-by-step

  1. Combine the reference identity, pose information, original background, and mask into the WAN Video sampler
  2. Run the sampler to generate the new sequence
  3. Decode the latent result back into image frames
  4. Reassemble the frames into a final video
  5. Re-import the original audio so it stays in sync with the new footage
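
Steps 4 and 5 can also be reproduced outside the node graph. The following is a minimal sketch, not the workflow shown in the demo: it assumes the decoded frames were saved as numbered PNG files and that `ffmpeg` is installed. The function name `build_mux_command`, the `frame_%05d.png` naming pattern, and the file names in the usage lines are all hypothetical. It only builds the command; running it is left to the caller.

```python
from pathlib import Path


def build_mux_command(frames_dir, source_video, output_path, fps=24.0):
    """Build an ffmpeg argument list that encodes numbered PNG frames
    and copies the audio track from the original footage.

    Assumption: frames are named frame_00001.png, frame_00002.png, ...
    (hypothetical naming; adjust the pattern to match your export).
    """
    if fps <= 0:
        raise ValueError("fps must be positive")

    frame_pattern = str(Path(frames_dir) / "frame_%05d.png")
    return [
        "ffmpeg",
        "-y",                        # overwrite the output file if it exists
        "-framerate", f"{fps:g}",    # must match the source frame rate to keep sync
        "-i", frame_pattern,         # input 0: generated image sequence
        "-i", str(source_video),     # input 1: original footage (audio source)
        "-map", "0:v:0",             # take video from the generated frames
        "-map", "1:a:0?",            # take audio from the source, if it has any
        "-c:v", "libx264",
        "-pix_fmt", "yuv420p",       # widely compatible pixel format
        "-c:a", "copy",              # copy the audio without re-encoding
        "-shortest",                 # stop at the shorter of the two streams
        str(output_path),
    ]


if __name__ == "__main__":
    import shlex

    # Hypothetical paths, for illustration only.
    cmd = build_mux_command("out_frames", "source.mp4", "face_swap.mp4", fps=24)
    print(shlex.join(cmd))
    # To actually run it: subprocess.run(cmd, check=True)
```

The audio stays aligned only if the frame rate passed here matches the source footage and no frames were dropped during sampling. Copying the audio stream avoids a second lossy encode, but it assumes the source codec is one the output container accepts.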

Options

  • Re-import original audio to preserve source sync

All demos from “Learn How To: Face Swap”

  1. 0:29 · 0:24 · Load source footage and models in ComfyUI. The user demonstrates importing source video footage and loading the necessary model nodes, including WAN Video, VAE, and CLIP Vision, within the ComfyUI interface. (ComfyUI · AI Animation Generator)
  2. 0:53 · 0:48 · Generate and refine head masks with Florence 2 and SAM 2. The workflow shows using Florence 2 for object detection to target the head and SAM 2 for precise segmentation, including adjusting the 'grow mask expand' value to improve blending. (ComfyUI · AI Inpainting)
  3. 1:41 · 0:46 · Prepare driving data and auto-prompts with Qwen2-VL. The demo shows running pose detection on source footage and using Qwen2-VL (referred to as Gwen VL) to generate a semantic text description from a reference image for the face swap. (ComfyUI · AI Face Swap Generator)
  4. 2:27 · 0:35 · Generate temporal face swap with WAN Video (current). The user combines the reference identity, pose data, and masks into the WAN Video sampler to generate a temporally consistent video sequence. (ComfyUI · AI Face Swap Video)
  5. 3:28 · 0:27 · Simplified face swap using ComfyUI App Mode. A demonstration of the simplified 'App Mode' interface where users can upload footage and a reference image to perform a face swap without interacting with the node graph. (ComfyUI · AI Face Swap Generator)

AI Face Swap Video

  1. 16:07 · 1:11 · Replace character face in generated video. A demonstration of a character replacement pass that swaps a face in a Sora-generated video with a specific reference photo to maintain identity consistency. (Yaroflasher)
  2. 2:27 · 0:35 · Generate temporal face swap with WAN Video (current). The user combines the reference identity, pose data, and masks into the WAN Video sampler to generate a temporally consistent video sequence. (ComfyUI)