Generate 480p talking video in ComfyUI

ComfyUI · SECourses · Jul 2025

Demo summary

The creator demonstrates loading a static image and audio file into a ComfyUI workflow using the MultiTalk node to generate a 10-second talking animation.

Step-by-step

Drag and drop the 480p 10-second workflow into ComfyUI
Load your input image and audio file into the respective nodes
Set the width and height parameters to match your input image resolution
Update the positive and negative prompts to describe your specific image and desired performance
Calculate the required number of frames by multiplying the audio duration in seconds by 25 (for example, 10 seconds × 25 = 250 frames)
Enter the calculated frame count into the workflow
Click Queue Prompt to start the generation
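
The frame-count step above can be sketched in Python. This is a minimal illustration, not part of the workflow itself: the function name `frame_count` is hypothetical, and rounding to the nearest whole frame is an assumption, since the demo only states the multiplication by 25.

```python
FPS = 25  # multiplier given in the step list (frames per second of audio)


def frame_count(duration_seconds: float) -> int:
    """Return the number of frames to enter for an audio clip of this length."""
    if duration_seconds <= 0:
        raise ValueError("duration must be positive")
    # Assumption: round to the nearest whole frame; the demo does not say
    # how fractional results should be handled.
    return round(duration_seconds * FPS)


print(frame_count(10))  # 250
```

A 10-second clip gives 250 frames, which matches the upper limit this workflow is described as reaching.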

Options

Enable VAE tiling or Tiled VAE if you have low VRAM
Increase 'Number of blocks to swap' up to 40 if you are low on GPU memory (VRAM)
Use MediaInfo to get the exact millisecond duration of your audio file
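
If you prefer to read the duration from a script rather than the MediaInfo window, the sketch below shows one possible approach. It assumes the MediaInfo command-line tool (`mediainfo`) is installed and on your PATH, and that its `--Inform=General;%Duration%` template prints the duration in milliseconds; the helper names are hypothetical, and the demo itself does not show this method.

```python
import shutil
import subprocess


def parse_duration_ms(raw: str) -> float:
    """Convert MediaInfo's textual duration output (milliseconds) to a float."""
    text = raw.strip()
    if not text:
        raise ValueError("MediaInfo returned no duration")
    return float(text)


def audio_duration_ms(path: str) -> float:
    """Ask the MediaInfo CLI for the duration of a media file in milliseconds."""
    # Assumption: the `mediainfo` CLI is installed; fail clearly if it is not.
    if shutil.which("mediainfo") is None:
        raise RuntimeError("mediainfo CLI not found on PATH")
    result = subprocess.run(
        ["mediainfo", "--Inform=General;%Duration%", path],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_duration_ms(result.stdout)
```

Dividing the result by 1000 gives the duration in seconds, which can then be multiplied by 25 as described in the steps.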

Watch out for

This specific workflow degrades in quality for videos longer than 10 seconds
The sampler is locked to four steps and cannot be changed
The model is natively made for 81 frames and uses an embedding system to reach 250 frames
High RAM is required; if physical RAM is insufficient, a page file (virtual memory) of at least 100 GB is recommended

Tips

Start with the provided test.jpeg and test.mp3 to verify the workflow is working
Use 'nvitop' in the command line to monitor GPU power draw in watts; if the power draw is low during generation, you may be using shared VRAM instead of dedicated VRAM
If you get out-of-memory errors, check the CMD terminal window to confirm whether VRAM is the cause
For videos longer than 10 seconds, use a long context generation workflow instead
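
The 10-second limit mentioned above lends itself to a simple pre-flight check. The sketch below is illustrative only: the function name and the returned labels are hypothetical and do not correspond to actual workflow file names.

```python
MAX_SHORT_SECONDS = 10  # this workflow is described as degrading beyond 10 seconds


def pick_workflow(duration_seconds: float) -> str:
    """Suggest which workflow to use for an audio clip of the given length."""
    if duration_seconds <= 0:
        raise ValueError("duration must be positive")
    if duration_seconds <= MAX_SHORT_SECONDS:
        return "480p 10-second"  # hypothetical label for the workflow shown here
    return "long context"  # hypothetical label for the long context workflow


print(pick_workflow(8))     # 480p 10-second
print(pick_workflow(12.5))  # long context
```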

Highlights

“once you set frame count, your prompt, your input image and its resolution, actually you are ready. You don't need to change anything else.”

All demos from “MultiTalk Full Tutorial With 1-Click Installer - Make Talking and Singing Videos From Static Images”

7:40 · 6:12 · Generate 480p talking video in ComfyUI (current): The creator demonstrates loading a static image and audio file into a ComfyUI workflow using the MultiTalk node to generate a 10-second talking animation. ComfyUI · AI Animation Generator
13:52 · 4:06 · High-quality 720p long context generation: A demonstration of the 720p long context workflow in ComfyUI, showing how to adjust resolution, prompt, and block swap parameters for higher fidelity output. ComfyUI · AI Animation Generator
18:56 · 0:49 · Side-by-side video quality comparison: The creator uses an 'Ultimate Video Upscaler' tool to perform a side-by-side comparison between the 480p and 720p generated outputs. ComfyUI · AI Video Upscaler
47:44 · 5:08 · Running MultiTalk on RunPod: The creator walks through executing the MultiTalk high-quality workflow on a RunPod instance, monitoring VRAM usage with nvitop while generating the video. ComfyUI · AI Animation Generator

AI Animation Generator

2:07 · 0:23 · Load Wan 2.2 Animate workflow and install nodes: The user demonstrates how to drag and drop the Wan 2.2 Animate workflow into ComfyUI and use the Manager to install missing custom nodes. MDMZ
2:55 · 1:05 · Configure video input and output settings: The demo shows how to upload a source video to ComfyUI, set the frame count, and adjust the output dimensions to match the original aspect ratio. MDMZ
16:36 · 1:56 · Setting up HunyuanVideo 1.5 in ComfyUI: The video demonstrates how to update ComfyUI and import the HunyuanVideo 1.5 JSON workflow files to create a node-based generation environment. AI Search
20:15 · 1:53 · Text-to-Video generation in ComfyUI: A step-by-step demo of configuring the Hunyuan nodes in ComfyUI, entering a prompt for a 'giant cat', and rendering the final 720p video. AI Search
29:15 · 1:33 · Running HunyuanVideo with GGUF (Low VRAM): The video shows how to use the GGUF loader node to run a compressed version of HunyuanVideo 1.5, enabling video generation on GPUs with as little as 6 GB of VRAM. AI Search
0:59 · 4:52 · Configure LTX 2.3 in ComfyUI: The creator walks through the ComfyUI node setup for LTX 2.3, explaining the GGUF model loader, VAE settings, and how to adjust resolution and frame counts for optimal rendering. AIKnowledge2Go
1:23 · 0:41 · Setting up LTX-2.3 in ComfyUI: The creator demonstrates how to browse templates in ComfyUI, search for LTX 2.3, and download the required missing models for the text-to-video workflow. MDMZ
13:07 · 1:21 · Applying LoRA to Wan 2.2 video generation: The video shows how to integrate a LoRA (Low-Rank Adaptation) into a Wan 2.2 workflow to achieve specific cinematic movements like a face zoom. pixaroma
31:13 · 3:44 · Text-to-video with LTX-2: The video walks through setting up the LTX-2 model in ComfyUI to generate high-resolution video clips from text prompts and images. pixaroma
39:07 · 5:27 · Cloud-based ComfyUI on RunPod/RunHub: The video shows how to run complex video workflows in the cloud using RunHub AI, demonstrating the interface and execution of InfiniteTalk and Wan 2.2 without local hardware. pixaroma
2:35 · 1:46 · Configure Infinite Talk models in ComfyUI: The creator demonstrates how to organize the necessary models within the ComfyUI workflow, including the Lightning LoRA, quantized Infinite Talk UNET models, and the Wan 2.1 VAE and Clip Vision nodes. Aiconomist
11:06 · 0:26 · Combining audio, images, and prompts: A demonstration of layering a specific action prompt (patting stomach) over a specific audio timestamp to create a fully directed AI scene. What Dreams Cost