Text-to-Video generation with Hunyuan Video 1.5 in ComfyUI

Demo summary
The creator demonstrates a ComfyUI text-to-video workflow that runs Hunyuan Video 1.5 in FP16, uses Qwen2.5-VL to generate the prompt, and renders a 720p video of a horse and rider.
Step-by-step
- Load the Hunyuan Video 1.5 main model, selecting FP16 for quality or FP8 if you have low VRAM.
- Enable Sage Attention to speed up generation; the creator also reports better-looking output with it enabled.
- Load the dual text encoders: Qwen2.5-VL and ByT5.
- Set the model type to 'hunyuan VIDEO 15' in the encoder node.
- Configure the latent node resolution to 1280x720 with a frame length of 33.
- Set the CFG to 6.0 and the Shift value to 9 for 720p generation.
- Configure the KSampler with 20 steps, the Euler sampler, and the Simple scheduler.
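The values from the steps above can be collected in one place. The following is a minimal Python sketch, not ComfyUI's API: the class `T2VSettings`, its field names, and the `problems` check are hypothetical and only record what the walkthrough states.

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class T2VSettings:
    """Settings quoted in the walkthrough. Names are illustrative, not ComfyUI's."""
    precision: str = "fp16"                 # "fp8" is the low-VRAM alternative
    encoder_type: str = "hunyuan VIDEO 15"  # label as written in the walkthrough
    width: int = 1280
    height: int = 720
    frames: int = 33
    cfg: float = 6.0
    shift: float = 9.0                      # value given for 720p
    steps: int = 20
    sampler: str = "euler"
    scheduler: str = "simple"

    def problems(self) -> list:
        """Return readable issues; an empty list means nothing obvious is off."""
        issues = []
        if self.precision not in ("fp16", "fp8"):
            issues.append(f"unexpected precision: {self.precision}")
        if min(self.width, self.height, self.frames, self.steps) <= 0:
            issues.append("width, height, frames and steps must be positive")
        return issues


if __name__ == "__main__":
    settings = T2VSettings()
    print(asdict(settings))
    print(settings.problems())
```

Keeping the settings in a frozen dataclass makes it easy to compare two runs side by side; the actual values still have to be entered in the corresponding ComfyUI nodes by hand.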
Options
- Use FP8 instead of FP16 to avoid out-of-memory errors.
- Use 50 sampling steps for higher quality if time is not a concern.
- Write prompts manually or use the Qwen VL model to generate them from an image.
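The three options above amount to a few yes/no decisions. As a rough sketch, assuming a hypothetical helper named `choose_options` (nothing here is a ComfyUI function, and the returned keys are made up for illustration):

```python
def choose_options(low_vram: bool, time_is_a_concern: bool, prompt_from_image: bool) -> dict:
    """Map the walkthrough's options onto concrete choices (illustrative only)."""
    return {
        # FP8 avoids out-of-memory errors; FP16 is the quality choice.
        "precision": "fp8" if low_vram else "fp16",
        # 50 steps for higher quality when time is not a concern, otherwise 20.
        "steps": 20 if time_is_a_concern else 50,
        # Prompts are either written by hand or generated from an image with Qwen VL.
        "prompt_source": "qwen-vl" if prompt_from_image else "manual",
    }


if __name__ == "__main__":
    print(choose_options(low_vram=True, time_is_a_concern=True, prompt_from_image=False))
```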
Watch out for
- Select 'hunyuan VIDEO 15' as the text encoder type; otherwise the text encoders will not work correctly.
- Set the Shift value to match the output resolution, following the per-resolution list the creator refers to (for example, 9 for 720p).
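One way to avoid a mismatched Shift value is a small lookup keyed by resolution. The sketch below is an assumption-laden illustration: the table name `SHIFT_BY_RESOLUTION` and the function `shift_for` are hypothetical, and only the 720p entry comes from this demo. Other resolutions are deliberately left out and should be filled in from the list the creator refers to.

```python
# Only the 720p value is stated in the walkthrough; add others from the
# resolution list rather than guessing them.
SHIFT_BY_RESOLUTION = {"720p": 9.0}


def shift_for(resolution: str) -> float:
    """Return the Shift value for a resolution, or fail loudly if it is unknown."""
    try:
        return SHIFT_BY_RESOLUTION[resolution]
    except KeyError:
        raise ValueError(
            f"no Shift value recorded for {resolution!r}; check the resolution list"
        ) from None


if __name__ == "__main__":
    print(shift_for("720p"))
```

Failing on an unknown resolution is intentional here: silently reusing the 720p value at another resolution is exactly the mistake this warning is about.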
Tips
- Enable Sage Attention for faster generation; the creator also reports a higher-quality result with it enabled.
- Use 20 sampling steps instead of 50 to save significant time while maintaining good results.
- Refer to the official GitHub guide for the specific prompt engineering format.
- Use a prompt engineering workflow with Qwen VL for better descriptive prompts.
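To put the 20-versus-50 step tip in numbers, here is a small arithmetic sketch. It assumes every sampling step takes about the same time and ignores model loading, text encoding, and decoding, so it describes the sampler's share of the work only. The function name `relative_sampling_cost` is hypothetical.

```python
def relative_sampling_cost(steps: int, baseline_steps: int = 50) -> float:
    """Fraction of the baseline's sampling work, assuming equal time per step."""
    if steps <= 0 or baseline_steps <= 0:
        raise ValueError("step counts must be positive")
    return steps / baseline_steps


if __name__ == "__main__":
    ratio = relative_sampling_cost(20)
    print(f"20 steps is about {ratio:.0%} of the sampling work of 50 steps")
```

Under that assumption, 20 steps does roughly two fifths of the sampling work of 50 steps, which matches the tip's claim of a significant time saving.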
Highlights
“we can then achieve a fantastic effect”
All demos from “极简工作流!腾讯混元 1.5 (Hunyuan Video) 首发测评:原生 1080P + 显存优化全攻略” (roughly: “Minimalist workflow! Tencent Hunyuan 1.5 (Hunyuan Video) launch review: native 1080P plus a complete VRAM optimization guide”)
- 5:28 (2:26) Text-to-Video generation with Hunyuan Video 1.5 in ComfyUI (current demo)
- 7:54 (2:07) 1080P Latent Upscaling with Hunyuan SR Model: The video shows how to use the Hunyuan Video Latent Upscale node and the SR (Super Resolution) model to enhance a 720p video to 1080p using a dual-sampler noise injection technique.
- 10:01 (2:01) Image-to-Video generation using Hunyuan 1.5 Distilled: The user demonstrates an I2V workflow in ComfyUI using the Hunyuan 1.5 distilled model, Google's Clip Vision for image encoding, and the 'image to video' node to animate a portrait.