Generate multi-person talking videos on Hugging Face

SkillCurb · Aug 2025

Demo summary

The user demonstrates how to use the Multi-Talk interface on Hugging Face by entering a detailed text prompt, uploading a conditioning image, and adding two separate audio files in WAV format to generate a video of two people singing.

Step-by-step

Enter a detailed text prompt describing the scene and the characters
Upload a conditioning image
Upload two separate audio files in WAV format
Open Advanced Settings to view or adjust sample steps
Click to generate the video

Options

Use ChatGPT to write the detailed prompt for you
Increase the sample steps in Advanced Settings

Watch out for

Audio files must be in WAV format
The output video quality may not be very high
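
Because the interface expects WAV audio, it can help to check each file locally before uploading. The following is a minimal sketch using Python's standard `wave` module. The helper name `describe_wav` and the example file names are hypothetical and not part of the Multi-Talk interface. The `wave` module reads PCM WAV files, so a compressed or mislabeled file will be rejected.

```python
import wave


def describe_wav(path):
    """Return basic properties of a WAV file, or raise ValueError if unreadable.

    Hypothetical pre-upload check; not part of the Multi-Talk interface.
    """
    try:
        with wave.open(str(path), "rb") as wav:
            frames = wav.getnframes()
            rate = wav.getframerate()
            return {
                "channels": wav.getnchannels(),
                "sample_rate_hz": rate,
                # Duration in seconds = number of frames / frames per second.
                "duration_s": frames / rate if rate else 0.0,
            }
    except (wave.Error, EOFError) as exc:
        # wave.Error: missing RIFF/WAVE header or unsupported encoding.
        # EOFError: file is empty or truncated.
        raise ValueError(f"{path} is not a readable PCM WAV file: {exc}") from exc
```

A possible use is to call `describe_wav("speaker1.wav")` and `describe_wav("speaker2.wav")` (assumed file names) and confirm that both succeed before uploading the two audio tracks.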

Tips

Stick with the default 12 sample steps
Use WAV format for better voice quality
Provide a very detailed prompt describing character appearance and positioning
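
One way to keep the prompt detailed and consistent is to assemble it from structured fields for the scene and each character. This is only an illustrative sketch: the function `build_prompt`, its field names, and the example descriptions are assumptions, and the demo itself does not prescribe any prompt format.

```python
def build_prompt(scene, characters):
    """Join a scene description and per-character details into one prompt string.

    Hypothetical helper; each character is a dict with the assumed keys
    'position', 'appearance', and 'action'.
    """
    # Normalize the scene sentence so it ends with exactly one period.
    parts = [scene.strip().rstrip(".") + "."]
    for c in characters:
        parts.append(f"On the {c['position']}, {c['appearance']}, {c['action']}.")
    return " ".join(parts)


prompt = build_prompt(
    "Two people sing together in a recording studio",
    [
        {
            "position": "left",
            "appearance": "a woman with short dark hair wearing a red jacket",
            "action": "singing into a microphone",
        },
        {
            "position": "right",
            "appearance": "a man with a beard wearing a grey sweater",
            "action": "singing while facing the camera",
        },
    ],
)
print(prompt)
```

Stating each character's position explicitly follows the tip above about describing appearance and positioning. The resulting text can be pasted into the prompt field as-is or edited further.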

Highlights

“Overall, I'm really impressed by this Multi-Talk model.”

AI Avatar Video Generator
