Multitalk: Animate multiple speakers in a podcast scene

Demo summary
The video shows how to configure Multi-Talk for two speakers by uploading an image of two people and two sequential audio clips, assigning voices based on their position in the frame.
Step-by-step
- Upload an image containing two people.
- Select the number of speakers as two.
- Choose the option to play audio clips in a row (sequentially).
- Upload the first audio clip for the person on the left and the second audio clip for the person on the right.
- Enter a prompt describing the scene and the speakers.
- Adjust the number of frames to match the total duration of the combined audio clips.
- Enable TCH and set its speed to 2x to accelerate generation.
- Drag the TCH start slider to the 10% mark so that TCH is only applied after the first 10% of the generation process.
- Click Generate.
Options
- Automatic detection of speakers from a single audio clip
- Parallel audio playback for simultaneous speaking
- TCH (Turbo) speed settings to reduce generation time
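To make the difference between the 'in a row' and parallel options concrete, here is a minimal Python sketch of where each speaker's clip would sit on the timeline in either mode. It is not Wan2GP code: the function names `speaker_timeline` and `total_duration` and the mode labels are hypothetical, and the sketch only models the behaviour described in the demo (clips listed left to right, matching the people in the image).

```python
def speaker_timeline(durations_s, mode="sequential"):
    """Return a (start, end) span in seconds for each speaker's clip.

    Hypothetical illustration only. `durations_s` lists the clip lengths
    in left-to-right speaker order.
    """
    if mode not in ("sequential", "parallel"):
        raise ValueError(f"unknown mode: {mode!r}")
    spans = []
    cursor = 0.0
    for duration in durations_s:
        if mode == "sequential":
            # 'In a row': each clip starts when the previous one ends.
            spans.append((cursor, cursor + duration))
            cursor += duration
        else:
            # Parallel: every clip starts at time zero.
            spans.append((0.0, duration))
    return spans


def total_duration(spans):
    """Length of the whole scene: the latest end time of any clip."""
    return max((end for _, end in spans), default=0.0)
```

With clips of 4 and 6 seconds, the sequential mode yields a 10-second scene in which the right-hand speaker starts at 4 seconds, while the parallel mode yields a 6-second scene in which both start together.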
Watch out for
- The AI assumes the person on the left of the image speaks the first audio clip and the person on the right speaks the second.
- The number of frames must be manually set to match the total duration of the audio clips.
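Since the frame count has to be worked out by hand, a small calculation helps. The sketch below assumes you already know each clip's length in seconds and the frame rate of the generated video. The default of 25 fps is an assumption, not a value stated in the demo, so check what your model actually outputs. The function name `frames_for_audio` is hypothetical.

```python
import math


def frames_for_audio(durations_s, fps=25, sequential=True):
    """Estimate the frame count needed to cover the audio.

    Hypothetical helper. `fps` is an assumed output frame rate.
    For clips played in a row the durations add up; for parallel
    playback the longest clip determines the length.
    """
    if not durations_s:
        raise ValueError("at least one clip duration is required")
    total_s = sum(durations_s) if sequential else max(durations_s)
    # Round away floating-point noise before rounding up to a whole frame.
    return math.ceil(round(total_s * fps, 6))
```

For example, clips of 4 and 6 seconds played in a row at the assumed 25 fps need 250 frames. Some models only accept certain frame counts, so you may need to adjust to the nearest value the interface allows.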
Tips
- Avoid the automatic speaker detection option, because the AI often struggles to work out which speaker says which part.
- Use the 'in a row' option for more reliable sequential dialogue.
- Only start applying TCH (Turbo) from the 10% mark of the generation process, matching the slider setting in the steps above.
Highlights
“you can see it animates this very well. The lip sync is very good.”
All demos from “Make AI videos with talking + pose + reference control. MultiTalk & VACE tutorial”
- 5:27 · 1:27 · Overview of the Wan2GP interface for Multi-Talk: The creator walks through the Wan2GP Gradio interface, explaining how to select the Multi-Talk model and the specific 'VACE Multi-Talk Fusion X' version for better performance on low VRAM. (Multitalk · AI Animation Generator)
- 8:22 · 4:37 · Generate talking head video from image and audio: The user demonstrates uploading a reference image and an audio clip to Multi-Talk, configuring background removal and text prompts to generate a video of a woman speaking in a park. (Multitalk · AI Avatar Video Generator)
- 13:57 · 1:32 · Simulate angry expressions with Multi-Talk: The demo shows how to use an angry reference image and matching audio to generate a highly expressive video that captures the pitch and intensity of the speaker's anger. (Multitalk · AI Lip Sync Generator)
- 15:29 · 1:10 · Animate sad emotions and crying: The creator demonstrates Multi-Talk's ability to handle complex emotions by animating a sad character who pauses and breathes in sync with a crying audio track. (Multitalk · AI Lip Sync Generator)
- 17:44 · 1:28 · Lip-syncing anime characters: A demonstration of applying Japanese audio to an anime still image, showing how the tool handles non-human characters and different languages. (Multitalk · AI Lip Sync Generator)
- 19:39 · 3:03 · Animate multiple speakers in a podcast scene (current): The video shows how to configure Multi-Talk for two speakers by uploading an image of two people and two sequential audio clips, assigning voices based on their position in the frame. (Multitalk · AI Avatar Video Generator)
- 22:17 · 3:21 · Parallel multi-speaker animation: The user demonstrates a more advanced multi-speaker setup where two audio tracks are played in parallel to animate a conversation between two people in a single reference image. (Multitalk · AI Avatar Video Generator)
- 26:32 · 2:34 · Transfer human motion with VACE and Multi-Talk: The demo shows how to use a control video of a person dancing to drive the body movements of a reference image while simultaneously applying a Spanish lip-sync track. (Multitalk · Video to Video)