XCAT 3.0: Multi-person conversational video generation

Demo summary
MultiTalk is shown animating a group image where two separate people interact and respond to each other using different audio tracks.
Step-by-step
- Select a group image containing multiple people
- Upload two separate voice clips for the different speakers
- Generate the video so that lip movements and expressions are synchronized with the audio for both individuals
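
The demo only shows these steps being done through the interface, so the sketch below is not MultiTalk's API. It is a minimal, hypothetical way to bundle the same inputs (one group image, two separate voice clips) and check them before a generation run. Every name in it (`SpeakerTrack`, `ConversationJob`, `build_job`, the `left`/`right` labels, and the file names) is an assumption made for illustration. It does not check that the files exist and does not call any generation service.

```python
from __future__ import annotations

from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class SpeakerTrack:
    """One voice clip paired with the person it should animate (hypothetical)."""
    speaker_label: str
    audio_path: Path


@dataclass(frozen=True)
class ConversationJob:
    """Hypothetical bundle of inputs for a two-person generation run."""
    group_image: Path
    tracks: tuple[SpeakerTrack, ...]


def build_job(group_image: str, clips: dict[str, str]) -> ConversationJob:
    """Validate the inputs named in the steps above and bundle them together."""
    if len(clips) != 2:
        raise ValueError("the demo uses exactly two separate voice clips")
    if len(set(clips.values())) != len(clips):
        raise ValueError("each speaker needs a different audio track")
    tracks = tuple(
        SpeakerTrack(label, Path(path)) for label, path in clips.items()
    )
    return ConversationJob(Path(group_image), tracks)


# Example inputs; the file names are placeholders, not files from the demo.
job = build_job("group.png", {"left": "advice.wav", "right": "reply.wav"})
print([track.speaker_label for track in job.tracks])
```

Keeping the clips keyed by a speaker label makes it explicit which voice belongs to which person in the image, which is the part of the setup most likely to be mixed up.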
Tips
- Use contrasting audio tracks, such as one person giving advice and another responding, to create a natural conversational flow
Highlights
“It gets even better. Select a group image. Upload two separate voice clips... and watch as multi-talk generates a real conversation”