Multimodal Video Synthesis Techniques
When I first connected a camera to a text encoder alongside a small audio module, I discovered that multimodal video systems are not simply about one input complementing another.