A joint audio-video model that accurately follows complex instructions.