In this week’s vBrownBag, Principal Software Engineer Dominik Wosiński takes us on a deep dive into Amazon Nova Sonic — AWS’s latest speech-to-speech AI model.
Dominik explores how unified voice models like Nova Sonic are reshaping customer experience, DevOps workflows, and real-time AI interaction, with live demos showing just how natural machine-generated speech can sound.
We cover what makes speech-to-speech difficult, how latency and turn-detection affect conversational design, and why this technology marks the next frontier for AI-driven customer support.
Stick around for audience Q&A, live experiments, and insights on where AWS Bedrock and generative AI are headed next.
👉 Subscribe to vBrownBag for more weekly deep dives with community experts, and learn something new in cloud, DevOps, and AI every episode!
⸻
Timestamps
- 00:00 – Intro & catching up with Eric Wright
- 05:00 – Meet Dominik Wosiński & Halo Radius
- 06:15 – Why speech-to-speech matters
- 10:00 – Challenges with chatbots and customer experience
- 15:00 – Latency, realism, and the human connection
- 20:00 – Evolution of synthetic voices (“Jennifer” demo)
- 25:00 – Why speech is hard for computers
- 30:00 – Architecture of speech-to-speech systems
- 40:00 – Inside Amazon Nova Sonic
- 48:00 – AWS Bedrock integration & limitations
- 52:00 – Pricing, tokens, and performance
- 55:00 – Lessons learned from real customer projects
- 57:00 – Live demo of Nova Sonic in action
- 01:04:00 – Q&A and closing thoughts
 
How to find Dom: