In this week’s vBrownBag, Principal Software Engineer Dominik Wosiński takes us on a deep dive into Amazon Nova Sonic — AWS’s latest speech-to-speech AI model.
Dominik explores how unified voice models like Nova Sonic are reshaping customer experience, DevOps workflows, and real-time AI interaction, with live demos showing just how natural machine-generated speech can sound.
We cover what makes speech-to-speech difficult, how latency and turn-detection affect conversational design, and why this technology marks the next frontier for AI-driven customer support.
Stick around for audience Q&A, live experiments, and insights on where AWS Bedrock and generative AI are headed next.
👉 Subscribe to vBrownBag for more weekly deep dives with community experts, and learn something new in cloud, DevOps, and AI every episode!
⸻
Timestamps
- 00:00 – Intro & catching up with Eric Wright
- 05:00 – Meet Dominik Wosiński & Halo Radius
- 06:15 – Why speech-to-speech matters
- 10:00 – Challenges with chatbots and customer experience
- 15:00 – Latency, realism, and the human connection
- 20:00 – Evolution of synthetic voices (“Jennifer” demo)
- 25:00 – Why speech is hard for computers
- 30:00 – Architecture of speech-to-speech systems
- 40:00 – Inside Amazon Nova Sonic
- 48:00 – AWS Bedrock integration & limitations
- 52:00 – Pricing, tokens, and performance
- 55:00 – Lessons learned from real customer projects
- 57:00 – Live demo of Nova Sonic in action
- 01:04:00 – Q&A and closing thoughts
How to find Dom: