Microsoft VibeVoice: Open-Source Frontier Voice AI

35 points by tosh an hour ago

Great post last night from Simon: https://simonwillison.net/2026/Apr/27/vibevoice/

542458 6 minutes ago
Note that this just covers the Speech-to-Text/Speech-Recognition aspect (à la Whisper); there are also models for long-form Text-To-Speech and streaming Text-To-Speech.

Isn't this the project Microsoft published but then pulled soon after for security/safety reasons? What has changed since then?

542458 4 minutes ago
Look at the "News" section in the README: the original TTS model is gone from this repo (you can still find it in other places), but the STT/ASR, long-form TTS, and streaming TTS models are new.

podgietaru 15 minutes ago

So we've really just settled on Vibe as the verb for AI then?

pryanshu89 9 minutes ago
Why use precise technical language when you can just vibe with your AI system?

giarc 6 minutes ago
I'd be willing to bet it will be "Word of the Year" for 2026. Merriam-Webster had 'slop' for 2025, and 'polarization' for 2024. Is there a prediction market for this?

walthamstow 6 minutes ago

Seems quite heavy for an STT model. Parakeet and Whisper are much smaller and perform great for quick dictation and transcription of longer files. I guess that's due to additional accuracy and speaker diarisation?

The TTS example clip in the repo of 'spontaneous singing' is creepy as fuck