ipotapov 17 minutes ago

I built speech-swift for on-device ASR and TTS. Like Parlor Jarvis it is multilingual, but it is optimized specifically for Apple Silicon: 52 languages and a real-time factor of 0.06, plus speaker diarization and noise suppression. https://github.com/soniqo/speech-swift

unusual_typo 2 hours ago

I shipped an enhanced fork of Parlor (by Fikri Karim, https://news.ycombinator.com/item?id=47652007) that reads several kinds of visual input and uses Supergemma 4 E4B + Supertonic TTS as a multimodal, multilingual AI assistant. It runs entirely on your machine.

What it does:

1. Talk to your screen: It reads your webcam feed, shared screen, PDFs, and video, all at the same time.

2. Natively multilingual: It speaks five languages (English, Korean, Spanish, Portuguese, and French).