Show HN: Parlor Jarvis – Realtime AI (audio+screen in, voice out) & multilingual (github.com)
8 points by unusual_typo 10 hours ago | 2 comments
ipotapov 7 hours ago
I built speech-swift, which focuses on on-device ASR and TTS. It covers similar multilingual ground to Parlor Jarvis, but is optimized specifically for Apple Silicon: 52 languages and a real-time factor of 0.06. It also includes speaker diarization and noise suppression. https://github.com/soniqo/speech-swift
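For anyone unfamiliar with the metric: assuming the usual definition of real-time factor (processing time divided by audio duration), 0.06 would mean a clip is processed in about 6% of its length. A minimal sketch of the arithmetic follows; the function names and the 60-second clip are hypothetical illustrations, not part of speech-swift.

```python
def real_time_factor(processing_seconds: float, audio_seconds: float) -> float:
    """RTF under the common definition: processing time / audio duration."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return processing_seconds / audio_seconds


def speedup_over_real_time(rtf: float) -> float:
    """How many times faster than real time a given RTF implies."""
    if rtf <= 0:
        raise ValueError("rtf must be positive")
    return 1.0 / rtf


# Hypothetical numbers: a 60 s clip processed in 3.6 s gives an RTF of about 0.06.
rtf = real_time_factor(processing_seconds=3.6, audio_seconds=60.0)
speedup = speedup_over_real_time(rtf)
print(f"RTF = {rtf:.2f}, about {speedup:.1f}x faster than real time")
```

An RTF below 1.0 is what makes streaming use feasible, since the model keeps up with incoming audio.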
unusual_typo 10 hours ago
I shipped an enhanced fork of Parlor (by Fikri Karim, https://news.ycombinator.com/item?id=47652007) that reads several kinds of visual input and uses Supergemma 4 E4B + Supertonic TTS as a multimodal, multilingual AI assistant. It runs entirely on your machine.

What it does:

1. Talk to your screen: it reads and understands your webcam, screen share, PDFs, and video at once.
2. Native multilingual: it speaks five languages: English, Korean, Spanish, Portuguese, and French.