| ▲ | ajuhasz 8 hours ago | |
> Are you okay with private and intimate conversations and moments (including of underage family members) being saved for replaying later? One of our core architecture decisions was to use a streaming speech-to-text model. At any given time about 80ms of actual audio is in memory and about 5 minutes of transcribed audio (text) is in memory (this is help the STT model know the context of the audio for higher transcription accuracy). Of these 5 minute transcripts, those that don't become memories are forgotten. So only selected extracted memories are durably stored. Currently we store the transcript with the memory (this was a request from our prototype users to help them build confidence in the transcription accuracy) but we'll continue to iterate based on feedback if this is the correct decision. | ||