Don't be confused if it says "no microphone": the moment you click the record button, it will request browser permission and then start working.

I spoke fast and dropped in some jargon, and it got it all right. I said this and it transcribed it exactly, WebAssembly spelling included:

> Can you tell me about RSS and Atom and the role of CSP headers in browser security, especially if you're using WebAssembly?

▲

skykooler 3 minutes ago | parent | next [-]

Doesn't seem to work for me - tried in both Firefox and Chromium and I can see the waveform when I talk but the transcription just shows "Awaiting audio input".

▲

Oras 3 hours ago | parent | prev | next [-]

Thank you for the link! Their playground in Mistral does not have a microphone. It just uploads files, which does not demonstrate the speed and accuracy, but the link you shared does.

I tried speaking in 2 languages at once, and it picked it up correctly. Truly impressive for real-time.

	▲	druskacik an hour ago | parent [-]
		According to the announcement blog, Le Chat is powered by the new model as well: https://chat.mistral.ai/chat

▲

tekacs 3 hours ago | parent | prev | next [-]

Having built with and tried every voice model over the last three years, real time and non-real time... this is off the charts compared to anything I've seen before.

And open weight too! So grateful for this.

▲

daemonologist 3 hours ago | parent | prev | next [-]

404 on https://mistralai-voxtral-mini-realtime.hf.space/gradio_api/... for me (which shows up in the UI as a little red error in the top right).

▲

jaggederest 2 hours ago | parent | prev | next [-]

It can transcribe the fast sequence in Eminem's Rap God. Really, really impressive.

	▲	rafram 2 hours ago | parent | next [-]
		That's almost certainly in the training data, to be fair.
	▲	keeganpoppen 38 minutes ago | parent | prev [-]
		what a great test hahah

▲

pyprism 2 hours ago | parent | prev | next [-]

Wow, that’s weird. I tried Bengali, but the text was transcribed into Hindi! I know there are some similar words in these languages, but I used pure Bengali that is not similar to Hindi.

▲

derefr 2 hours ago | parent [-]

Well, on the linked page, it mentions "strong transcription performance in 13 languages, including [...] Hindi" but with no mention of Bengali. It probably doesn't know a lick of Bengali, and is just trying to snap your words into the closest language it does know.

	▲	keeganpoppen 37 minutes ago | parent [-]
		It must have some exposure to Bengali, just not enough for them to advertise it. Otherwise it would have a damn hard time.

▲

carbocation an hour ago | parent | prev | next [-]

This model was able to transcribe Bad Bunny lyrics over the sound of the background music, played casually from my speakers. Impressive, to me.

▲

sheepscreek an hour ago | parent | prev | next [-]

I’ve been using AquaVoice for real-time transcription for a while now, and it has become a core part of my workflow. It gets everything: jargon, capitalization, everything. Now I’m looking forward to doing that with 100% local inference!

▲

rafram 2 hours ago | parent | prev | next [-]

Not terrible. It missed or mixed up a lot of words when I was speaking quickly (and not enunciating very well), but it does well with normal-paced speech.
