1bpp 2 days ago

How would this prevent someone from just plugging ElevenLabs into it? Or the inevitable more realistic voice models? Or just a prerecorded spam message? It's already nearly impossible to tell if some speech is human or not. I do like the idea of recovering the emotional information lost in speech -> text, but I don't think it'd help the LLM issue.

SrslyJosh 2 days ago | parent | next [-]

Detecting "human speech" means shutting out people who cannot speak and rely on TTS for verbal communication.

estimator7292 2 days ago | parent [-]

Also speech impediments, accents, physical disabilities, etc etc.

Tech culture just refuses to even be aware of people as physical beings. It's just spherical users in a vacuum and if you don't fit the mold, tough.

layman51 2 days ago | parent | prev | next [-]

Or also a genuine human voice reading a script that’s partially or almost entirely LLM-written? I think there must be some video content creators who do that.

siim 2 days ago | parent | prev [-]

True. However, making voice input has higher friction than typing "chatgpt write me a reply".