TeMPOraL 16 hours ago

I have a somewhat different perspective. The way I see it, for the past 10+ years, the major vendors were going out of their way to try for a generic NLP interface. Even then, it was already known that a controlled language[0] plus environmental context could allow for highly functional voice control. But for some reason[1], the vendors really wanted assistants to guess what people mean. As a result, we got 10+ years of shitty assistants that couldn't reliably do even the simplest things - not even set a goddamn timer - and never gained much capability either: it's hard to ship many complex features when you can't get the few simplest ones right.
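
To make [0] concrete, here's a minimal sketch of a controlled-language command parser (the command set is made up, not from any actual product). The grammar is small, fixed, and documented; the user learns it once, and anything outside it gets rejected instead of misinterpreted:

    # Minimal controlled-language parser; the command set is hypothetical.
    COMMANDS = {
        "set timer": lambda arg: f"timer set for {arg}",
        "play music": lambda arg: f"playing {arg or 'default playlist'}",
        "turn off lights": lambda arg: "lights off",
    }

    def parse(utterance):
        text = utterance.lower().strip()
        for phrase, action in COMMANDS.items():
            if text.startswith(phrase):
                arg = text[len(phrase):].strip() or None
                return action(arg)
        # No guessing: input outside the grammar is rejected outright.
        return "unrecognized command"

    print(parse("set timer 10 minutes"))   # timer set for 10 minutes
    print(parse("maybe play some tunes"))  # unrecognized command

Deterministic, trivially testable, and the failure mode is "say it again properly", not "silently do the wrong thing".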

This was a bad direction then. Now, for better or worse, all those vendors got their miracle: LLMs are literally plug-and-play boxes that implement the "parse arbitrary natural-language queries and map them to system capabilities" functionality. Thanks to LLMs, voice interfaces could finally start working - if only the vendors could also get the "having useful functionality" part right.
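
For illustration, that "plug-and-play box" pattern looks roughly like this (call_llm is a hypothetical stand-in for whatever vendor API is actually used, and the capability names are made up too). The LLM's whole job is translating free-form speech into a structured intent against a fixed capability list, and anything it can't map gets refused rather than guessed:

    import json

    # Hypothetical capability list; the pattern is the point, not the API.
    CAPABILITIES = ["set_timer", "play_music", "lights_off"]

    PROMPT = (
        "Map the user's request onto one of these capabilities: "
        + ", ".join(CAPABILITIES)
        + '. Reply with JSON like {"capability": "...", "argument": "..."},'
        + ' or {"capability": null} if nothing fits.\nUser: '
    )

    def call_llm(prompt):
        # Stand-in for a vendor API call (cloud model, local model, whatever).
        raise NotImplementedError

    def handle(utterance):
        reply = json.loads(call_llm(PROMPT + utterance))
        if reply.get("capability") not in CAPABILITIES:
            return {"capability": None}  # refuse rather than guess
        return reply

    # handle("could you maybe time ten minutes for me?") would ideally
    # yield {"capability": "set_timer", "argument": "10 minutes"}

Note it's the same dispatch structure as the sketch above; the LLM just replaces the rigid startswith() matching with robust natural-language parsing.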

(Note: this is distinct from "everything is a chatbot". That's a bad idea simply because typing text sucks: spelling out your thoughts in prose is about the least efficient way to interact with a tool. Voice interfaces are an exception here.)

--

[0] - https://en.wikipedia.org/wiki/Controlled_natural_language

[1] - Perhaps it was this weird idea that controlled languages are too hard for the general population, too much like programming, or some such. They're not. More generally, we've always had to "meet in the middle" with our machines, and that was - and remains - a highly successful approach.