lolinder 18 hours ago

> Prior to LLMs, though, we could have been making judicious use of simple algorithmic approaches to process natural language constructs as command language. We didn't see a lot of interest in it.

Siri was released in 2011, and Alexa and Google Assistant followed soon thereafter. Companies spent tens of millions of dollars improving their algorithmic NLP because voice interfaces were "the future". I took a class in the late 2010s that went over all of the methodologies that they used for intent parsing and slot filling. All of that has been largely abandoned at this point in favor of LLMs for everything.
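(For anyone unfamiliar with the terms: intent parsing decides *what* the user wants, slot filling extracts the parameters. A toy rule-based sketch in Python - the intent names and patterns are made up, and the production systems of that era used trained classifiers and sequence taggers rather than hand-written regexes:)

    import re

    # Toy intent parsing + slot filling. Intents and patterns are illustrative only.
    INTENTS = {
        "set_timer": re.compile(r"set a timer for (?P<duration>\d+) (?P<unit>seconds?|minutes?|hours?)"),
        "create_reminder": re.compile(r"remind me to (?P<task>.+) at (?P<time>\d{1,2}(?::\d{2})?\s*(?:am|pm))"),
    }

    def parse(utterance):
        """Return {"intent": ..., "slots": {...}} for the first matching intent, or None."""
        text = utterance.lower().strip()
        for intent, pattern in INTENTS.items():
            m = pattern.search(text)
            if m:
                return {"intent": intent, "slots": m.groupdict()}
        return None

    print(parse("Set a timer for 10 minutes"))
    # -> {'intent': 'set_timer', 'slots': {'duration': '10', 'unit': 'minutes'}}
    print(parse("Remind me to call mom at 5 pm"))
    # -> {'intent': 'create_reminder', 'slots': {'task': 'call mom', 'time': '5 pm'}}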

My hope is that at some point people will come back to these UI paradigms as we realize the limitations of "everything is a chat bot". There's a simplicity to the limited, context-free voice assistants that handled a specific set of use cases, and the push to turn everything into a chatbot is starting to erode the legitimate use cases that came out of that era, like timers and reminders.

TeMPOraL 17 hours ago

I have a somewhat different perspective. The way I see it, for the past 10+ years the major vendors went out of their way to chase a generic NLP interface. It was already known at that point that a controlled language[0] plus environmental context could support highly functional voice control. But for some reason[1], the vendors really wanted their assistants to guess what people mean. As a result, we got 10+ years of shitty assistants that couldn't reliably do anything, not even set a goddamn timer, and didn't offer much functionality either - it's hard to have many complex features when you can't get the few simplest ones right.
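(To illustrate what that could look like - a deliberately tiny controlled grammar of the form <verb> <device> [<room>], with a missing room filled in from environmental context. The vocabulary and the context handling here are made up; it's a sketch, not anyone's shipped design:)

    # Toy controlled command language: <verb> <device> [<room>].
    # Vocabulary is fixed and small; anything outside it is rejected rather than
    # guessed at. The room falls back to environmental context (e.g. which
    # room's microphone heard the command).
    VERBS = ("turn on", "turn off", "dim")
    DEVICES = ("lights", "fan", "heater")
    ROOMS = ("kitchen", "bedroom", "living room")

    def parse_command(utterance, current_room):
        text = utterance.lower().strip()
        verb = next((v for v in VERBS if text.startswith(v)), None)
        device = next((d for d in DEVICES if d in text), None)
        room = next((r for r in ROOMS if r in text), current_room)
        if verb is None or device is None:
            raise ValueError("outside the controlled language")
        return {"action": verb, "device": device, "room": room}

    print(parse_command("turn on the lights", current_room="kitchen"))
    # -> {'action': 'turn on', 'device': 'lights', 'room': 'kitchen'}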

Chasing the generic interface was a bad direction then. Now, for better or worse, all those vendors got their miracle: LLMs are literally plug-and-play boxes that implement the "parse arbitrary natural-language queries and map them to system capabilities" functionality. Thanks to LLMs, voice interfaces could actually start working, if only the vendors could also get the "having useful functionality" part right.
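(Concretely, the "plug-and-play box" amounts to something like the sketch below: the model's only job is to map an utterance onto one of a fixed set of capabilities and return structured output, which the system validates before acting on it. `call_llm` is a stand-in for whichever model API you actually use, and the capability names are made up:)

    import json

    # Fixed set of capabilities the system actually implements.
    CAPABILITIES = {
        "set_timer": "Set a countdown timer. Slots: duration_seconds (int).",
        "create_reminder": "Create a reminder. Slots: task (str), time (str).",
    }

    SYSTEM_PROMPT = (
        "Map the user's request to exactly one capability and return JSON like "
        '{"capability": "<name>", "slots": {...}}. Capabilities:\n'
        + "\n".join(f"- {name}: {desc}" for name, desc in CAPABILITIES.items())
    )

    def call_llm(system_prompt, user_text):
        # Stand-in for an actual LLM API call returning the model's text output.
        raise NotImplementedError

    def handle(utterance):
        raw = call_llm(SYSTEM_PROMPT, utterance)
        parsed = json.loads(raw)  # validate the structured output before acting on it
        if parsed.get("capability") not in CAPABILITIES:
            raise ValueError("model proposed an unknown capability")
        return parsed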

(Note: LLM-powered voice control is distinct from "everything is a chat bot". That's a bad idea simply because typing text sucks; typing out your thoughts in prose is about the least efficient way to interact with a tool. Voice interfaces are an exception here.)

--

[0] - https://en.wikipedia.org/wiki/Controlled_natural_language

[1] - Perhaps this weird idea that controlled languages are too hard for the general population, too much like programming, or some such. They're not. More generally, we've always had to "meet in the middle" with our machines, and that was - and remains - a highly successful approach.