cxr a day ago

It's funny that we're getting so much attention funneled towards the thought-to-machine I/O problem now that LLMs are on the scene.

If the improvements are beneficial now, then surely they were beneficial before.

Prior to LLMs, though, we could have been making judicious use of simple algorithmic approaches to process natural language constructs as command language. We didn't see a lot of interest in it.

lolinder 17 hours ago | parent | next [-]

> Prior to LLMs, though, we could have been making judicious use of simple algorithmic approaches to process natural language constructs as command language. We didn't see a lot of interest in it.

Siri was released in 2011, and Alexa and Google Assistant followed soon thereafter. Companies spent tens of millions of dollars improving their algorithmic NLP because voice interfaces were "the future". I took a class in the late 2010s that went over all of the methodologies that they used for intent parsing and slot filling. All of that has been largely abandoned at this point in favor of LLMs for everything.

My hope is that at some point people will come back to these UI paradigms as we realize the limitations of "everything is a chat bot". There's a simplicity to the context-free limited voice assistants that had a set of specific use cases they could handle, and the effort to chatbot everything is starting to destroy the legitimate use cases that came out of that era like timers and reminders.

TeMPOraL 16 hours ago | parent [-]

I have a somewhat different perspective. The way I see it, for the past 10+ years, the major vendors were going out of their way to try for a generic NLP interface. By that point, it was already known that controlled language[0] + environmental context could allow for highly functional voice control. But for some reason[1], the vendors really wanted assistants to guess what people mean. As a result, we got 10+ years of shitty assistants that couldn't reliably do anything, not even set a goddamn timer, and weren't able to do much either - it's hard to have many complex features when you can't get the few simplest ones right.

This was a bad direction then. Now, for better or worse, all those vendors got their miracle: LLMs are literally plug-and-play boxes that implement the "parse arbitrary natural-language queries and map them to system capabilities" functionality. Thanks to LLMs, voice interfaces could actually start working. If vendors could also get the "having useful functionality" part right.

(Note: this is distinct from "everything is a chat bot". That's a bad idea simply because typing text sucks, specifically typing out your thoughts in prose form is about the least efficient way to interact with a tool. Voice interfaces are an exception here.)

--

[0] - https://en.wikipedia.org/wiki/Controlled_natural_language

[1] - Perhaps this weird idea that controlled languages are too hard for the general population, too much like programming, or such. They're not. More generally, we've always had to "meet in the middle" with our machines, and it was - and remains - a highly successful approach.
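
For illustration, here is roughly what "controlled language" could look like for the timer case. This is only a minimal Python sketch under assumptions: the grammar ("set [a] timer for <number> <unit>"), `parse_command`, and `TimerCommand` are all made up for this example and not taken from any real assistant.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Hypothetical controlled grammar: "set [a] timer for <number> <unit>".
# Anything outside this fixed phrasing is rejected rather than guessed at.
TIMER_PATTERN = re.compile(
    r"^set (?:a )?timer for (?P<amount>\d+) (?P<unit>second|minute|hour)s?$"
)

UNIT_SECONDS = {"second": 1, "minute": 60, "hour": 3600}


@dataclass
class TimerCommand:
    """Illustrative command object; the name is an assumption."""
    seconds: int


def parse_command(utterance: str) -> Optional[TimerCommand]:
    """Return a TimerCommand if the utterance fits the grammar, else None."""
    # Lowercase and collapse runs of whitespace before matching.
    normalized = " ".join(utterance.lower().split())
    match = TIMER_PATTERN.match(normalized)
    if match is None:
        return None
    return TimerCommand(int(match["amount"]) * UNIT_SECONDS[match["unit"]])
```

The trade-off is the one described above: `parse_command("Set a timer for 5 minutes")` gives a well-defined result, while free-form phrasing returns `None` and the user has to meet the machine in the middle.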

samtheprogram a day ago | parent | prev | next [-]

Uh, we did…? Alexa, Siri, Ok Google…

A lot of money was poured into that goal, but because every type of action required a handcrafted integration, they were either costly to develop or extremely limited. That’s no longer the case.

cxr 3 hours ago | parent [-]

> Alexa, Siri, Ok Google

Complex digital assistants aiming to be do-everything secretaries are not what I had in mind when I said "simple algorithmic approaches".

That aside, which of those were attempts to improve input to a computer like the project submitted here? Everything you listed was most focused on (a) trying to establish voice as a valid input method (b) to create a new class of applications (c) for more-or-less locked down devices. (The one assistant that's closest to what I'm referring to—but still misses the mark—is the one you didn't mention: Cortana.)

> because every type of action required a handcrafted integration, they were either costly to develop or extremely limited

That describes all conventional software: think of everything you do on your computer. How many lines of code across how many different software packages, each handcrafted, are on your computer? And how narrow versus broad and featureful is each one (calc.exe, for example)? "Do one thing and do it well" is an entire, highly regarded philosophical outlook about how to make great software.

regularfry 16 hours ago | parent | prev | next [-]

COBOL and SQL would like a word.

throwaway290 17 hours ago | parent | prev [-]

People have some solution so they are searching for problems it can fit. Doesn't mean it's the best one...