raw_anon_1111 3 hours ago

The problem was that Siri didn’t need new technology to be at least as good as Alexa, just more monkeys at the keyboard.

Classic Alexa, Gemini and Siri are all just intent-based pattern-matching systems: you brute-force all of the phrases you want to match on (utterances), map those to intents, and have “slots” for the variable parts, like where you are coming from and where you are going.

Then you trigger an API. I’ve worked with the underlying technology behind Alexa for years on AWS - Amazon Lex - with call centers (Amazon Connect).
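
A rough sketch of that utterance -> intent -> slots flow, to make the idea concrete. This is not how Lex or Alexa is implemented internally; the intent names, sample utterances, and the `match` helper are all made up for illustration, and plain regexes stand in for whatever matching the real services do.

```python
import re

# Hypothetical intents: each one lists sample utterances, with {slot}
# placeholders marking the variable parts.
INTENTS = {
    "GetDirections": [
        "how do i get from {origin} to {destination}",
        "directions from {origin} to {destination}",
    ],
    "SetTimer": [
        "set a timer for {duration}",
    ],
}


def compile_utterance(template):
    """Turn an utterance template into a regex with one named group per slot."""
    # re.split with a capturing group alternates literal text and slot names.
    parts = re.split(r"\{(\w+)\}", template)
    pattern = ""
    for i, part in enumerate(parts):
        if i % 2 == 0:
            pattern += re.escape(part)      # literal text
        else:
            pattern += f"(?P<{part}>.+?)"   # slot value
    return re.compile(f"^{pattern}$", re.IGNORECASE)


# The "brute force" part: every utterance of every intent becomes a pattern.
COMPILED = [
    (intent, compile_utterance(template))
    for intent, templates in INTENTS.items()
    for template in templates
]


def match(text):
    """Return (intent, slots) for the first matching utterance, else (None, {})."""
    cleaned = text.strip().rstrip("?.!")
    for intent, pattern in COMPILED:
        found = pattern.match(cleaned)
        if found:
            return intent, found.groupdict()
    return None, {}


if __name__ == "__main__":
    intent, slots = match("Directions from Boston to New York?")
    # A real system would now call the API mapped to this intent.
    print(intent, slots)
```

Anything that is not covered by a listed utterance simply falls through to `(None, {})`, which is why coverage comes down to how many phrasings someone sat down and typed in.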

On the other hand, the capabilities and reliability of both Alexa and Google’s voice assistant regressed once they moved to an LLM-based system. It seems to be a hard problem and I don’t understand why.

I’ve found plenty of free text input -> LLM -> standard JSON output -> call API implementations. It seems like it would just be another LLM + brute force issue.
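
For reference, the free text -> LLM -> JSON -> API pattern looks roughly like this. It is only a sketch: `fake_llm` is a stand-in that returns a canned reply instead of calling a real model, and the `ALLOWED` schema and `parse_intent` helper are hypothetical names, not any vendor's API.

```python
import json

# Hypothetical schema: which intents exist and which slots each one requires.
ALLOWED = {
    "GetDirections": {"origin", "destination"},
    "SetTimer": {"duration"},
}


def fake_llm(prompt):
    """Stand-in for a real model call; returns a canned, well-formed reply."""
    return (
        '{"intent": "GetDirections", '
        '"slots": {"origin": "Boston", "destination": "New York"}}'
    )


def parse_intent(text, llm=fake_llm):
    """Ask the model for JSON, then validate it before anything calls an API."""
    raw = llm(f"Return JSON with 'intent' and 'slots' for: {text}")
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return None  # the model did not return valid JSON
    if not isinstance(data, dict):
        return None
    intent = data.get("intent")
    slots = data.get("slots")
    if not isinstance(intent, str) or intent not in ALLOWED:
        return None  # unknown or missing intent
    if not isinstance(slots, dict) or set(slots) != ALLOWED[intent]:
        return None  # wrong or missing slots
    return intent, slots


if __name__ == "__main__":
    result = parse_intent("how do I get from Boston to New York")
    # A real system would dispatch to the API for result[0] here.
    print(result)
```

The validation step is where the brute force comes back in: the model handles the phrasing, but someone still has to enumerate the intents and slots it is allowed to produce.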