airstrike 2 hours ago

Computer use is such a terrible idea. It's slow, insecure, error prone, expensive.

I guess if you're trying to get people to tokenmaxx it may look like a valid strategy, but ain't no way this will be delightful to users.

I think it's a symptom of just not understanding how LLMs should interface with the OS because we're still in their early days.

Eventually there'll be an iPhone moment for the ergonomics of LLM usage outside of coding

gdudeman 40 minutes ago

Computer use is a great idea. It gets the job done when nothing else will.

If you're a person trying to get their job done at a big company, but half your job is in 1-2 proprietary tools or is stuck behind an API you can't program against, computer use can allow you, a non-techie, to do your job more efficiently.

I think it's an awesome way to circumvent gatekeepers and the IT department to let people accomplish their goals.

uejfiweun 15 minutes ago

Yeah, it's not that computer use is the most theoretically optimal paradigm, but there's a reasonable case that, given the constraints of modern software systems and how they're built, it's the most realistically optimal paradigm.

thorum an hour ago

The “correct”, elegant way for AI to interact with existing software would take decades and billions of dollars to build. Someone would have to do the hard work of building new APIs, solving decades of accessibility issues, etc.

Or you can show an AI screenshots and ask it where to click.
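
A minimal sketch of that screenshot-and-click loop, assuming three hypothetical callables (`take_screenshot`, `ask_model`, `click`) that stand in for whatever screen capture, model, and input layers are actually used. None of these names come from a real library.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Click:
    """A click target in screen coordinates (hypothetical action type)."""
    x: int
    y: int


def run_click_loop(
    take_screenshot: Callable[[], bytes],
    ask_model: Callable[[bytes, str], Optional[Click]],
    click: Callable[[Click], None],
    goal: str,
    max_steps: int = 10,
) -> int:
    """Repeat: capture the screen, ask the model where to click, click.

    Stops when the model returns None (treated here as "done") or after
    max_steps iterations. Returns the number of clicks performed.
    """
    clicks = 0
    for _ in range(max_steps):
        image = take_screenshot()
        action = ask_model(image, goal)
        if action is None:
            break
        click(action)
        clicks += 1
    return clicks
```

The `max_steps` cap is a design choice in this sketch, not something the comment specifies: every iteration costs a model call, so an unbounded loop is where the "slow, expensive" complaint would bite hardest.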

jubilanti 5 minutes ago

It takes decades and billions of dollars to develop APIs?

sarreph an hour ago

I disagree if your application is networked. Most SaaS is built on RESTful APIs that can be converted trivially into interfaces / contracts for tool use.
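
A rough sketch of what that conversion could look like for one endpoint. The endpoint description format and the output shape (`name`, `description`, `input_schema`) are assumptions loosely modeled on common function-calling formats; field names vary by provider, and `CREATE_INVOICE` is a made-up example, not a real API.

```python
from typing import Any, Dict

# Hypothetical, simplified endpoint description (not a real OpenAPI document).
CREATE_INVOICE: Dict[str, Any] = {
    "method": "POST",
    "path": "/invoices",
    "summary": "Create an invoice",
    "params": {
        "customer_id": {"type": "string", "required": True},
        "amount_cents": {"type": "integer", "required": True},
        "memo": {"type": "string", "required": False},
    },
}


def endpoint_to_tool(endpoint: Dict[str, Any]) -> Dict[str, Any]:
    """Turn one endpoint description into a JSON-Schema-style tool definition."""
    # Derive a tool name from method and path, e.g. "POST /invoices" -> "post_invoices".
    slug = (
        endpoint["path"].strip("/").replace("/", "_").replace("{", "").replace("}", "")
    )
    name = f"{endpoint['method'].lower()}_{slug}"
    # Copy parameter types into a JSON Schema "properties" object.
    properties = {p: {"type": spec["type"]} for p, spec in endpoint["params"].items()}
    # Only parameters explicitly marked required go into the "required" list.
    required = [p for p, spec in endpoint["params"].items() if spec.get("required")]
    return {
        "name": name,
        "description": endpoint["summary"],
        "input_schema": {
            "type": "object",
            "properties": properties,
            "required": required,
        },
    }
```

This covers only the simple case of flat, typed parameters. Authentication, pagination, and nested request bodies are left out of the sketch.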

chatmasta 24 minutes ago

So you can either wait for every application to do that, or at least make it possible for an LLM to do it… or you can make the LLM use a computer interface that works with every application by definition.

orbital-decay 28 minutes ago

Spreadsheets are such a terrible idea. They may look like a valid tool, but ain't no way they're delightful to users. Most of the time people need a database instead. Eventually there'll be an iPhone moment for this.

Meanwhile, the entire world economy:

nzach an hour ago

> Computer use is such a terrible idea. It's slow, insecure, error prone, expensive.

And yet having an agent able to use a computer on your behalf is really useful.

Recently I gave a NixOS VM to my Hermes agent and it has been a good experience. I don't really care if he destroys the machine, since I can just roll back to an earlier version, and for any meaningful data he creates for me I make sure he creates a repo, commits, and pushes to my private Gitea instance.

dbbk 31 minutes ago

> And yet having an agent able to use a computer on your behalf is really useful.

I honestly cannot think of a single use case

airstrike an hour ago

> And yet having an agent able to use a computer on your behalf is really useful.

It is, but there's no need for it to be viewing your screen, browsing websites and watching ads.

That stuff is for humans, not for LLMs.

nzach 37 minutes ago

Sure, I don't want an agent watching MY screen. That's why I gave him his own environment, and pretty quickly he discovered that you can open Chrome and make it render to a framebuffer, so he is able to 'view' the website. And apparently with this he is able to bypass a lot of 'anti-bot' measures.
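
The comment doesn't say how the agent rendered to a framebuffer. One plausible route (an assumption here, not the poster's confirmed setup) is Chrome's headless mode with its `--screenshot` flag; a virtual display such as Xvfb would be another. A minimal sketch, where the binary names `google-chrome` and `chromium` are guesses that differ per platform:

```python
import shutil
import subprocess
from typing import List, Optional


def build_screenshot_command(
    browser: str, url: str, out_path: str, width: int = 1280, height: int = 800
) -> List[str]:
    """Build an argv that asks Chrome to render a page off-screen to a PNG."""
    return [
        browser,
        "--headless",
        f"--screenshot={out_path}",
        f"--window-size={width},{height}",
        url,
    ]


def capture(url: str, out_path: str) -> Optional[str]:
    """Run the command if a Chrome/Chromium binary is on PATH, else return None."""
    # Assumed binary names; adjust for the actual environment.
    browser = shutil.which("google-chrome") or shutil.which("chromium")
    if browser is None:
        return None
    subprocess.run(
        build_screenshot_command(browser, url, out_path), check=True, timeout=60
    )
    return out_path
```

The resulting PNG is what would be handed to the model as its 'view' of the page. Whether this gets past any particular anti-bot measure is not something the sketch can show.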

api an hour ago

It's great for testing and QA automation for UIs. It's also possibly good for the visually impaired.

orbital-decay 9 minutes ago

UI QA only works well if your model plausibly matches the average user behavior and/or real-world edge cases. These models are far from that, and they are much less random than you'd like them to be for fuzzing (mode collapse).