arjunchint 5 hours ago

Hey cool stuff since last update!

I still don't buy the "we needed it to be a whole Browser and not a Chrome Extension" argument:

- your interface is still literally a chrome extension side panel

- none of the agentic browsers from the bigger players like Atlas and Comet really took off either

I do think the server side integration is required:

- with rtrvr.ai, a ton of users are integrating our web agent Chrome extension via Remote MCP from chatgpt.com, as well as triggering it remotely as an API endpoint. As I understand it, your implementation is limited to local connections.

- the biggest unlock for users is running at scale: being able to launch a hundred cloud browsers, do a task, and return results while you do other things. So we see hybrid cloud/local execution as the key unlock for this year

Your workflow pipeline is really cool! Any blog post/summary on how you set it up?

Last year was a lot of technical builders exploring the capabilities, and I am excited for this year of making these agentic browsers useful!

felarof 4 hours ago | parent | next [-]

Thanks!

> whole Browser and not a Chrome Extension argument

Both of us are definitely biased to think our own approach is better :)

But without owning the binary, we couldn't have shipped today's feature -- an agent with access to your filesystem that can run shell commands, like Claude Cowork.

> your interface is still literally a chrome extension side panel

Yep, our interface is a chrome extension to make iterating on the UX faster. But it uses a ton of C++ APIs that we expose under `chrome.browseros.*`

> Your workflow pipeline is really cool! Any blog post/summary on how you set it up?

Thanks! We'll look into publishing a blog soon!

arjunchint 3 hours ago | parent [-]

> But without owning the binary, we couldn't have shipped today's feature -- Agent with access to your filesystem and being able to run shell commands like Claude Cowork

A Chrome extension can also access local files and execute LLM-generated code in sandboxes.

johnsmith1840 3 hours ago | parent | prev [-]

Extensions are limited though.

One simple example is an extension can't see cross origin iframes. This means it could never do something like fill out a payment form for you if it's an extension.

Limited computation and action space is another, as are bot detection systems.

For example, a JavaScript method trying to automate something like Microsoft Word in an iframe will have a tough time, because the second you inject code in there they will block you.

arjunchint 3 hours ago | parent [-]

> One simple example is an extension can't see cross origin iframes

Sounds like a skill issue; our web agent is able to interact with cross-origin iframes, for example to solve captchas: https://www.youtube.com/watch?v=LD3afouKPYc

We honestly haven't faced any bot detection or blocking issues. Owning the browser layer exposes you to much more detection; just look at Comet getting blocked on Amazon, etc.

johnsmith1840 3 hours ago | parent | next [-]

With specific user permission to do so, sure, but in general it is blocked.

johnsmith1840 3 hours ago | parent | prev [-]

You're still limited in lots of annoying ways though

quarkcarbon279 2 hours ago | parent [-]

What permissions are you talking about? No user permissions or other insecure permissions are needed to navigate cross-origin iframes, shadow DOMs, and the like. It comes down to your architecture choices and capabilities: rtrvr can navigate these different realms without ever taking debugger or other such insecure permissions.