| franze 4 hours ago | |
Congrats on shipping. How does it compare to Agent Browser by Vercel? | ||
| antves 3 hours ago | |
Thanks for asking! There are a few core differences:

1. We expose a higher-level interface, which lets the agent think about what to do as opposed to how to do it.
2. We developed a token-efficient representation of webpages that combines visual and textual elements and is heavily optimized for what LLMs are good at.
3. Because we control the agentic loop, we can do fancy things with context injection, compression, asynchronous manipulation, etc., which are impossible to achieve when exposing the navigation interface.
4. We use a coding agent under the hood, so it can express complex actions efficiently and effectively compared to the CLI interface that agent-browser exposes.
5. Because we control the agent, we can use small and efficient LLMs, which make the system much faster, cheaper, and more reliable.

Also, our service comes with batteries included: the agent can use browsers in our cloud with auto-captcha solvers and stealth mode, we can proxy your own IP, etc. | ||