Show HN: PageAgent, A GUI agent that lives inside your web app (alibaba.github.io)
49 points by simon_luv_pho 4 hours ago | 28 comments
Hi HN,

I'm building PageAgent, an open-source (MIT) library that embeds an AI agent directly into your frontend.

I built this because I believe there's a massive design space for deploying general agents natively inside the web apps we already use, rather than treating the web merely as a dumb target for isolated bots.

Currently, most AI agents operate from external clients or server-side programs, effectively leaving web development out of the AI ecosystem. I'm experimenting with an "inside-out" paradigm instead. By dropping the library into a page, you get a client-side agent that interacts natively with the live DOM tree and inherits the user's active session out of the box, which works perfectly for SPAs.

To handle cross-page tasks, I built an optional browser extension that acts as a "bridge". This allows the web-page agent to control the entire browser with explicit user authorization. Instead of a desktop app controlling your browser, your web app is empowered to act as a general agent that can navigate the broader web.

I'd love to start a conversation about the viability of this architecture, and what you all think about the future of in-app general agents. Happy to answer any questions!
simon_luv_pho 4 hours ago
This is highly experimental right now, but here are some quick links for anyone wanting to dig deeper:

- GitHub: https://github.com/alibaba/page-agent
- Live Demo (No sign-up): https://alibaba.github.io/page-agent/ (you can drag the bookmarklet from here to try it on other sites)
- Browser Extension: https://chromewebstore.google.com/detail/page-agent-ext/akld...

I'd be really interested in feedback on the security model of giving client-side agents extension-bridge access, and I'm happy to take questions on the implementation!
pscanf 3 hours ago
Very cool! I'm particularly impressed by the bookmark "trick" to install it on a page. Despite having spent 15 years developing for the browser, I had somehow missed that feature of the bookmarks bar. But awesome UX for people to try out the tool. Congrats!
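For anyone else who missed that feature: a bookmarklet is just a bookmark whose URL uses the `javascript:` scheme, so clicking it runs code in the current page. Below is a minimal sketch of how such a link could be built. The function name `buildBookmarklet` and the script URL `https://example.com/agent.js` are placeholders for illustration, not PageAgent's actual loader or API.

```typescript
// Build a bookmarklet URL that injects an external script into the current page.
// The script URL passed in is a placeholder, not PageAgent's real loader.
function buildBookmarklet(scriptUrl: string): string {
  // JSON.stringify yields a safely quoted JavaScript string literal.
  const src = JSON.stringify(scriptUrl);
  const body =
    "(function(){" +
    "var s=document.createElement('script');" +
    "s.src=" + src + ";" +
    "document.head.appendChild(s);" +
    "})();";
  // Percent-encode the body so it can be stored as a valid bookmark URL.
  return "javascript:" + encodeURIComponent(body);
}

// Hypothetical usage: put this string in the href of a link that users
// drag to their bookmarks bar.
const href = buildBookmarklet("https://example.com/agent.js");
console.log(href);
```

Note that pages with a strict Content-Security-Policy may block the injected script, so this approach will likely not work on every site.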
general_reveal 2 hours ago
I’ve been thinking about something like this. If it’s just a one-line script import, how the heck are you trusting natural language to translate to commands for an arbitrary UI? The only thing I can think of is you had the AI rewrite and embed selectors on the entire build file and work with that?
mentalgear 2 hours ago
> Data processed via servers in Mainland China

Appreciate the transparency, but maybe you could add some (preferably European) alternatives?
dzink 2 hours ago
Is this affiliated with the Chinese company Alibaba? Any chance data goes there too?
MeteorMarc 2 hours ago
Confusing name because of the existence of Pageant, the PuTTY agent.
Mnexium 2 hours ago
Curious - how does it perform with captchas and other "are you human" stuff on the web?
coreylane 2 hours ago
Looks cool! Are you open to adding AWS Bedrock or LiteLLM support?
popalchemist an hour ago
Does it support long-click / click-and-drag?
jauntywundrkind 3 hours ago
Not exactly the same but I'd also point to Paul Kinlan's FolioLM as a very interesting project in this space. A very nice browser extension,

> Collect and query content from tabs, bookmarks, and history - your AI research companion. FolioLM helps you collect sources from tabs, bookmarks, and history, then query and transform that content using AI.

https://github.com/PaulKinlan/NotebookLM-Chrome

https://chromewebstore.google.com/detail/foliolm/eeejhgacmlh...