visarga 6 hours ago

Interesting, a computer-use environment. I made a CUA benchmark too: 200 web tasks with internal code-based evaluation. You can integrate them if you want.

https://github.com/UiPath/uipath_enterprise_benchmark

https://arxiv.org/abs/2511.17131

frabonacci 5 hours ago

Hey visarga - I'm the founder of Cua. We might have met at the CUA ICML workshop? The OS-agnostic VNC approach of your benchmark is smart and would make integration easy. We're open to collaborating - want to shoot me an email at f@trycua.com?