davideuler 5 hours ago

I created a dashboard that tracks the stability of OpenClaw and Hermes. It shows a stability score (10 is the most stable), which is calculated by having GPT analyze GitHub issues.

Lots of friends have asked me which version of OpenClaw/Hermes I would recommend as stable. I had no clue, and I don't update my own OpenClaw/Hermes very often, to avoid landing on an unstable version. So I created the Agent Watch dashboard.

https://agentwatch.aicompass.dev/

schipperai 5 hours ago | parent [-]

Very cool. How do you classify negative signals?

davideuler 4 hours ago | parent | next [-]

I've gone through several iterations to improve the accuracy of the release stability scores. I've also open-sourced the project, so you can contribute to the dashboard and make it more useful: https://github.com/davideuler/agent-watch

THANK YOU to everyone who gave feedback on this tiny project.

davideuler 5 hours ago | parent | prev [-]

GPT analyzes each issue to decide whether it is negative, and also whether it relates to a core feature. I iterated on this several times, and the dashboard now seems more reasonable than the initial version. I plan to open-source the project soon so that others can contribute to building a better stability dashboard for the agents we use daily.
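
To make the idea concrete, here is a rough sketch of how the two per-issue judgments (negative or not, core feature or not) could be folded into a 0-10 score. It is not taken from the agent-watch code: the `IssueLabel` type, the `stability_score` function, the weights, and the formula are all assumptions for illustration, and the GPT call that would produce the labels is left out.

```python
from dataclasses import dataclass


@dataclass
class IssueLabel:
    """Hypothetical result of the GPT classification for one GitHub issue."""
    is_negative: bool      # does the issue report a bug, crash, or regression?
    is_core_feature: bool  # does the issue concern a core feature?


def stability_score(labels, core_weight=3.0, other_weight=1.0):
    """Map the labelled issues of one release to a 0-10 score (10 = most stable).

    The weights and the formula are assumptions, not the dashboard's real logic.
    """
    if not labels:
        # No issues filed against the release: treat it as fully stable.
        return 10.0
    # Only negative issues add a penalty; core-feature issues weigh more.
    penalty = sum(
        core_weight if label.is_core_feature else other_weight
        for label in labels
        if label.is_negative
    )
    # Worst case: every issue is negative and touches a core feature.
    worst_case = core_weight * len(labels)
    return round(10.0 * (1.0 - penalty / worst_case), 1)


labels = [
    IssueLabel(is_negative=True, is_core_feature=True),
    IssueLabel(is_negative=True, is_core_feature=False),
    IssueLabel(is_negative=False, is_core_feature=False),
    IssueLabel(is_negative=False, is_core_feature=True),
]
print(stability_score(labels))  # prints 6.7
```

Weighting core-feature issues more heavily is one simple way to keep a release with a few cosmetic bug reports from scoring as badly as one with a broken core workflow.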