m00dy 4 days ago

May I ask about your internal benchmark? I'm building a new set of benchmarks and a testing suite for agentic workflows using deepwalker [0]. How do you design your benchmark suite? It would be really cool if you could give more details.

[0] https://deepwalker.xyz

thecupisblue 4 days ago

Shared a bit more here - https://news.ycombinator.com/item?id=46314047.

But it's pretty rudimentary, nothing special. Also, I did not know about deepwalker, looks quite interesting - are you building it?

m00dy 3 days ago

I personally know the team who builds the product.