chhxdjsj 4 days ago
Hi, great work, congrats! Does it use OpenRouter for model selection? Which models did you achieve the WebArena result with? Are there any open source models that are any good for this?
tcwd 4 days ago
For the WebArena result, we actually used a mixture of models checking each other's work and evaluating in real time. We found this verification to be really effective in producing accurate results. Feel free to take a look at our architectural blog post to learn more: https://blog.withmeka.com/introducing-meka-an-open-source-fr... Unfortunately, we didn't try it out with open source models, but you are welcome to pull the repo and try it with any model that has good visual grounding! (I heard UI-TARS and the latest Qwen visual model are quite good.)