dezmou 4 days ago

I love it, just purchased a pack. I've also found that it's a great tool for testing LLMs: take a screenshot of a half-solved game, feed it to ChatGPT along with the rules, and ask it to select the next target.

tikotus 4 days ago | parent | next [-]

Thank you so much! Also, you might find this interesting regarding testing LLMs: https://www.nicksypteras.com/blog/cbs-benchmark.html

dezmou 4 days ago | parent | prev [-]

Turns out Claude Sonnet 4.5 is far better at solving those than ChatGPT 5.2.