tiagobraw 4 hours ago

Interesting. Claude Opus 4.8 and Gemini 3.1 Lite kind of got it right, but when I ask the models directly, they say they don't know. I'm curious how the tool is doing the correlation.

turtlesoup 3 hours ago

Prompt for rollouts posted below (https://news.ycombinator.com/item?id=48592415). I have a bit more information on the clustering part in https://intheweights.com/about, but everything returned by the model is viewable (possibly under the "hallucinations" section).