Aeroi 10 hours ago

One thing that surprised me is how much code citation data is already in most of the models' training data. Where the agents still fall apart is visual analysis: give them a photo of a corroded valve with a vague description and they'll confidently cite the wrong API standard. That gap is most of where the 87% delta comes from for us.

Happy to walk through specific cases if anyone wants to dig in.