Remix.run Logo
k__ 8 hours ago

So, this basically ensures that models call the right tools with the correct format?

zambelli 8 hours ago | parent [-]

In a nutshell, yes. It tries to anyways, but at the end of the day, some models get stuck and you hit a max iterations error that forge will raise, with some context, and the consumer can choose what it wants to do at that point.

k__ 8 hours ago | parent [-]

Ah, so it a "smart" retry mechanism?

zambelli 8 hours ago | parent [-]

I'd like to think so! ;). It has some brains, but the key insight was to send the model domain-agnostic nudges. I don't need to know what you're trying to do, the LLM already knows, I just need to nudge it back on the structural side: text response vs tool call, arg mismatch, etc. and let its knowledge of the context fill in the blanks (otherwise I'd need a massive library of every possible failure mode).

The other insight was doing it at tool call level and not workflow level, which addresses the compounding math problem more directly.

jimmySixDOF 7 hours ago | parent [-]

Maybe similar to Instructor [1] which was a cool tool for json and structured output enforcement combining pydandic with ai retry loops very handy for when models don't have that covered

[1] https://github.com/567-labs/instructor

zambelli 6 hours ago | parent [-]

Interesting! I'll look into that. Would mean another dep/integration but might be more robust.