joegibbs 4 days ago

It would be difficult to do with a raw model, but a two-step method in a chat interface would work - first the model suggests the URLs, then a tool call fetches them and returns the actual text of the pages, and then the response can be based on that.
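A rough sketch of that two-step flow, in case it helps. The three callables (`suggest_urls`, `fetch_text`, `answer`) are hypothetical stand-ins for the model call, the fetch tool, and the final model call; none of them are a real library API, and skipping URLs that fail to load is my own assumption about how dead or made-up links would be handled.

```python
from typing import Callable, Dict, List


def answer_with_sources(
    question: str,
    suggest_urls: Callable[[str], List[str]],
    fetch_text: Callable[[str], str],
    answer: Callable[[str, Dict[str, str]], str],
) -> str:
    """Two-step flow: propose URLs, fetch them, then answer from fetched text."""
    # Step 1: the model only proposes candidate URLs (hypothetical callable).
    urls = suggest_urls(question)

    # Tool call: fetch the actual page text. URLs that fail to load are
    # dropped (assumption), so the answer step never sees an unfetched source.
    pages: Dict[str, str] = {}
    for url in urls:
        try:
            pages[url] = fetch_text(url)
        except Exception:
            continue

    # Step 2: the response is based only on text that was really retrieved.
    return answer(question, pages)
```

The point of the split is that the final response can only cite pages that actually came back from the fetch.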

mh- 4 days ago | parent [-]

I prototyped this a couple months ago using OpenAI APIs with structured output.

I had it consume a "deep thought" style output (where it provides inline citations with claims), and then convert that to a series of assertions and a pointer to a link that supposedly supports the assertion. I also split out a global "context" (the original meaning) paragraph to provide anything that would help the next agents understand what they're verifying.
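For illustration, the extracted structure might look something like the sketch below. The field names (`context`, `assertions`, `claim`, `source_url`) are assumptions of mine, not the schema actually used, and the parsing step just assumes the model's structured output arrives as a JSON string in that shape.

```python
import json
from dataclasses import dataclass
from typing import List


@dataclass
class Assertion:
    claim: str        # one checkable statement pulled from the report
    source_url: str   # the link that supposedly supports the claim


@dataclass
class ExtractedReport:
    context: str                 # global context shared with every verifier
    assertions: List[Assertion]  # one entry per claim/citation pair


def parse_extraction(raw_json: str) -> ExtractedReport:
    """Parse a structured-output JSON string into typed objects.

    Raises KeyError if a required field is missing, so a malformed
    extraction fails loudly instead of being verified half-empty.
    """
    data = json.loads(raw_json)
    return ExtractedReport(
        context=data["context"],
        assertions=[
            Assertion(claim=a["claim"], source_url=a["source_url"])
            for a in data["assertions"]
        ],
    )
```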

Then I fanned this out to separate (LLM) contexts, where each agent verified only one assertion::source pair, given only that pair, the global context, and some instructions I tuned via testing. Each one returned a yes/no/it's complicated.
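The fan-out could be sketched roughly like this. `verify_one` is a hypothetical callable standing in for a single isolated LLM call (fresh context, one pair, plus the global context); using a thread pool and mapping any unexpected reply to "it's complicated" are my assumptions, not details of the prototype.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Tuple

VERDICTS = ("yes", "no", "it's complicated")


def verify_all(
    context: str,
    pairs: List[Tuple[str, str]],
    verify_one: Callable[[str, str, str], str],
    max_workers: int = 8,
) -> List[str]:
    """Verify each (assertion, source) pair independently, in parallel."""

    def run(pair: Tuple[str, str]) -> str:
        claim, source = pair
        # Each call sees only one pair plus the shared global context.
        verdict = verify_one(context, claim, source)
        # Anything outside the three allowed answers is treated as
        # "it's complicated" (assumption) so it still gets surfaced later.
        return verdict if verdict in VERDICTS else "it's complicated"

    # pool.map returns results in the same order as the input pairs.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(run, pairs))
```

Keeping each verifier blind to the other pairs is what stops one shaky claim from coloring the judgment of the rest.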

Then I collated all these back in and enriched the original report with challenges from the non-yes agent responses.
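A minimal sketch of that collation step, assuming each verifier result comes back as a (claim, source, verdict, note) tuple. The tuple shape, the `note` field, and the "Challenges:" section appended at the end are all illustrative choices of mine; the original may well have woven the challenges inline instead.

```python
from typing import List, Tuple

# (claim, source_url, verdict, note) -- hypothetical result shape
Result = Tuple[str, str, str, str]


def enrich_report(report: str, results: List[Result]) -> str:
    """Append a challenge line for every verdict that is not a plain 'yes'."""
    challenges = [
        f"- [{verdict}] {claim} ({source}): {note}"
        for claim, source, verdict, note in results
        if verdict != "yes"
    ]
    # If everything checked out, the report is returned unchanged.
    if not challenges:
        return report
    return report + "\n\nChallenges:\n" + "\n".join(challenges)
```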

That's as far as I took it. It only took a couple hours to build and it seemed to work pretty well.