oldsecondhand 4 days ago

The most useful feature of LLMs is giving sources (preferably with URLs). It can cut through a lot of SEO crap, and you still get to fact-check just like with a Google search.

sefrost 4 days ago | parent | next [-]

I like using LLMs, and I have found they are incredibly useful for writing and reviewing code at work.

However, when I want sources for things, I often find they link to pages that don't fully (or at all) back up the claims made. Sometimes other websites do, but the sources given to me by the LLM often don't. They might be about the same topic that I'm discussing, but they don't always seem to validate the claims.

If they could crack that problem it would be a major major win for me.

joegibbs 4 days ago | parent [-]

It would be difficult to do with a raw model, but a two-step method in a chat interface would work: first the model suggests the URLs, then a tool call fetches them and returns the actual text of the pages, and then the response can be based on that.
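
A minimal sketch of that two-step flow, assuming the model calls and the fetcher are passed in as plain functions. The names `suggest_urls`, `fetch_text`, and `answer_with_context` are hypothetical stand-ins, not any real library's API:

```python
from typing import Callable, Dict, List


def answer_from_fetched_sources(
    question: str,
    suggest_urls: Callable[[str], List[str]],                  # hypothetical: model proposes URLs
    fetch_text: Callable[[str], str],                          # hypothetical: tool call returning page text
    answer_with_context: Callable[[str, Dict[str, str]], str], # hypothetical: model answers from pages
) -> str:
    # Step 1: the model suggests candidate source URLs.
    urls = suggest_urls(question)

    # Tool call: fetch the actual text of each page. URLs that fail to
    # load (hallucinated, dead, blocked) are dropped rather than trusted.
    pages: Dict[str, str] = {}
    for url in urls:
        try:
            pages[url] = fetch_text(url)
        except Exception:
            continue

    # Step 2: the final response is based only on text that was really fetched.
    return answer_with_context(question, pages)
```

The point of injecting the three callables is that the second model call only ever sees pages that actually loaded, so a suggested URL that does not exist cannot end up cited.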

mh- 4 days ago | parent [-]

I prototyped this a couple months ago using OpenAI APIs with structured output.

I had it consume a "deep thought" style output (where it provides inline citations with claims), and then convert that to a series of assertions and a pointer to a link that supposedly supports the assertion. I also split out a global "context" (the original meaning) paragraph to provide anything that would help the next agents understand what they're verifying.

Then I fanned this out to separate (LLM) contexts and each agent verified only one assertion::source pair, with only those things + the global context and some instructions I tuned via testing. It returned a yes/no/it's complicated for each one.

Then I collated all these back in and enriched the original report with challenges from the non-yes agent responses.

That's as far as I took it. It only took a couple hours to build and it seemed to work pretty well.
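
For anyone wanting to try the same shape, here is a rough sketch of the fan-out and collate steps. It is not the prototype described above: the `Assertion` and `Verdict` types, the label strings, and the `verify_one` callable (which stands in for a separate LLM call per assertion) are all assumptions made for illustration:

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class Assertion:
    claim: str        # one assertion extracted from the report
    source_url: str   # the link that supposedly supports it


@dataclass(frozen=True)
class Verdict:
    assertion: Assertion
    label: str        # assumed labels: "yes", "no", or "complicated"
    note: str         # the verifier's explanation


# Hypothetical verifier signature: (global context, one assertion) -> verdict.
Verifier = Callable[[str, Assertion], Verdict]


def verify_all(context: str, assertions: List[Assertion],
               verify_one: Verifier, max_workers: int = 4) -> List[Verdict]:
    # Fan out: each call sees only the global context plus one
    # assertion/source pair. pool.map keeps the input order.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(lambda a: verify_one(context, a), assertions))


def enrich_report(report: str, verdicts: List[Verdict]) -> str:
    # Collate: append a challenge for every verdict that is not a clear "yes".
    challenges = [
        f"- [{v.label}] {v.assertion.claim} ({v.assertion.source_url}): {v.note}"
        for v in verdicts
        if v.label != "yes"
    ]
    if not challenges:
        return report
    return report + "\n\nChallenges:\n" + "\n".join(challenges)
```

Keeping each verifier call isolated means one doubtful source cannot color the judgment of the others, which seems to be the main reason for splitting the work into separate contexts.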

IgorPartola 4 days ago | parent | prev [-]

From what I have seen, a lot of what it does is read articles also written by AI, or forum posts, with all the good and bad that comes with that.