psadri 5 days ago

I do miss the earlier "heavy" models with encyclopedic knowledge vs the new "lighter" models that rely on web search. Web search surfaces a shallow layer of knowledge (thanks to SEO and all the other challenges of ranking web results), whereas the heavy models had ingested/memorized basically the entirety of written human knowledge, including material that's rarely reachable within the first 10 results of a web search (e.g. digitized offline libraries).

hamdingers 4 days ago | parent | next [-]

I feel the opposite. Before I can use information from a model's "internal" knowledge, I have to engage in independent research to verify that it's not a hallucination.

Having an LLM generate search strings and then summarize the results does that research up front and automatically; I need only click the sources to verify. Kagi Assistant does this really well.
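
For the curious, a minimal sketch of that search-then-summarize loop. The web_search helper and result shape are hypothetical stand-ins for whatever search API you'd plug in (Kagi, Bing, etc.); the model name is an assumption too:

    from openai import OpenAI

    client = OpenAI()

    def web_search(query: str) -> list[dict]:
        """Hypothetical search helper; swap in any real search API."""
        raise NotImplementedError

    def research(question: str) -> str:
        # 1. Ask the model to propose search strings for the question.
        q = client.chat.completions.create(
            model="gpt-4o",  # assumption: any capable chat model works here
            messages=[{"role": "user",
                       "content": f"Give 3 web search queries, one per line, for: {question}"}],
        )
        queries = q.choices[0].message.content.splitlines()

        # 2. Run the searches and collect snippets with their source URLs.
        snippets = []
        for query in queries:
            for hit in web_search(query)[:5]:
                snippets.append(f"{hit['url']}: {hit['snippet']}")

        # 3. Summarize, citing URLs so the reader can click through to verify.
        a = client.chat.completions.create(
            model="gpt-4o",
            messages=[{"role": "user",
                       "content": "Answer the question from these results, citing URLs.\n"
                                  f"Question: {question}\nResults:\n" + "\n".join(snippets)}],
        )
        return a.choices[0].message.content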

beefnugs 4 days ago | parent [-]

So does anyone have good examples of it effectively avoiding blogspam and SEO? Or of it being fooled? How often, either way?

coffeefirst 4 days ago | parent | next [-]

Bulk search is the only thing where I’ve been consistently impressed with LLMs.

But, like the parent, I’m using the Kagi assistant.

So the answer here might be that "search for 5 things and pull the relevant results" works incredibly well, but first you have to build an extremely good search engine that lets the user filter out spam sites.

That said, this isn't magic; it just automates an hour of googling. If the content doesn't exist, you won't find it.
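
A rough sketch of that spam-filtering step. The blocklist and result shape are made up for illustration; Kagi does something similar with per-user domain rankings:

    from urllib.parse import urlparse

    # Hypothetical per-user blocklist, like Kagi's "lower/block this site" settings.
    BLOCKED_DOMAINS = {"seo-content-farm.example", "listicle-mill.example"}

    def filter_results(results: list[dict]) -> list[dict]:
        """Drop hits from blocked domains before the assistant ever sees them."""
        kept = []
        for hit in results:
            domain = urlparse(hit["url"]).netloc.lower().removeprefix("www.")
            if domain not in BLOCKED_DOMAINS:
                kept.append(hit)
        return kept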

15123123aa 4 days ago | parent | prev | next [-]

One thing I find it doesn't do very well is avoid marketing articles pushed by a brand itself. E.g. if I search "is X better than Y", I very likely land on articles by the makers of X and Y rather than a third-party reviewer. When I search manually on Google, I can spot marketing articles just from the URL.
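
That URL check is easy to automate; a toy version (all names illustrative, and the claim-matching is crude):

    from urllib.parse import urlparse

    STOPWORDS = {"is", "better", "than", "vs", "or", "the", "a", "and"}

    def first_party_hits(query: str, results: list[dict]) -> list[dict]:
        # Flag results hosted on a domain containing a brand word from the query,
        # e.g. "is acme better than globex" -> hits on acme.com are likely marketing.
        brands = {w for w in query.lower().split() if w not in STOPWORDS}
        flagged = []
        for hit in results:
            domain = urlparse(hit["url"]).netloc.lower()
            if any(b in domain for b in brands):
                flagged.append(hit)
        return flagged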

simonw 4 days ago | parent [-]

Have you tried that with GPT-5 Thinking or is this based on your experience with older versions of ChatGPT + search?

simonw 4 days ago | parent | prev [-]

Here's a good article about Google AI mode usually managing to spot and avoid social media misinformation but occasionally falling for it: https://open.substack.com/pub/mikecaulfield/p/is-the-llm-res...

mastercheif 4 days ago | parent | prev | next [-]

I kept search off for a long time because it tanked the quality of ChatGPT's responses.

I recently added the following to my custom instructions to get the best of both worlds:

# Modes

When the user enters one of the following strings, you should follow the corresponding mode instructions:

1. "xz": Use the web tool as needed when developing your answer.

2. "xx": Exclusively use your own knowledge instead of searching the internet.

By default use mode "xz". The user can switch between modes during a chat session. Stay with the current mode until the user explicitly switches modes.
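
For API use, the same toggle can be done mechanically rather than via prompt. A sketch assuming OpenAI's Responses API; the tool type and model name are assumptions and may differ by SDK version:

    from openai import OpenAI

    client = OpenAI()
    WEB_TOOL = {"type": "web_search_preview"}  # assumed tool name; check current docs

    def ask(message: str, mode: str = "xz") -> str:
        # "xz" = web tool allowed, "xx" = internal knowledge only.
        if message[:2] in ("xz", "xx"):
            mode, message = message[:2], message[2:].strip()
        tools = [WEB_TOOL] if mode == "xz" else []
        resp = client.responses.create(model="gpt-5", input=message, tools=tools)
        return resp.output_text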

ants_everywhere 4 days ago | parent | prev | next [-]

Most real knowledge is stored outside the head, so intelligent agents can't rely solely on what they've remembered. That's why libraries are so fundamental to universities.

stephen_cagle 4 days ago | parent | prev | next [-]

I think this is partially something I have felt myself as well. It would be interesting if these lighter web-search models highlighted the distinction between information that has been seen elsewhere and information that is novel to each page. Like a view that lets me look at the facts that have been asserted and see how many of the pages assert each one (vs. leave it unmentioned vs. contradict it).
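
One way to back such a view: tally each extracted claim's status across the fetched pages. The hard part, extracting and matching claims per page, is hand-waved here; this only shows the bookkeeping:

    from collections import Counter

    def tally(claim_status_by_page: dict[str, dict[str, str]]) -> dict[str, Counter]:
        """Input: page -> {claim: "asserted" | "unmentioned" | "contradicted"}.
        Output: claim -> Counter of how many pages gave it each status."""
        tallies: dict[str, Counter] = {}
        for page, statuses in claim_status_by_page.items():
            for claim, status in statuses.items():
                tallies.setdefault(claim, Counter())[status] += 1
        return tallies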

simianwords 4 days ago | parent | prev | next [-]

There is a tradeoff here: the non-search models are internally heavy, while the search models are lighter but depend on real data.

I keep switching between the two, but I think I'm starting to prefer the lighter one that's based on sources instead.

killerstorm 4 days ago | parent | prev | next [-]

These models are still available: GPT-4.5, Gemini 2.5 Pro (at least the initial version - not sure if they optimized it away).

From what I can tell, they are pretty damn big.

Grok 4 is quite large too.

gerdesj 4 days ago | parent | prev [-]

"encyclopedic knowledge"

Have you just hallucinated that?