zo1 | 3 days ago
If you think these things are just running a "dumb" search query and using the top 5 hits, you're in for a lot of surprises very soon.
andai | 3 days ago | parent
Well, considering TFA, it would be pretty strange if I did! My point was that it's silly to rely on a slow, expensive, unreliable system to do things you can do quickly and reliably with ten lines of Python.

I saw this in the Auto-GPT days. They tried to make GPT-4 (the non-agentic one with the 8k context window) use tool calls to do a bunch of tasks, and it kept getting confused and forgetting to do stuff. Whereas if you just had "for page in pages: summarize(page)", it works 100% of the time, can be parallelized, etc.

And of course the best part is that the LLM itself can write that code, i.e. it already has the power to make up for its own weaknesses and make (parts of itself) run deterministically.

---

On that note, do you know more about the environment they ran this thing in? I got API access (it's free on OpenRouter), but I'm not sure what to plug this into. OpenRouter provides a search tool, but the paper mentions intelligent context compression and all sorts of other things.
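A minimal sketch of that loop, for the avoidance of doubt (summarize here is a hypothetical stand-in for a single LLM call, and the thread pool is just one way to parallelize it):

    from concurrent.futures import ThreadPoolExecutor

    def summarize(page: str) -> str:
        # Hypothetical: one LLM call per page, e.g. a chat-completion
        # request with a "summarize this" prompt. Fill in your client.
        raise NotImplementedError

    def summarize_all(pages: list[str]) -> list[str]:
        # The deterministic loop: every page is processed exactly once,
        # nothing gets forgotten, and since each call is independent,
        # it parallelizes trivially.
        with ThreadPoolExecutor(max_workers=8) as pool:
            return list(pool.map(summarize, pages))

No agent loop, no tool-call schema, no chance of the model "forgetting" a page: the control flow is ordinary code and only the summarization itself touches the LLM.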