tbruckner 4 days ago

Has anyone found these deep research tools useful? In my experience, they generate really bland reports that don't go much further than summarizing what a search engine would return.

andy99 4 days ago | parent | next [-]

My experience is the same as yours. It feels to me (as with most LLM writing) like they write for someone who isn't going to read or use it, but is going to glance at it, judge the quality that way, and assume it's good.

Not too different from a lot of consulting reports, in fact, and pretty much of no value if you're actually trying to learn something.

Edit to add: even the name “deep research” to me feels like something defined to appeal to people who have never actually done or consumed research, sort of like the whole “phd level” thing.

tbruckner 4 days ago | parent [-]

"they write for someone who’s not going to read it" is a great way to phrase it.

ainch 4 days ago | parent | prev | next [-]

The reports are definitely bland, but I find them very helpful for discovering sources. For example, if I'm trying to ask an academic question like "has X been done before," sending something to scour the internet and find me examples to dig into is really helpful - especially since LLMs have some base knowledge which can help with finding the right search terms. It's not doing all the thinking, but those kind of broad overviews are quite helpful, especially since they can just run in the background.

kmarc 4 days ago | parent | next [-]

I caught myself that most of my LLM usage is like this:

ask a loaded "filter question" I more or less know the answer to, then mostly skip the prose and go straight to the links to its sources.

ukuina 3 days ago | parent [-]

The "loaded question" approach works for getting MUCH better pro/con lists, too, in general, across all LLMs.

vogu66 4 days ago | parent | prev [-]

I do that too. I wonder how much of it is the LLM being helpful and how much is the RAG algorithm somehow providing better references to the LLM than a Google search can.

remus 3 days ago | parent | prev | next [-]

I run a small website, am based in the UK, and have used it a couple of times to summarise what I need to do to comply with different bits of legislation, e.g. the Online Safety Act. What's really useful for me is that I can feed in a load of context about what the site does and get a response that's very tailored to what's relevant for me, then generate template paperwork that I can fill out to improve my position with regard to the legislation.

For sure it's probably missing stuff that a well-paid lawyer would catch, but for a project with zero budget it's a massive step up over spending hours reading through search results and trying to cobble something together myself.

roryirvine 3 days ago | parent [-]

The hidden cost there is that the risk of non-compliance remains entirely with you. Even the best specialist research LLM might easily have hallucinated or made some other error that resulted in confusing or incorrect advice - and you would have been the one held liable for following it.

Whereas with real legal advice, your lawyer will carry Professional Indemnity Insurance which will cover any costs incurred if they make a mistake when advising you.

As you say, it's a reasonable trade-off for you to have made when the alternative was sifting through the legislation in your own spare time. But it's not actually worth very much, and you might just as well have used a general model to carry out the same task and the outcome would likely have been much the same.

So it's not particularly clear that the benefits of these niche-specific models or specialised fine-tunes are worth the additional costs.

(Caveat: things might change in the future, especially if advancements in the general models really are beginning to plateau.)

blaesus 4 days ago | parent | prev | next [-]

"Summarization of what a search engine would return" is good enough for many of my purposes, though. Good for breaking new ground, finding unknown unknowns, brainstorming, etc.

andai 3 days ago | parent [-]

I have a script that searches DDG (free), scrapes top 5 results, shoves them into an LLM, and answers your question.

I wrote it back when AI web search was a paid feature and I wanted access to it.

At the time Auto-GPT was popular and using the LLM itself to slowly and unreliably do the research.

So I realized a Python program would be way faster and it would actually be deterministic in terms of doing what you expect.

This experience sort of shaped my attitude about agentic stuff, where it looks like we are still relying too heavily on the LLM and neglecting to mechanize things that could just work perfectly every time.
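The search-scrape-summarize pipeline described above can be sketched in a few lines. This is a minimal illustration, not the author's actual script: search() and llm() are placeholder stubs standing in for a real search-engine call and a real model API call.

```python
# Deterministic pipeline: search, summarize each hit, then answer.
# search() and llm() are stubs; a real version would hit DDG and a model API.

def search(query, n=5):
    """Stub: a real version would fetch the top-n result pages as text."""
    return [f"page {i} about {query}" for i in range(n)]

def llm(prompt):
    """Stub: a real version would call a model API."""
    return f"summary: {prompt[:60]}"

def answer(question):
    pages = search(question)                  # deterministic step, no agent
    notes = "\n".join(llm(p) for p in pages)  # summarize each page
    return llm(f"Notes:\n{notes}\n\nQuestion: {question}")

print(answer("has X been done before?"))
```

The control flow is plain Python, so it runs the same way every time; only the text generation is left to the model.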

zo1 3 days ago | parent [-]

If you think these things are just using a "dumb" search query, and using the top 5 hits, you're in for a lot of surprises very soon.

andai 3 days ago | parent [-]

Well, considering TFA, it would be pretty strange if I did!

My point was it's silly to rely on a slow, expensive, unreliable system to do things you can do quickly and reliably with ten lines of Python.

I saw this in the Auto-GPT days. They tried to make GPT-4 (the non-agentic one with the 8k context window) use tool calls to do a bunch of tasks. And it kept getting confused and forgetting to do stuff.

Whereas if you just had

for page in pages: summarize(page)

it works 100% of the time, can be parallelized etc.
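A sketch of that parallelization, with a stub summarize() in place of a real model call (LLM calls are I/O-bound, so threads suffice):

```python
from concurrent.futures import ThreadPoolExecutor

def summarize(page):
    # Stub: a real version would send the page text to a model API.
    return f"summary of {page}"

pages = ["page-1", "page-2", "page-3"]

# Run the per-page summaries concurrently instead of one agent step at a time.
with ThreadPoolExecutor(max_workers=8) as ex:
    summaries = list(ex.map(summarize, pages))

print(summaries)
```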

And of course the best part is that the LLM itself can write that code, i.e. it already has the power to make up for its own weaknesses, and make (parts of itself) run deterministically.

---

On that note, do you know more about the environment they ran this thing in? I got API access (it's free on OpenRouter), but I'm not sure what to plug this into. OpenRouter provides a search tool, but the paper mentions intelligent context compression and all sorts of things.

criemen 4 days ago | parent | prev | next [-]

I tend to use them when I'm looking to buy something of category X, and want to get a market overview. I can then still dig in and decide whether I consider the sources used trustworthy or not, and before committing money, I'll read some reviews myself, too. Still, it's a speedup for me.

edot 4 days ago | parent | next [-]

Yes, this is one of my primary use cases for deep research right now. It will become garbage in a few short years once OpenAI starts selling influence / ads. I think they've started doing this a bit, but the recommendations still seem mostly "correct". My prior way of doing this was Googling with site:Reddit.com for real reviews rather than SEO-spam reviewers.

infecto 3 days ago | parent | prev [-]

Same use case for me. I find it pretty good at it, too. Far from perfect, but it's usually a pretty darn good start.

TACIXAT 4 days ago | parent | prev | next [-]

I have used Gemini's 2.5 Pro deep research probably about 10 times. I love it. Most recently was reviewing PhD programs in my area then deep diving into faculty research areas.

threecheese 3 days ago | parent | prev | next [-]

Perplexity’s Research tool has basically replaced Google for me, for any search where I don’t already know the answer or know that it’s available somewhere (like documentation).

I use it dozens of times per day, and typically follow up or ask refining questions within the thread if it’s not giving me what I need.

It typically takes between 10 seconds and 5 minutes, and mostly replicates my manual process: search, review results, another 1..N search passes, review, etc. Initially it rephrases/refines my query, then builds a plan, and this looks a lot like what I might do manually.

infecto 3 days ago | parent | prev | next [-]

I use ChatGPTs quite often. I can send it a loaded question and it helps tease out sources and usually at the very least scrapes away some of the nuance. I have used it a lot for finding a list of a type of products too. Taking the top n search results is already pretty useful for me but I find it typically is a little more in depth than that, going down a few rabbit holes of search depending on the topic. It does not eliminate doing your own research but it helps consolidate some of the initial information.

Then I can further interrogate the information returned with a vanilla LLM.

andai 3 days ago | parent | prev | next [-]

You can copy-paste it into your favorite LLM and ask questions about it. That solves several problems simultaneously.

alasr 4 days ago | parent | prev [-]

I haven't used any LLM deep research tools before; today, after reading this HN post, I gave Tongyi DeepResearch a try to see how it performs on a simple "research" task (in an area I have working experience in: healthcare and EHR), and I'm satisfied with its response (for the given task; obviously, I can't say anything about how it'll perform on other "research" tasks I ask it in the future). I think I'll keep using this model for tasks I was using other local LLM models for before.

Besides, I might give other large deep research models a try when needed.