Agents turn simple keyword search into compelling search experiences(softwaredoug.com)
49 points by softwaredoug 6 hours ago | 19 comments
OhMeadhbh 5 hours ago | parent | next [-]

These people really misunderstand how people like me use search. I don't want "an experience," I want a list of documents that match the search specifier.

awlejrlakwejr 3 hours ago | parent | next [-]

Absolutely. We need separate interfaces for work and play.

softwaredoug 5 hours ago | parent | prev [-]

Author here. Specifically, I'm using "experiences" to mean ranking / finding relevant results.

soco 5 hours ago | parent | next [-]

Wouldn't "dumb API search" already be enough for users? I'm not sure my decades of experience as a Google search user has improved since they leaned heavily on the ranking and started adding "relevant" things...

janalsncm 3 hours ago | parent [-]

> since they leaned heavily on the ranking

Not sure what you mean here. Google started with PageRank, which is a decent, albeit gameable, ranking algorithm. They’ve never not been leaning heavily on ranking.

In the mid-2010s, Google began supporting conversational (non-keyword) searches, because a good number of queries are difficult to reduce to keywords. But that is an inherently harder problem. At the same time, the open web enshittified itself, filling up with SEOed blogspam, and a lot of user-generated content moved into walled gardens.

nottorp 4 hours ago | parent | prev [-]

Well, at least you didn't use 'engagement'.

Edit: oops, you used it in another comment here.

dsr_ 5 hours ago | parent | prev | next [-]

"Agents, however, come with the ability to reason."

Citation needed. Calling recursive bullshit "reason" does not make it so.

softwaredoug 5 hours ago | parent [-]

Well, call it what you want; I'm referring to the reasoning functionality of GPT-5, Claude, etc. It seems to work pretty well at this task. These tools prefer grep to some fancy search thingy.

esafak 5 hours ago | parent [-]

> These tools prefer grep to some fancy search thingy.

Passing on the cost to you. The fancy search thingy could navigate the AST and catch typos.

softwaredoug 5 hours ago | parent [-]

It's more accurate to say the agents like simple, interpretable tools: clear input, obvious output. That might be an AST tool, keyword search, search with very straightforward filters, etc.

I don't think they do well with search that is built for human engagement, which is a more complex tool to reason about.
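
To make that concrete, here's roughly the shape of tool I mean (a toy sketch; the catalog and field names are made up):

    # Toy sketch of a "simple, interpretable" agent tool: clear input,
    # obvious output, nothing clever in between. All names are made up.
    CATALOG = [
        {"id": 1, "title": "Teak Patio Chair", "category": "furniture"},
        {"id": 2, "title": "Steel Patio Table", "category": "furniture"},
    ]

    def keyword_search(query: str, category: str | None = None,
                       max_results: int = 10) -> list[dict]:
        """Return catalog items whose title contains every query term.

        No personalization, no query rewriting, no popularity signal:
        the agent can predict exactly what it will get back.
        """
        terms = query.lower().split()
        hits = [doc for doc in CATALOG
                if all(t in doc["title"].lower() for t in terms)
                and (category is None or doc["category"] == category)]
        return hits[:max_results]

    print(keyword_search("patio chair"))  # -> the teak chair only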

esafak 5 hours ago | parent [-]

It makes no difference to the agent. The response from grep is a list of matches, and so is the response from something more intelligent. A list is a list.

> I don't think they do well with search that is built for human engagement, which is a more complex tool to reason about

I agree! Structured outputs are best.

shadowgovt an hour ago | parent [-]

Under the hood, LLMs are vector analysis engines. They benefit from spaces where results change smoothly.

Adding levels of indirection and secondary reasoning to the search interface makes the results less smooth. This is one of the things humans often complain about when using e.g. Google: "I'm interested in searching for X, but all these results are Y." Yes, because X and Y are synonyms or close topics, and Google is mixing in a popularity signal to deduce that, for example, your search for `tailored swift database` is probably about a corpus of Taylor Swift song lyrics and not about companies that build bespoke Swift APIs on top of data stored in Postgres.

If you're driving the process with an LLM, it's more of a problem when the LLM is searching the space and the search engine under it keeps tripping over the difference between "swift means a programming language" and "swift means a successful musician" as it explores the result space. A dumber API that doesn't try to guess, and just returns both datasets blended together, fits the space-search problem better.

daft_pink 34 minutes ago | parent | prev | next [-]

I think Comet's search is really nice and worth $20 a month, but not the $200 a month it currently costs. It is also a little slow. My experience is similar to this article's.

mips_avatar 2 hours ago | parent | prev | next [-]

One interesting thing about the lack of click-stream feedback is that you can generate it synthetically. If your model evaluates the quality of search responses and adjusts its queries when there's a miss, you can capture that adjustment and use it to tune your search engine. With user-click search you need scale to tune ranking; now you can generate that signal. The only problem is that you need to trust the agent is doing the right thing as it keeps searching harder.
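
Roughly the loop I'm describing (a hand-wavy sketch; search, judge_results, and rewrite_query are stand-ins for the real engine and LLM calls):

    # Sketch of harvesting synthetic relevance feedback from an agent.
    # The stubs below stand in for the search engine and two LLM calls.
    def search(query):                   # the engine being tuned
        return []

    def judge_results(query, results):   # LLM judges: "hit" or "miss"
        return "miss"

    def rewrite_query(query, results):   # LLM adjusts the query
        return query + " outdoor"

    def agentic_search(query, max_rounds=3):
        feedback_log = []  # synthetic "click stream" to tune ranking with
        for _ in range(max_rounds):
            results = search(query)
            verdict = judge_results(query, results)
            feedback_log.append((query, verdict))
            if verdict == "hit":
                return results, feedback_log
            query = rewrite_query(query, results)
        return results, feedback_log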

intalentive 2 hours ago | parent | prev | next [-]

The doc string becoming part of the prompt is a nice touch.
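
For anyone who hasn't seen the pattern, a minimal sketch of how that works (the helper is hypothetical, not the article's code):

    import inspect

    def search_products(query: str) -> list[str]:
        """Search the product catalog by keyword.

        Prefer short, specific keywords (e.g. "patio chair teak")
        over full sentences.
        """
        return []  # stand-in for a real search call

    def tool_description(func) -> str:
        # The docstring is sent to the model as the tool's description,
        # so it doubles as usage instructions for the agent.
        return f"{func.__name__}{inspect.signature(func)}: {inspect.getdoc(func)}"

    print(tool_description(search_products))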

It seems plausible and intuitive that simple tools dynamically called by agents would yield better results than complex search pipelines. But do you have any hard data backing this up?

the_snooze 5 hours ago | parent | prev | next [-]

If agents are making value judgments ostensibly on my behalf, why should I trust them to continue to be aligned with my values if in practice they're almost always running on someone else's hardware and being maintained on someone else's budget?

We'd be stupid to ignore the last 15+ years of big tech "democratization"-to-enshittification bait-and-switch.

janalsncm 2 hours ago | parent | next [-]

That was never not the case. There are always value judgments. Even for keyword searches, there are often hundreds of exact matches, and the one you want might not even be an exact match.

This article is about how Target can use LLMs to help you find patio furniture. I guess you could imagine the LLM upselling you on a more expensive set?

softwaredoug 5 hours ago | parent | prev [-]

Search engines currently do this for better or worse. But they still want you to buy products.

The bigger issue is I’m not sure agents are trained to understand what users find engaging. What makes users click.

jillesvangurp 4 hours ago | parent | prev [-]

Interesting approach. It might be helpful to give the agent more tools, though. Some simple aggregations would give it a notion of what's there to query for in the catalog, and a combination of overly broad queries and aggregations (terms, significant terms, etc.) might help it zoom in on interesting results. And of course, relatively large responses are not necessarily as much of a problem for LLMs as they would be for humans.
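
For example, an Elasticsearch-style terms aggregation exposed as a tool would let the agent see what's in the catalog before committing to a filter (index and field names here are made up):

    # Hypothetical "show me the catalog's vocabulary" tool body:
    # a match-all query whose only output is aggregation buckets.
    facet_request = {
        "size": 0,  # no hits, just the buckets
        "query": {"match_all": {}},
        "aggs": {
            "top_brands": {"terms": {"field": "brand.keyword", "size": 25}},
            "top_categories": {"terms": {"field": "category.keyword", "size": 25}},
        },
    }
    # With the official client this would run as something like
    # es.search(index="catalog", body=facet_request); each bucket comes
    # back as {"key": "...", "doc_count": N}, which is easy for the
    # agent to read and reason about.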