My advice for building something like this: don't get hung up on needing vector databases and embeddings.

Full text search or even grep/rg are a lot faster and cheaper to work with - no need to maintain a vector database index - and turn out to work really well if you put them in some kind of agentic tool loop.

The big benefit of semantic search was that it could handle fuzzy searching - returning results that mention dogs if someone searches for canines, for example.

Give a good LLM a search tool and it can come up with searches like "dog OR canine" on its own - and refine those queries over multiple rounds of searches.

Plus it means you don't have to solve the chunking problem!
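Here is a minimal sketch of that kind of tool loop, for illustration only. The documents, the `search` helper, and especially `propose_queries` are hypothetical: `propose_queries` is a hard-coded stand-in for the LLM, which in a real system would see the previous results and choose the next query itself.

```python
import re

# Toy corpus; in practice this would be files on disk or a full-text index.
DOCS = {
    "a.txt": "Our clinic treats canines and felines.",
    "b.txt": "Dog owners should bring vaccination records.",
    "c.txt": "Parking is available behind the building.",
}


def search(query: str) -> list[str]:
    """Grep-style tool: return docs matching any term of an 'x OR y' query.

    Matching is case-insensitive and anchored at the start of a word,
    so 'canine' also matches 'canines'.
    """
    terms = [t.strip() for t in query.split(" OR ") if t.strip()]
    if not terms:
        return []
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(t) for t in terms) + ")", re.IGNORECASE
    )
    return sorted(name for name, text in DOCS.items() if pattern.search(text))


def propose_queries(question: str):
    """Hypothetical stand-in for the LLM: a narrow query, then a broader one."""
    yield "puppy"
    yield "dog OR canine"


def agent_loop(question: str, max_rounds: int = 3):
    """Run searches until one returns hits or the round budget is spent."""
    tried = []
    for round_no, query in enumerate(propose_queries(question), start=1):
        if round_no > max_rounds:
            break
        hits = search(query)
        tried.append((query, hits))
        if hits:
            return hits, tried
    return [], tried


hits, tried = agent_loop("Which pages mention dogs?")
print(hits)
print(tried)
```

The first query finds nothing, so the loop falls through to the broader `dog OR canine` query. That refinement step is the part a real model would handle for you.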


cwmoore an hour ago | parent | next [-]

I recently came across a “prefer the most common synonym” problem in Google Maps while searching for a pool hall: even a literal search for ‘billiards’ returned results for swimming pools and chlorine. I wonder if some more NOTs aren’t necessary… I'm interested in learning about RAG, though I’m a little behind the curve.


mips_avatar 3 hours ago | parent | prev | next [-]

In my app, even the best lexical search approaches completely broke my agent. For my RAG system, the LLM took 2.1 lexical searches on average to get the results it needed. That wasn’t terrible, but it sometimes needed up to 5 searches to find them, which blew up user latency. Now that I have hybrid semantic + lexical search, it only requires 1.1 searches per result.
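For anyone wondering what "hybrid" can look like in code: I haven't said how the two result lists get merged, so treat this as an assumption rather than my actual setup. One common option is reciprocal rank fusion, where each document scores 1 / (k + rank) in every list it appears in and the scores are summed. The document IDs and the `k=60` default below are illustrative.

```python
def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Merge several ranked lists of doc IDs into a single ranking.

    Each appearance contributes 1 / (k + rank), with rank starting at 1.
    Ties are broken by doc ID so the output is deterministic.
    """
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical outputs of a lexical search and a semantic search.
lexical = ["d1", "d2", "d3"]
semantic = ["d3", "d1", "d4"]

fused = reciprocal_rank_fusion([lexical, semantic])
print(fused)
```

Documents that both retrievers rank highly float to the top, which is one plausible reason a single hybrid query can succeed where several lexical queries were needed.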


froobius 3 hours ago | parent | prev | next [-]

Hmm, semantic search can capture more than just single words though, e.g. meaningful phrases or paragraphs that could be written in many ways.



leetrout 4 hours ago | parent | prev | next [-]

Simon, have you ever given a talk or written about this sort of pragmatism? A spin on how to achieve this with Datasette is an easy thing to imagine IMO.

	simonw 13 minutes ago | parent [-]

	I did a livestream about building RAG on top of full-text search in Datasette last year: https://simonwillison.net/2024/Jun/21/search-based-rag/


tra3 4 hours ago | parent | prev | next [-]

I built a simple Emacs package based on this idea [0]. It works surprisingly well, but I don't know how far it scales. It's likely not as frugal from a token usage perspective.

0: https://github.com/dmitrym0/dm-gptel-simple-org-memory


pstuart an hour ago | parent | prev | next [-]

Perhaps SQLite with FTS5? Or even better, getting DuckDB into the party, as its ecosystem seems ripe for this type of work.
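A rough sketch of the SQLite side, assuming your Python's bundled SQLite was compiled with FTS5 (most are, but it isn't guaranteed). The table name, sample rows, and the `fts_search` helper are made up for illustration. It shows the boolean operators an LLM could use in its queries, including the NOT trick for the pool hall case mentioned upthread.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# FTS5 virtual table; every column is full-text indexed.
conn.execute("CREATE VIRTUAL TABLE docs USING fts5(title, body)")
conn.executemany(
    "INSERT INTO docs (title, body) VALUES (?, ?)",
    [
        ("Canine nutrition", "What to feed an active working breed."),
        ("Winter dog walks", "Keep paws clear of road salt."),
        ("Pool hall etiquette", "Rack the balls before playing billiards."),
        ("Swimming pool care", "Check chlorine levels weekly."),
    ],
)


def fts_search(conn: sqlite3.Connection, query: str, limit: int = 5) -> list[str]:
    """Return titles matching an FTS5 query string, best match first."""
    rows = conn.execute(
        "SELECT title FROM docs WHERE docs MATCH ? ORDER BY rank LIMIT ?",
        (query, limit),
    ).fetchall()
    return [row[0] for row in rows]


print(fts_search(conn, "dog OR canine"))      # matches either term
print(fts_search(conn, "pool NOT swimming"))  # excludes swimming pools
```

Note that the default tokenizer does no stemming, so `canine` would not match `canines` here; FTS5's `porter` tokenizer is one way to loosen that.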


enraged_camel 4 hours ago | parent | prev [-]

Yes, exactly. We have our AI feature configured to use our pre-existing Typesense integration, and it's stunningly competent at figuring out exactly which search queries to run across which collections in order to find relevant results.

	busssard 3 hours ago | parent [-]

	If this is coupled with powerful search engines beyond Elastic, then we are getting somewhere. Other nonmonotonic engines that can find structural information are out there.