styanax 5 days ago

> because you end up doing multiple searches in parallel and merging results

This reminds me of the design model of SearX/SearXNG: instead of a distributed forge index, you would distribute the search endpoints of the forge instances, which would facilitate the next steps you outline. It almost feels like a central coordinator, or maybe a CDN-like network of search proxies, would be needed to do the actual combining and filtering of results. Maybe it could fit under the Codeberg operational umbrella in some future plan.

In practice Nostr does this step on the client side: one subscribes to relays, and when querying for new content the client asks all relays, receives all the duplicate metadata, and filters it locally. That means huge network use and battery drain on a handheld device. Nostr bouncers have emerged for this exact reason; one popular piece of software is "Bostr". Instances run by random volunteers are easy to find, but running one costs money (disk/CPU/RAM): https://bostr.azzamo.net/
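
To make the client-side step concrete, here is a minimal sketch of the merge-and-dedupe work a client (or a bouncer on its behalf) ends up doing. It assumes each relay has already returned a list of event dicts carrying the `id` and `created_at` fields; the function name and the sample data are made up for illustration, and no real networking is shown.

```python
from typing import Iterable


def merge_relay_events(batches: Iterable[list[dict]]) -> list[dict]:
    """Merge event batches from several relays, dropping duplicates by id.

    Hypothetical helper: each batch is whatever one relay returned.
    """
    seen: dict[str, dict] = {}
    for batch in batches:
        for event in batch:
            # Keep the first copy of each event; later copies are duplicates.
            seen.setdefault(event["id"], event)
    # Newest first, which is what a client typically wants to display.
    return sorted(seen.values(), key=lambda e: e["created_at"], reverse=True)


# Made-up sample data: two relays that both hold event "b".
relay_a = [{"id": "a", "created_at": 3}, {"id": "b", "created_at": 1}]
relay_b = [{"id": "b", "created_at": 1}, {"id": "c", "created_at": 2}]
merged = merge_relay_events([relay_a, relay_b])
```

The merge itself is cheap; the cost is that every duplicate still has to be downloaded before it can be thrown away, and that download is the part a bouncer moves off the handheld device.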

vidarh 5 days ago | parent

There are quite a lot of approaches you can take to reduce the cost of this, e.g. sharding by search term, so that any specific search term only hits a subset of the total set of shards.
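
A rough sketch of what sharding by term could look like, assuming a fixed number of shards and a stable hash of the lowercased term. `NUM_SHARDS` and both function names are hypothetical; a real deployment would more likely want consistent hashing so shards can be added without remapping everything.

```python
import hashlib

NUM_SHARDS = 16  # hypothetical shard count, chosen only for illustration


def shard_for_term(term: str, num_shards: int = NUM_SHARDS) -> int:
    """Map a search term to one shard using a stable hash."""
    digest = hashlib.sha256(term.lower().encode("utf-8")).digest()
    # The first 8 bytes are plenty to pick a shard index.
    return int.from_bytes(digest[:8], "big") % num_shards


def shards_for_query(query: str, num_shards: int = NUM_SHARDS) -> set[int]:
    """Return only the shards that need to be asked for this query."""
    return {shard_for_term(term, num_shards) for term in query.split()}
```

A three-word query touches at most three shards here, no matter how many shards exist in total.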

You can also cache the "top" of the hit lists for very common searches, so you don't need to fan out unless you're doing less common searches or going beyond the first "page" of results.
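
As a sketch of the caching idea, assuming some `fan_out(query, page)` callable that does the expensive multi-endpoint search: only the first page is cached, and deeper pages always go to the backends. The class and parameter names are invented for illustration, and expiry and eviction are deliberately left out.

```python
from typing import Callable


class TopResultsCache:
    """Cache only the first page of results per normalized query."""

    def __init__(self, fan_out: Callable[[str, int], list[str]],
                 max_entries: int = 1000) -> None:
        self._fan_out = fan_out          # hypothetical expensive search
        self._max_entries = max_entries  # crude size cap, no eviction
        self._first_pages: dict[str, list[str]] = {}

    def search(self, query: str, page: int = 0) -> list[str]:
        # Normalize case and whitespace so trivial variants share an entry.
        key = " ".join(query.lower().split())
        if page != 0:
            # Beyond the first page: always fan out.
            return self._fan_out(key, page)
        if key in self._first_pages:
            return self._first_pages[key]
        results = self._fan_out(key, 0)
        if len(self._first_pages) < self._max_entries:
            self._first_pages[key] = results
        return results
```

A real version would need some way to expire entries as the forges' indexes change, but the shape is the same: common queries never reach the backends for page one.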