▲ | righthand 4 days ago | |
> But we still don't have a solution to search projects on potentially thousands of servers, including self-hosted ones. Why do you need a search index on your self-hosted Git server? Doesn’t Kagi solve that? | ||
▲ | goku12 4 days ago | parent [-] | |
> Why do you need a search index on your self-hosted Git server? The search index doesn't have to be on your server, does it? What if there were an external (perhaps distributed or replicated) index that you could submit the relevant information to? Or what if an external crawler could collect it on your behalf? (Some sort of verification would also be needed.) There are two reasons why such a dedicated index is useful. The first is that a general-purpose index is too full of noise; that's why people search for projects directly on GitHub. The second is that generic crawlers aren't very good at extracting relevant structured information from source projects, especially the information that the project owner wants to advertise: for example, the README, contribution guidelines, project status, installation and usage information, language(s), license(s), code of conduct, issue tracker location, bug and security reporting information, keywords, and project type. GitHub and Sourcegraph allow you to do precise searches based on those. Try using a regular search engine to locate an obscure project that you already know about. |
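To make the "structured information" point concrete, here is a rough Python sketch of what a submitted record and a precise search over it might look like. Everything in it is an assumption for illustration: `ProjectRecord`, `ProjectIndex`, the field names, and the example.org URLs are hypothetical, and the verification step mentioned above is left out. A real index would presumably be an external, replicated service, not an in-memory list.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ProjectRecord:
    """Hypothetical metadata a project owner might submit to a shared index."""
    name: str
    repo_url: str
    languages: List[str] = field(default_factory=list)
    licenses: List[str] = field(default_factory=list)
    keywords: List[str] = field(default_factory=list)
    status: str = "active"
    issue_tracker: str = ""


class ProjectIndex:
    """Toy in-memory index standing in for an external, replicated one."""

    def __init__(self) -> None:
        self._records: List[ProjectRecord] = []

    def submit(self, record: ProjectRecord) -> None:
        # Ownership verification would belong here; this sketch omits it.
        self._records.append(record)

    def search(
        self,
        language: Optional[str] = None,
        license: Optional[str] = None,
        keyword: Optional[str] = None,
    ) -> List[str]:
        """Return names of records matching every filter that was given."""
        def matches(r: ProjectRecord) -> bool:
            return (
                (language is None or language in r.languages)
                and (license is None or license in r.licenses)
                and (keyword is None or keyword in r.keywords)
            )
        return [r.name for r in self._records if matches(r)]


index = ProjectIndex()
index.submit(ProjectRecord(
    "tinyfs", "https://git.example.org/alice/tinyfs",
    languages=["Rust"], licenses=["MIT"], keywords=["filesystem", "fuse"],
))
index.submit(ProjectRecord(
    "notekeeper", "https://code.example.net/bob/notekeeper",
    languages=["Python"], licenses=["GPL-3.0-or-later"], keywords=["notes", "cli"],
))

# Filter on exact metadata fields instead of matching free text.
print(index.search(language="Rust", keyword="fuse"))
```

The search filters on fields the owner declared, which is the kind of query a general-purpose crawler has trouble answering from page text alone.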