mooiedingen 2 days ago

Nothing new here, it has been done before, and the concept is simple enough. Step 1: an indexer (Solr/Lucene). Step 2: a crawler, of which there are several FOSS ones, or build one yourself. Or just run YaCy, which combines both; hook it up to an old-school searx instance and you will be granted the title of seeker by the spirit of Fravia+, who was elder of the searchlores!!! Not only will you filter out the crap made by machine learning models, but thou shalt find what thou seekest! I refuse to call a 16-line for loop triggering tokenized data loaded in memory, where the data can be anything from a scientific paper hallucinated by a chatbot to a message between two lovers, anything intelligent. It is not intelligence but a blob of tokenized fcking data in memory, getting triggered for an output by a derp with a 16-line for loop!!!
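To show how small the indexer half of that recipe can be, here is a minimal sketch of an inverted index in plain Python. It is not how Solr, Lucene, or YaCy work internally; the function names `tokenize`, `build_index`, and `search`, and the sample documents, are made up for illustration.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(docs):
    """Map each token to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in tokenize(text):
            index[token].add(doc_id)
    return index


def search(index, query):
    """Return ids of documents containing every query token (AND search)."""
    tokens = tokenize(query)
    if not tokens:
        return set()
    # Copy the first posting set so intersections never mutate the index.
    result = set(index.get(tokens[0], set()))
    for token in tokens[1:]:
        result &= index.get(token, set())
    return result


# Hypothetical sample documents; a real crawler would supply fetched pages.
docs = {
    "a": "Xapian is a search engine library",
    "b": "YaCy combines a crawler and an indexer",
    "c": "A crawler feeds pages to the search indexer",
}
index = build_index(docs)
```

Calling `search(index, "crawler indexer")` returns the ids of the documents that contain both words. A crawler would only need to keep adding entries to `docs`; ranking, stemming, and persistence are what the real engines add on top.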

rurban 2 days ago

Xapian is easier and faster. No Java memory eater.

I once built a good company-wide search engine with custom crawlers and result hooks, e.g. into crazy SAP or other ticket systems. Gmane was also legendary.