ww520 | 7 days ago
Mad respect. This is an incredible project, pulling together all these technologies. The crown jewel of a search engine is its ranking algorithm, and I'm not sure how the LLM is being used in that regard here. One effective old technique for ranking is to capture the search-to-click relationship from real users. It's essentially training data of humans mapping the search terms they entered to the links they clicked. With just a few clicks, the ranking relevance goes way up. Maybe feeding that data into a neural net would help ranking. It becomes a classification problem: given these terms, which links have the higher probability of being clicked? More people clicking on a link for a term would strengthen the weights.
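A rough sketch of the count-based version of this idea, before any neural net gets involved; everything here (the ClickModel name, the methods, the example URLs) is invented for illustration, and a real system would feed these aggregated counts into a classifier rather than score them directly:

    from collections import defaultdict

    class ClickModel:
        """Illustrative sketch: rank links by observed click-through
        frequency per query term. Not from the original project."""

        def __init__(self):
            # weights[term][url]: how often users clicked url
            # after searching for term
            self.weights = defaultdict(lambda: defaultdict(float))

        def record_click(self, query: str, url: str):
            # Each click strengthens the term -> url association.
            for term in query.lower().split():
                self.weights[term][url] += 1.0

        def score(self, query: str, url: str) -> float:
            # Probability-like score: this url's clicks divided by
            # all clicks observed for each of the query's terms.
            score = 0.0
            for term in query.lower().split():
                total = sum(self.weights[term].values())
                if total:
                    score += self.weights[term][url] / total
            return score

        def rank(self, query: str, candidates: list[str]) -> list[str]:
            return sorted(candidates,
                          key=lambda u: self.score(query, u),
                          reverse=True)

    model = ClickModel()
    model.record_click("rust web framework", "https://example.com/axum")
    model.record_click("rust web framework", "https://example.com/axum")
    model.record_click("rust web framework", "https://example.com/rocket")
    print(model.rank("rust framework", ["https://example.com/rocket",
                                        "https://example.com/axum"]))

The neural-net version would treat each (query terms, link) pair as an input and the click as the positive label, so repeated clicks strengthen the same weights the counts capture here.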
lelanthran | 7 days ago | parent
> One effective old technique for ranking is to capture the search-to-click relationship from real users. It's essentially training data of humans mapping the search terms they entered to the links they clicked. With just a few clicks, the ranking relevance goes way up.

That's not very effective. Ever heard of clickbait?

Like I've said countless times before, the only effective technique for cleaning garbage out of search results is a point system that penalises each third-party advertisement placed on the page: the more adverts, the lower the rank. That will work because it directly addresses the incentive for producing garbage, which is money. The result should be: when two sites have the same basic content, promote the one without ads over the ones with ads in the search results.

Until this is done, search engines will continue serving garbage, because they are rewarding the actors who produce it.
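A minimal sketch of how such an ad-count penalty could be wired into scoring, assuming some upstream detector has already counted the third-party ad slots on each page; the function name and the penalty constant are invented for illustration:

    def adjusted_rank_score(base_relevance: float, ad_count: int,
                            penalty: float = 0.15) -> float:
        """Hypothetical: demote a page's relevance score for each
        detected third-party ad slot. `penalty` is an arbitrary
        tuning constant, not from any real engine."""
        return base_relevance * (1.0 - penalty) ** ad_count

    # Two pages with identical content relevance: the ad-free one wins.
    print(adjusted_rank_score(0.9, ad_count=0))  # 0.9
    print(adjusted_rank_score(0.9, ad_count=6))  # ~0.34

A multiplicative penalty like this preserves relevance ordering among equally ad-laden pages while guaranteeing that, of two pages with the same base relevance, the one with fewer ads always ranks higher.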