m00dy 3 hours ago
RAG is broken when you have too much data.
plingamp 2 hours ago
Specifically, once the number of documents reaches roughly 10k or more, a phenomenon called "Semantic Collapse" occurs. https://dho.stanford.edu/wp-content/uploads/Legal_RAG_Halluc...
thunky 3 hours ago
Gemini with Google search is RAG using all public data, and it isn't broken.
PlatoIsADisease 2 hours ago
Can't you make the thresholds higher? Hmm... I guess not, you might want all that data. Super interesting topic. Learning a lot.