acutesoftware · 9 hours ago
I am using LangChain with a SQLite database. It works pretty well on a 16 GB GPU, and it also ran on a crappy NUC, just with lesser results. The real lightbulb moment is when you realise the ONLY thing a RAG pipeline passes to the LLM is a short string of search results made of small chunks of text. That changes it from 'magic' to 'ahh, ok - I need better search results'. With small models you cannot pass a lot of search results (TOP_K=5 is probably the limit), otherwise they 'forget' the context. It is fun trying to get decent results, and it is a rabbit hole; the next step I am going into is pre-summarising files and folders. I open-sourced the code I was using: https://github.com/acutesoftware/lifepim-ai-core
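A minimal, runnable sketch of that lightbulb point, in plain Python rather than LangChain: the retrieval step plus one prompt string is all the model ever sees. The corpus, the keyword scorer, and the function names here are toy stand-ins for illustration, not code from lifepim-ai-core.

```python
# Toy sketch: everything a RAG pipeline hands the LLM is one short string
# built from a few retrieved chunks. CORPUS and the scorer are stand-ins;
# a real setup (e.g. LangChain over SQLite) uses vector similarity instead.

TOP_K = 5  # with small models, more chunks than this tend to get "forgotten"

CORPUS = [
    "LifePIM stores notes, tasks and files in a local SQLite database.",
    "RAG retrieves the top-k most relevant chunks for a question.",
    "Small models lose track of long contexts, so keep top_k low.",
]

def search_chunks(question: str, top_k: int = TOP_K) -> list[str]:
    """Toy keyword-overlap ranking; real pipelines use embeddings."""
    q_words = set(question.lower().split())
    ranked = sorted(CORPUS,
                    key=lambda c: -len(q_words & set(c.lower().split())))
    return ranked[:top_k]

def build_prompt(question: str, chunks: list[str]) -> str:
    """This string is the ONLY thing the LLM ever sees."""
    context = "\n---\n".join(chunks)
    return ("Answer using only the context below.\n\n"
            f"Context:\n{context}\n\nQuestion: {question}\nAnswer:")

question = "How does RAG pass context to the LLM?"
print(build_prompt(question, search_chunks(question)))
```

Better search results mean better chunks in that string, which is why tuning retrieval (and pre-summarising files) pays off more than fiddling with the model.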
reactordev · 7 hours ago, in reply
You can expand the context window to something like 100,000 tokens to prevent that kind of memory loss.
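For local models this is usually a single parameter. A hedged sketch, assuming a GGUF model run with llama-cpp-python (one common runner; the model path is a placeholder, and whether 100k tokens is actually usable depends on the model's trained context length and on memory, which grows with the window):

```python
# Sketch: raising the context window with llama-cpp-python.
from llama_cpp import Llama

llm = Llama(
    model_path="./models/your-model.gguf",  # placeholder path
    n_ctx=100_000,  # context window in tokens; large values need much more RAM/VRAM
)
```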