anttiharju a day ago
> It's like a genie, take care with what you wish for.

I used to be stuck with this thought. But I came across this delightful documentation RAG project and got to chat with the devs. The idea was that people ask natural-language questions and get shown the relevant chunk of docs for that query. They were effectively pleading to a genie, if I understood it right. Worse yet, the genie (the LLM) was updated weekly by the cloud platform they were using.

But the devs were engineers. They had a sample set of docs and a sample set of questions for which they knew the intended chunk. So after model updates they ran the system through this test matrix and used the results as feedback for tuning the system prompt. They said they had been doing it for a few months with good results, with search remaining capable over time despite model changes.

While these agents.md files etc. appear to be useful, I'm not sure they're going to be the key to long-term success. Maybe with a model change they become much less effective and the hours previously spent on them are wasted. I think something more verifiable/strict is going to be the secret sauce for LLM agents. Engineering.

I have heard Claude Code has decent scaffolding. Haven't gotten the chance to play with it myself though. I liked the headline from some time ago: "What if LLMs are just another piece of technology?"
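For what it's worth, the test-matrix idea could look roughly like this. This is a minimal sketch in Python, not the project's actual code: `Case`, `hit_rate`, `failures`, the sample chunks and the keyword-overlap retriever are all made up for illustration. A real setup would swap `keyword_retrieve` for the call into the LLM-backed search.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Case:
    """One row of the test matrix: a question and the chunk it should return."""
    question: str
    expected_chunk_id: str


def hit_rate(retrieve: Callable[[str], str], cases: list[Case]) -> float:
    """Fraction of questions for which retrieve() returns the intended chunk."""
    if not cases:
        raise ValueError("need at least one test case")
    hits = sum(1 for c in cases if retrieve(c.question) == c.expected_chunk_id)
    return hits / len(cases)


def failures(retrieve: Callable[[str], str], cases: list[Case]) -> list[Case]:
    """Cases that regressed; these are the feedback for prompt tuning."""
    return [c for c in cases if retrieve(c.question) != c.expected_chunk_id]


# Hypothetical sample docs, keyed by chunk id.
CHUNKS = {
    "install": "run the installer and follow the setup wizard",
    "auth": "create an api token to log in from the command line",
    "billing": "invoices are issued on the first day of each month",
}


def keyword_retrieve(question: str) -> str:
    """Stand-in retriever (assumption): pick the chunk sharing the most words."""
    words = set(question.lower().split())
    return max(CHUNKS, key=lambda cid: len(words & set(CHUNKS[cid].split())))


# Hypothetical sample questions with known intended chunks.
CASES = [
    Case("how do I run the installer", "install"),
    Case("where do I create an api token", "auth"),
    Case("when are invoices issued", "billing"),
]

if __name__ == "__main__":
    print(f"hit rate: {hit_rate(keyword_retrieve, CASES):.2f}")
    for case in failures(keyword_retrieve, CASES):
        print(f"MISS: {case.question!r} should return {case.expected_chunk_id!r}")
```

Rerun it after every model update and treat any drop in hit rate as the signal to revisit the system prompt.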