therein | 4 days ago
Probably prior DARPA research or something. Also, slightly tangentially: people tell me it was just that it was new and novel, and that's why we were impressed, but I almost think things went downhill after ChatGPT 3. I felt like 2.5 (or whatever they called it) was able to give better insights from the model weights themselves. The moment tool use became a thing and we started doing RAG, memory, and search-engine tool use, it actually got worse.

I'm also pretty sure we are lobotomizing the things that would feel closer to critical thinking by training it to be sensitive to the taboo of the day. I suspect the earlier ones were less broken by that.

How would it distinguish between knowing something from its training and needing to use a tool to synthesize a response, anyway?
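For what it's worth, with the usual OpenAI-style tool-calling setup there isn't a separate mechanism making that distinction; the model itself decides on each turn whether to answer from its weights or to emit a tool call, and about the only lever you have is the system prompt and the tool descriptions. A rough sketch of what that looks like (assuming the OpenAI Python SDK; the web_search tool and its description are just made up for illustration):

    from openai import OpenAI

    client = OpenAI()

    # One made-up tool; the description is the main hint the model gets
    # about when to reach for it instead of answering from its weights.
    tools = [{
        "type": "function",
        "function": {
            "name": "web_search",
            "description": "Search the web. Only use this for facts you are unsure of.",
            "parameters": {
                "type": "object",
                "properties": {"query": {"type": "string"}},
                "required": ["query"],
            },
        },
    }]

    resp = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": "Who wrote The Master and Margarita?"}],
        tools=tools,
    )

    msg = resp.choices[0].message
    # The routing decision is entirely the model's: it either returns
    # tool_calls or plain content, and nothing outside the model arbitrates.
    if msg.tool_calls:
        print("model chose the tool:", msg.tool_calls[0].function.arguments)
    else:
        print("model answered from its weights:", msg.content)

So the "decide between knowing and searching" step is itself baked into the fine-tuning, which is part of why it feels like the underlying model got dumber once everything started going through tools.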