| ▲ | dathinab 2 hours ago | |
except you don't want knowledge in the model, and most of that "size" comes from "encoded knowledge", i.e. over fitting. The goal should be to only have language handling in the model, and the knowledge in a database you can actually update, analyze etc. It's just really hard to do so. "world models" (for cars) maybe make sense for self driving, but they are also just a crude workaround to have a physics simulation to push understanding of physics. Through in difference to most topics, basic, physics tend to not change randomly and it's based on observation of reality, so it probably can work. Law, health advice, programming stuff etc. on the other hand changes all the time and is all based on what humans wrote about it. Which in some areas (e.g. law or health) is very commonly outdated, wrong or at least incomplete in a dangerous way. And for programming changes all the time. Having this separation of language processing and knowledge sources is ... hard, language is messy and often interleaves with information. But this is most likely achievable with smaller models. Actually it might even be easier with a small model. (Through if the necessary knowledge bases are achievable to fit on run on a mac is another topic...) And this should be the goal of AI companies, as it's the only long term sustainable approach as far as I can tell. I say should because it may not be, because if they solve it that way and someone manages to clone their success then they lose all their moat for specialized areas as people can create knowledge bases for those areas with know-how OpenAI simple doesn't have access to. (Which would be a preferable outcome as it means actual competition and a potential fair working market.) | ||
| ▲ | dathinab 2 hours ago | parent [-] | |
as a concrete outdated case: TLS cipher X25519MLKEM768 is recommended to be enabled on servers which do support it last time I checked AI didn't even list it when you asked it for a list of TLS 1.3 ciphers (through it has been widely supported since even before it was fully standardized..) this isn't surprising as most input sources AI can use for training are outdated and also don't list it maybe someone of OpenAI will spot this and feet it explicitly into the next training cycle, or people will cover it more and through this it is feed implicitly there but what about all that many niche but important information with just a handful of outdated stack overflow posts or similar? (which are unlikely to get updated now that everyone uses AI instead..) The current "lets just train bigger models with more encoded data approach" just doesn't work, it can get you quite far, tho. But then hits a ceiling. And trying to fix it by giving it also additional knowledge "it can ask if it doesn't know" has so far not worked because it reliably doesn't realize it doesn't know if it has enough outdated/incomplete/wrong information encoded in the model. Only by assuring it doesn't have any specialized domain knowledge can you make sure that approach works IMHO. | ||