littlestymaar | 6 days ago
> they’re specialised tools and this one is not designed to have broad “common sense” in that way.

Except the key property of language models, compared to other machine-learning techniques, is precisely this kind of common-sense grasp of what natural language means.

> you don’t understand the use case of this enough to be commenting on it at all quite frankly.

It’s true that I don’t understand the use case for a language model that has no grasp of what first/second/third mean. Sub-1B models are supposed to be fine-tuned to be useful, but if the base model is so bad at language that it can’t tell “first” from “second”, and you have to teach it that in your fine-tuning on top of your business logic, why start from a base model at all?

Also, this is a clear instance of moving the goalposts: the comment I responded to argued that we shouldn’t expect such a small model to have “encyclopedic knowledge”, and now you’re claiming we shouldn’t expect such a small language model to make sense of language…
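To make the disagreement concrete, this is roughly what probing a base model’s grasp of ordinals looks like; a sketch using Hugging Face transformers, where the SmolLM2-135M checkpoint is just one example of a sub-1B base model, not necessarily the one discussed here, and the prompt is made up:

    # Probe whether a small base model distinguishes "second" from
    # "first"/"third" by comparing the scores it gives candidate completions.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    name = "HuggingFaceTB/SmolLM2-135M"  # example sub-1B base checkpoint
    tok = AutoTokenizer.from_pretrained(name)
    model = AutoModelForCausalLM.from_pretrained(name)

    prompt = "Mercury, Venus, Earth. The second planet in this list is"
    inputs = tok(prompt, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits[0, -1]  # scores for the next token

    # Score only the first token of each candidate name; a base model with
    # a working notion of "second" should rank " Venus" highest.
    for cand in [" Mercury", " Venus", " Earth"]:
        tid = tok.encode(cand, add_special_tokens=False)[0]
        print(cand, float(logits[tid]))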
jama211 | 6 days ago | parent
Don’t put words in my mouth; I didn’t say that, and no goalposts have been moved. You don’t understand how tiny this model is or what it’s built for. Don’t you get it? This model PHYSICALLY COULDN’T be this small and also hold decent interactions on topics outside its specialty. It’s like criticising a go-kart for its lack of luggage capacity: it’s simply not what it’s built for. You’re just being defensive because deep down you know you don’t understand this, which you reveal at every turn. It’s OK to accept the responses of the people in this thread who are trying to lead you to the truth of the matter.