nowittyusername | a month ago |
I'd read the paper before I made the statement, and I still made it, because there are issues with their paper.

The first problem is that the way Anthropic trains its models, and the architecture of those models, is different from most of the open-source models people use. They are still transformer-based, but they are not structurally put together the same way as most models, so you can't extrapolate findings on their models to other models. Their training methods also use a lot more regularization of the data, trying to weed out targeted biases as much as possible, meaning the models are trained on more synthetic data that tries to normalize across languages, tone, etc. The same goes for the system prompt: theirs is treated differently than in open-source models, which internally just append the system prompt in front of the user's query. Attention is applied differently, among other things.

Second, the way their models "internalize" the world is vastly different from what humans would think of as "building a world model" of reality. It's hard to put into words, but their models do have an underlying representative structure; it's just not anything that would be of use in the domains humans care about, "true reasoning". Grokking the concept, if you will.

Honestly, I highly suggest folks take a lot of what Anthropic publishes with a grain of salt. I feel that a lot of the information they present is purposely misinterpreted by their own teams for media attention, PR, clout, or who knows what reasons. But the biggest reason is the one I stated at the beginning: most models are not of the same ilk as Anthropic's. I would suggest folks focus on reading interpretability research on open-source models, as those are the ones corporations are most likely to use for their cheap API costs, and those models have nowhere near the care and sophistication put into them that Anthropic's models do.
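(For readers unfamiliar with the system-prompt point above: many open-weight chat models don't treat the system prompt as a separate channel at all; a chat template simply serializes it ahead of the user turn into one flat string before tokenization. The sketch below uses a ChatML-style layout purely as an illustration; the exact special tokens vary by model and are an assumption here, not any particular model's template.)

```python
# Illustrative sketch only: how many open-weight chat models flatten a
# system prompt and a user query into a single prompt string, with the
# system text simply prepended. The <|im_start|>/<|im_end|> markers are
# a ChatML-style convention; real models each define their own template.

def build_prompt(system: str, user: str) -> str:
    """Serialize a system prompt and user turn into one flat string,
    system text first, ending where the assistant's reply would begin."""
    return (
        f"<|im_start|>system\n{system}<|im_end|>\n"
        f"<|im_start|>user\n{user}<|im_end|>\n"
        f"<|im_start|>assistant\n"
    )

prompt = build_prompt("You are a helpful assistant.", "Explain attention.")
# The model sees one undifferentiated token stream; the "system" role is
# just text occurring earlier in the sequence.
assert prompt.index("helpful assistant") < prompt.index("Explain attention")
```

The point being made in the comment is that a hosted model may handle the system prompt with dedicated machinery, whereas in this flattened scheme it is only positionally distinguished, so attention treats it like any other prefix text.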
bunderbunder | a month ago | parent |
> I feel that a lot of information they present is purposely misinterpreted by their teams for media or pr/clout or who knows what reasons.

I think it's just the culture of machine learning research at this point. Academics are better about it, but still far from squeaky clean. It can't be squeaky clean: if you aren't willing to make grand, overinflated claims to help attract funding, someone else will be, and they'll get the research funding, so they'll be the ones who get to publish research. It's like an anthropic principle of AI research. (rimshot)