transcriptase 3 days ago:
[flagged] | ||
kbenson 3 days ago | parent | next:
The only way I can understand you coming to that conclusion is if you assumed that's what they were going to be and didn't actually read any of them.
BoorishBears 3 days ago | parent | prev:
No, it doesn't. The only negative comments are about the cringey presentation.

I spend a lot of time post-training models to rid them of their "default alignment", so I'd have loved it if this did something interesting. But reading the technical report, I get the impression they spent more effort on the branding than on the actual model.

What I'm honestly wondering is whether they post-trained Llama 3 405B again because they didn't care enough to pick a new post-training target, or because they realized they'd get worse-than-baseline performance out of any recent release with their current approach.