sulam 2 hours ago
That’s misunderstanding why these models are behind. A large part of why they’re behind is they aren’t able to do the reinforcement learning post-training steps that take a pre-trained model and turn it into a frontier model like GPT 5 or Opus. Instead they do their best to recreate these models using distillation. Fundamentally, you can never distill your way to being the teacher, so these approaches will not advance the frontier. [edit: after thinking about it, I think my phrasing is unfair. It's not necessarily that they aren't able to do it, but they haven't yet shown that they are willing to do it.]
computerex 2 hours ago
That’s not remotely true. They did distillation as a cheap solution to the cold-start problem. You need data/trajectories to hill-climb to higher capabilities. All large Chinese labs do RLAIF.
FpUser 2 hours ago
> "they aren’t able to do the reinforcement learning post-training steps"

Not yet. If there is a need, someone will come along and fulfill it. Personally, I no longer even want to use top models. Professionally, I use AI to help with coding through the Junie agent that comes with JetBrains IDEs. Junie is set to use Gemini Flash, and it works fine for what I (emphasis on "I") ask it to do. I tried more advanced models and different vendors, only to watch credits go down the toilet without any extra benefit.