runeblaze 11 hours ago
Is it though? There is a reason GPT has Codex variants: RL on a specific task raises performance on that task.
jjmarr 10 hours ago | parent
Post-training doesn't transfer when a new base model arrives, so anyone who adopted a task-specific LLM gets burned when the next generational advance comes out.