▲ | deciduously a day ago | |
This feels like more of a system prompt issue than a model issue? | ||
▲ | waleedlatif1 a day ago | parent | next [-] | |
imo, model regression might actually stem from more aggressive use of toolformer-style prompting, or even RLHF tuning optimizing for obedience over initiative. i bet if you ran comparable tasks across 3-5, 3-7, and 4-0 with consistent prompts and tool access disabled, the underlying model capabilities might be closer than it seems. | ||
▲ | ketzo a day ago | parent | prev | next [-] | |
Anecdotal of course, but I feel a very distinct difference between 3.5 and 3.7 when swapping between them in Cursor’s Agent mode (so the system prompt stays consistent). | ||
▲ | a day ago | parent | prev [-] | |
[deleted] |