| ▲ | capevace 6 hours ago |
| The industry seems to be moving towards two tiers: low-latency/high-speed models for direct interaction, and slow, long-thinking models for longer tasks and deeper reasoning. Quick/instant LLMs for human use (think UI); slow, deep-thinking LLMs for autonomous agents. |
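The two-tier pattern described above can be sketched as a simple router. This is a minimal illustration, not any vendor's actual API: the model names, the `interactive` flag, and the one-minute latency budget are all hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Task:
    prompt: str
    interactive: bool          # True for direct human interaction (UI)
    est_minutes: float = 0.0   # rough time budget for autonomous work

def pick_model(task: Task) -> str:
    """Route UI-latency work to a fast model, long agent runs to a deep one.

    Hypothetical model names; the threshold is an illustrative assumption.
    """
    if task.interactive or task.est_minutes < 1.0:
        return "fast-model"      # low latency, shallower reasoning
    return "thinking-model"      # high latency, extended reasoning

print(pick_model(Task("rename this variable", interactive=True)))
# fast-model
print(pick_model(Task("fuse these kernels", interactive=False, est_minutes=30)))
# thinking-model
```

In practice the routing signal might come from the calling context (chat UI vs. agent harness) rather than a per-task estimate, but the split is the same.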
|
| ▲ | gaigalas 6 hours ago | parent | next [-] |
| You always want faster feedback. If it's not a human leveraging the fast cycles, it's another automated system (e.g. CI). Slow, deep tasks are mostly for flashy one-shot demos with little to no practical use in the real world. |
| ▲ | foobar10000 4 hours ago | parent [-] |
| I mean, yes, one always wants faster feedback; can't argue with that! But some of the longer tasks, like automating kernel fusion, are just hard problems, and a small model (or even most bigger ones) will not get the direction right. |
| ▲ | gaigalas 4 hours ago | parent [-] |
| In my experience, larger models also fail to get the direction right a surprising number of times. You just take longer to notice when it happens, or start being defensive (over-speccing) to account for the longer waits. Even the simplest task can appear "hard" with that over-specced approach (like building a React app). Iterating with a faster model is, from my perspective, the superior approach: whatever the task's complexity, the quick feedback more than compensates. |
| ▲ | varispeed 6 hours ago | parent | prev [-] |
| Are they really thinking, or is someone just sprinkling them with Sleep(x)? |