energy123 | 4 hours ago

Coding, for some future definition of "small model" that expands to include today's frontier models. What I commented a few days ago on the codex-spark release:

"""
We're going to see a further bifurcation in inference use-cases over the next 12 months. I expect this distinction to become prominent:

(A) Massively parallel (optimize for tokens/$)
(B) Serial, low latency (optimize for tokens/s)

Users will switch between A and B depending on need.

An example of (A):

- "Use subagents to search this 1M-line codebase for DRY violations subject to $spec."

Examples of (B):

- "Diagnose this one specific bug."
- "Apply these text edits."

(B) is used in funnels to unblock (A).
"""
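The switching between (A) and (B) the comment describes could be sketched as a simple request router. This is a minimal, hypothetical illustration: the profile names, keyword heuristic, and fields are my own invention, not any real inference API.

```python
from dataclasses import dataclass

@dataclass
class InferenceProfile:
    # Illustrative fields only; not a real provider's configuration schema.
    name: str
    parallel: bool       # fan the task out across many workers?
    optimize_for: str    # "tokens_per_dollar" (A) or "tokens_per_second" (B)

# (A): massively parallel, throughput-per-dollar oriented.
BATCH = InferenceProfile("massively_parallel", True, "tokens_per_dollar")
# (B): serial, latency oriented.
INTERACTIVE = InferenceProfile("serial_low_latency", False, "tokens_per_second")

def route(task: str) -> InferenceProfile:
    """Pick (A) for broad, parallelizable sweeps; (B) for one focused task.

    The keyword heuristic is a stand-in for however a real system would
    classify requests.
    """
    broad_keywords = ("search", "scan", "audit", "subagents")
    if any(k in task.lower() for k in broad_keywords):
        return BATCH
    return INTERACTIVE

print(route("Use subagents to search this codebase for DRY violations").name)
print(route("Diagnose this one specific bug").name)
```

Here (B)-style calls can sit in front of (A)-style fan-outs, matching the "funnel" relationship the comment mentions.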