| ▲ | waynecochran 13 hours ago | ||||||||||||||||
Conclusion: both are true which makes sense. The KV cache scaling yields both the emergent power and requires the enormous capacity. | |||||||||||||||||
| ▲ | JKCalhoun 13 hours ago | parent [-] | ||||||||||||||||
Which does sort of hint at a (power/profitability) ceiling on the LLM line of AI… That should make the industry nervous. | |||||||||||||||||
| |||||||||||||||||