jasonjmcghee · 8 months ago
I am excitedly waiting for the first company (guessing / hoping it'll be Anthropic) to invest heavily in improvements to caching. The big ones that come to mind are cheap long-term caching, innovations in compaction, and differential approaches: is there a way to reuse only the parts of the cached input context we need?
manmal · 8 months ago · parent
Isn't the problem there that a cache would be model-specific, with cached items only valid for exactly the same weights and inference engine? I think both of those are heavily iterated on.
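To make the invalidation point concrete, here's a minimal sketch (my own illustration, not any provider's actual scheme) of how a KV-cache key might be derived. The names `weights_rev` and `engine_rev` are hypothetical; the point is that if either changes, the key changes, so every cached entry from the old deployment becomes a miss:

```python
import hashlib

def cache_key(model_id: str, weights_rev: str, engine_rev: str,
              prefix_tokens: list[int]) -> str:
    """Derive a cache key for a tokenized prompt prefix.

    Hypothetical sketch: any change to the weights revision or the
    inference-engine revision produces a different key, so cached KV
    entries are only reusable on exactly the same deployment.
    """
    h = hashlib.sha256()
    h.update(model_id.encode())
    h.update(weights_rev.encode())
    h.update(engine_rev.encode())
    # Delimit tokens so e.g. [12, 3] and [1, 23] hash differently.
    h.update(b",".join(str(t).encode() for t in prefix_tokens))
    return h.hexdigest()

same_prompt = [101, 2023, 2003, 1037]
k_old = cache_key("model-x", "weights-v1", "engine-1.3", same_prompt)
k_new = cache_key("model-x", "weights-v2", "engine-1.3", same_prompt)
print(k_old != k_new)  # new weights revision -> cache miss for identical prompt
```

This also hints at why "differential" reuse is hard: the cached values are activations computed by one specific set of weights, not something weight-agnostic like raw text.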