gunalx, 4 hours ago:
I have come across claims that turning on caching gives the LLM a faint memory of what was in the cache, even for unrelated queries. If that were the case, it would be entirely unreasonable to share the cache, because of the possibility of information leakage.
weird-eye-issue, 2 hours ago:
This is absolutely 100% incorrect.
samwho, 3 hours ago:
How would information leak, though? There's no difference in the probability distribution the model outputs when caching vs not caching.
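samwho's point can be illustrated with a toy sketch. A (correctly implemented) prompt/prefix cache only memoizes computation that is fully determined by the prompt tokens, so a cache hit and a cold run produce the same result. Everything below is hypothetical: the "model" is a hash function standing in for a forward pass, and the cache stores the prefix directly rather than attention keys/values as a real KV cache would.

```python
import hashlib

# Toy "model": a deterministic function from a token sequence to the
# next token. Stands in for a transformer forward pass; all names here
# are hypothetical, not any real inference API.
def forward(tokens):
    h = hashlib.sha256(" ".join(tokens).encode()).hexdigest()
    return f"tok_{int(h, 16) % 1000}"

# Prefix cache: maps a prompt prefix to its precomputed "state"
# (here just the tokens themselves, mimicking a KV cache whose
# contents are fully determined by the prefix).
cache = {}

def generate(prompt_tokens, n, use_cache):
    key = tuple(prompt_tokens)
    if use_cache and key in cache:
        tokens = list(cache[key])     # cache hit: reuse prefix state
    else:
        tokens = list(prompt_tokens)  # cold run: recompute from scratch
        cache[key] = tuple(tokens)
    out = []
    for _ in range(n):
        nxt = forward(tokens)
        tokens.append(nxt)
        out.append(nxt)
    return out

prompt = ["the", "cache", "question"]
cold = generate(prompt, 3, use_cache=False)  # populates the cache
warm = generate(prompt, 3, use_cache=True)   # served from the cache
assert cold == warm  # caching changes latency, not the outputs
```

The only observable difference between the two runs is speed, which is why sharing such a cache can leak *timing* information (a hit reveals someone already sent that prefix) but not the cached content itself into unrelated completions.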