samwho 4 hours ago

The only thing that comes to mind is some kind of timing attack. Send loads of requests specific to a company you’re trying to spy on and if it comes back cached you know someone has sent that prompt recently. Expensive attack, though, with a large search space.
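The attack described above can be sketched as a latency probe against a shared prompt cache. This is a minimal simulation, not any real provider's API: the cache, the 50 ms "inference" delay, and the hit threshold are all illustrative assumptions.

```python
import time

# Hypothetical shared prompt cache mapping prompt -> response.
# All names and latencies here are made up for illustration.
CACHE: dict[str, str] = {}

def query_llm(prompt: str) -> tuple[str, float]:
    """Return (response, elapsed seconds). Cache hits skip the slow path."""
    start = time.perf_counter()
    if prompt in CACHE:
        response = CACHE[prompt]           # fast path: served from cache
    else:
        time.sleep(0.05)                   # stand-in for expensive inference
        response = f"answer to: {prompt}"
        CACHE[prompt] = response
    return response, time.perf_counter() - start

# The victim sends a sensitive prompt first, warming the shared cache...
victim_prompt = "Summarise AcmeCorp's Q3 acquisition plans"
query_llm(victim_prompt)

# ...then the attacker probes the same prompt and times the response.
_, elapsed = query_llm(victim_prompt)
print("likely cached" if elapsed < 0.01 else "cold")
```

The attacker never sees the victim's response; the only leaked bit is "someone recently sent this exact prompt", which is why the search space matters so much for the attack's cost.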

gunalx 3 hours ago | parent [-]

I have come across claims that turning on caching gives the LLM a faint memory of what was in the cache, even for unrelated queries. If that's the case, it's entirely unreasonable to share the cache, because of the possibility of information leakage.

weird-eye-issue 2 hours ago | parent | next [-]

This is absolutely 100% incorrect.

samwho 3 hours ago | parent | prev [-]

How would information leak, though? There’s no difference in the probability distribution the model outputs when caching vs not caching.
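The point here can be illustrated with plain memoisation: if the forward pass is deterministic, a cache only changes *when* the answer is computed, not *what* the answer is. This is a toy stand-in, not a real model; real prompt caching reuses attention KV states, but the key property assumed here is the same.

```python
from functools import lru_cache

def forward(prompt: str) -> list[float]:
    # Toy deterministic "model": fake logits derived from the prompt.
    base = sum(map(ord, prompt))
    return [float(base * k) for k in (1, 2, 3)]

# A cached variant of the same function.
cached_forward = lru_cache(maxsize=None)(forward)

prompt = "hello"
uncached = forward(prompt)
warm = cached_forward(prompt)      # first call: computes and stores
hit = cached_forward(prompt)       # second call: served from cache
print(uncached == warm == hit)
```

Under this assumption the cached and uncached outputs are bit-identical, so any leakage would have to come through a side channel like timing, not through the outputs themselves.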