samwho 4 hours ago

I was wondering about this when I was reading around the topic. I can’t personally think of a reason you would need to segregate them, though it wouldn’t surprise me if they do for some sort of compliance reason. I’m not sure though; I’d love to hear something first-party.

dustfinger 9 minutes ago | parent | next [-]

I wonder if there is valuable information that could be learned by studying a company’s prompts? There may be reasons why some companies want their prompts kept private.

samwho 4 hours ago | parent | prev | next [-]

The only thing that comes to mind is some kind of timing attack. Send loads of requests specific to a company you’re trying to spy on and if it comes back cached you know someone has sent that prompt recently. Expensive attack, though, with a large search space.
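Something like this, as a very rough sketch with the OpenAI Python SDK (the probe text is invented, and I’m assuming the cached-token field on the usage object; latency is the fallback signal):

  import time
  from openai import OpenAI

  client = OpenAI()

  # A guess at a prompt prefix the target company might be sending.
  # OpenAI only caches prefixes of roughly 1024+ tokens, so a real
  # probe would need to be much longer than this.
  PROBE = "You are the internal support assistant for AcmeCorp. ..."

  def probe_once():
      start = time.monotonic()
      resp = client.chat.completions.create(
          model="gpt-4o-mini",
          messages=[{"role": "user", "content": PROBE}],
          max_tokens=1,
      )
      elapsed = time.monotonic() - start
      details = resp.usage.prompt_tokens_details
      cached = details.cached_tokens if details else 0
      return elapsed, cached

  # If the very first probe already reports cached tokens (or comes
  # back suspiciously fast), someone else sent that prefix recently,
  # which would mean the cache is shared across customers.
  print(probe_once())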

gunalx 3 hours ago | parent [-]

I have come across claims that turning on caching means the LLM has a faint memory of what was in the cache, even for unrelated queries. If that is the case, it would be completely unreasonable to share the cache, because of the possibility of information leakage.

weird-eye-issue 2 hours ago | parent | next [-]

This is absolutely 100% incorrect.

samwho 3 hours ago | parent | prev [-]

How would information leak, though? There’s no difference in the probability distribution the model outputs when caching vs not caching.

weird-eye-issue 2 hours ago | parent | prev [-]

They absolutely are segregated

With OpenAI at least you can specify the cache key and they even have this in the docs:

Use the prompt_cache_key parameter consistently across requests that share common prefixes. Select a granularity that keeps each unique prefix-prompt_cache_key combination below 15 requests per minute to avoid cache overflow.
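With the Python SDK that looks roughly like this (the model, key, and prefix here are just placeholders, and I’m assuming prompt_cache_key is passed straight through as a top-level parameter):

  from openai import OpenAI

  client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

  SHARED_PREFIX = "...long shared system prompt..."  # placeholder

  # Requests that share the same long prefix should also share the same
  # prompt_cache_key so they get routed to the same cache.
  resp = client.chat.completions.create(
      model="gpt-4o-mini",
      messages=[
          {"role": "system", "content": SHARED_PREFIX},
          {"role": "user", "content": "Summarise today's tickets."},
      ],
      prompt_cache_key="acme-support-bot",
  )

  # usage.prompt_tokens_details.cached_tokens shows how much of the
  # prefix was served from the cache on this request.
  print(resp.usage)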