samwho 4 hours ago

I was wondering about this when I was reading around the topic. I can’t personally think of a reason you would need to segregate them, though it wouldn’t surprise me if they do for some sort of compliance reason. I’m not sure though; I’d love to hear something first-party.

dustfinger 9 minutes ago | parent | next [-]

I wonder if there is valuable information that could be learned by studying a company’s prompts? There may be reasons why some companies want their prompts kept private.

samwho 4 hours ago | parent | prev | next [-]

The only thing that comes to mind is some kind of timing attack. Send loads of requests specific to a company you’re trying to spy on and if it comes back cached you know someone has sent that prompt recently. Expensive attack, though, with a large search space.
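Something like this, as a very rough sketch with the OpenAI Python SDK (the probe text is invented, and I’m assuming the cached-token field on the usage object; latency is the fallback signal):

  import time
  from openai import OpenAI

  client = OpenAI()

  # A guess at a prompt prefix the target company might be sending.
  # OpenAI only caches prefixes of roughly 1024+ tokens, so a real
  # probe would need to be much longer than this.
  PROBE = "You are the internal support assistant for AcmeCorp. ..."

  def probe_once():
      start = time.monotonic()
      resp = client.chat.completions.create(
          model="gpt-4o-mini",
          messages=[{"role": "user", "content": PROBE}],
          max_tokens=1,
      )
      elapsed = time.monotonic() - start
      details = resp.usage.prompt_tokens_details
      cached = details.cached_tokens if details else 0
      return elapsed, cached

  # If the very first probe already reports cached tokens (or comes
  # back suspiciously fast), someone else sent that prefix recently,
  # which would mean the cache is shared across customers.
  print(probe_once())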

gunalx 3 hours ago | parent [-]

I have come across claims that turning on caching means the LLM has a faint memory of what was in the cache, even for unrelated queries. If that is the case, it would be completely unreasonable to share the cache, because of the possibility of information leakage.

weird-eye-issue 2 hours ago | parent | next [-]

This is absolutely 100% incorrect.

samwho 3 hours ago | parent | prev [-]

How would information leak, though? There’s no difference in the probability distribution the model outputs when caching vs not caching.

weird-eye-issue 2 hours ago | parent | prev [-]

They absolutely are segregated

With OpenAI at least you can specify the cache key and they even have this in the docs:

Use the prompt_cache_key parameter consistently across requests that share common prefixes. Select a granularity that keeps each unique prefix-prompt_cache_key combination below 15 requests per minute to avoid cache overflow.
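With the Python SDK that looks roughly like this (the model, key, and prefix here are just placeholders, and I’m assuming prompt_cache_key is passed straight through as a top-level parameter):

  from openai import OpenAI

  client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

  SHARED_PREFIX = "...long shared system prompt..."  # placeholder

  # Requests that share the same long prefix should also share the same
  # prompt_cache_key so they get routed to the same cache.
  resp = client.chat.completions.create(
      model="gpt-4o-mini",
      messages=[
          {"role": "system", "content": SHARED_PREFIX},
          {"role": "user", "content": "Summarise today's tickets."},
      ],
      prompt_cache_key="acme-support-bot",
  )

  # usage.prompt_tokens_details.cached_tokens shows how much of the
  # prefix was served from the cache on this request.
  print(resp.usage)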