ipython 10 hours ago
As far as I know, of the two you identified, the only data cached inside the inference layer is the KV cache. Then again, I am not an expert in designing and operating inference, so I could be wrong about that. Either way, both of those are controlled by deterministic code and not the LLM itself. So controlling for that risk is much simpler to model IMO, since the mitigation can be applied universally and deterministically rather than hoping and praying some non-deterministic system will respect your wishes.
wolttam 7 hours ago
In other words: controlling for that kind of potential data-mixing is the same as in any other application where customer data is co-located within the same running process/memory/storage space.
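To make the "same as any other co-located data" point concrete, here is a minimal sketch of one deterministic mitigation: making the tenant ID part of every cache key, so a lookup for one customer can never return another customer's entry. This is a toy in-memory dictionary, not how any real inference server implements its KV cache; `TenantScopedCache`, `tenant_id`, and the byte-string "state" are all hypothetical names chosen for illustration.

```python
import hashlib
from typing import Dict, Optional, Tuple


class TenantScopedCache:
    """Toy prefix cache whose keys always include the tenant ID (hypothetical)."""

    def __init__(self) -> None:
        # Maps (tenant_id, prefix_digest) -> cached state.
        self._entries: Dict[Tuple[str, str], bytes] = {}

    @staticmethod
    def _key(tenant_id: str, prompt_prefix: str) -> Tuple[str, str]:
        # The tenant ID is a separate tuple element, so two tenants with an
        # identical prompt prefix still produce different keys.
        digest = hashlib.sha256(prompt_prefix.encode("utf-8")).hexdigest()
        return (tenant_id, digest)

    def put(self, tenant_id: str, prompt_prefix: str, state: bytes) -> None:
        self._entries[self._key(tenant_id, prompt_prefix)] = state

    def get(self, tenant_id: str, prompt_prefix: str) -> Optional[bytes]:
        return self._entries.get(self._key(tenant_id, prompt_prefix))


cache = TenantScopedCache()
cache.put("tenant-a", "shared system prompt", b"cached-state-for-a")

hit = cache.get("tenant-a", "shared system prompt")   # same tenant, same prefix
miss = cache.get("tenant-b", "shared system prompt")  # other tenant, same prefix
```

The isolation here lives entirely in ordinary code (`_key`), which is the point being made above: it applies to every request the same way and does not depend on the model choosing to behave.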