fzxu22 7 hours ago

Working on this: https://github.com/KevinXuxuxu/anon_proxy, a sort of anonymization proxy to use with LLM providers. It combines model-based (OpenAI privacy filter) and regex PII detection, swapping detected PII for placeholders in API requests and swapping it back in responses. With a locally hosted detection model, no PII leaves your local environment. I find it very useful, especially when you're working on sensitive documents (legal, tax, immigration, etc.); hope you find it helpful as well :)
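A minimal sketch of the regex half of this flow (the model-based detector is not reproduced here, and all names below are my assumptions, not anon_proxy's actual API): detected PII is replaced with typed placeholders before the request leaves the local environment, and the originals are saved for later restoration.

```python
import re

# Hypothetical patterns for illustration; a real deployment would use a
# fuller set plus the model-based detector.
PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}

def redact(text):
    """Replace matched PII with numbered placeholders; return the
    redacted text plus the placeholder -> original mapping."""
    mapping = {}
    for kind, pattern in PATTERNS.items():
        def sub(match):
            tag = f"<{kind}_{len(mapping)}>"
            mapping[tag] = match.group(0)
            return tag
        text = pattern.sub(sub, text)
    return text, mapping

redacted, saved = redact("SSN 123-45-6789, reach me at bob@example.com")
# redacted == "SSN <SSN_1>, reach me at <EMAIL_0>"
```

Only the redacted text is forwarded to the provider; the mapping stays local for restoring responses.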

stingraycharles 3 hours ago | parent | next [-]

How does it handle “unredaction” in responses? E.g. let’s say the LLM does something with the document. You redacted its input, so it emits redacted content. Now what?

MassiveQuasar 19 minutes ago | parent [-]

The way I handled it is by assigning each redacted tag an ID, which gets translated back to the saved PII in the output.
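The reverse direction can be sketched like this (names are assumptions for illustration): the model emits the same placeholder IDs it was given, and the proxy looks each one up in the locally saved mapping.

```python
def restore(response_text, mapping):
    """Translate placeholder IDs in the model's output back to the
    saved PII. `mapping` is the placeholder -> original-value dict
    built at redaction time."""
    for tag, original in mapping.items():
        response_text = response_text.replace(tag, original)
    return response_text

mapping = {"<NAME_0>": "Jane Doe", "<EMAIL_1>": "jane@example.com"}
out = restore("Draft a letter for <NAME_0> (<EMAIL_1>).", mapping)
# out == "Draft a letter for Jane Doe (jane@example.com)."
```

This only works if the model preserves the placeholders verbatim, which is why distinctive tag formats help.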

blfr 3 hours ago | parent | prev [-]

This is very cool because it allows you to use any model. Of course, it still lets the model and its operator see the entire non-redacted context of the conversation.

I quite like Moxie's Confer[1] approach to just encrypt the whole thing in such a way that no one except the end-user sees the plaintext.

[1] https://confer.to/