Show HN: Pseudonymizing sensitive data for LLMs without losing context (atticsecurity.com)
4 points by n00pn00p 3 days ago | 9 comments
glitchnsec a day ago
This is really cool. I'm still on V2, using NER to redact PII before sending to the model, but that was just for simple email analysis. I bet most teams building security products with AI haven't addressed this. Thanks for sharing!
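A minimal sketch of the NER-redaction shape described above. A real pipeline would use a trained NER model (e.g. spaCy) to tag PERSON/ORG/EMAIL spans; here a regex stands in for the entity detector, just to illustrate the redact-before-send step:

```python
import re

# Stand-in for a real NER model: catches only email addresses.
# A production redactor would tag names, orgs, addresses, etc.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+")

def redact(text: str) -> str:
    """Replace detected PII spans with a type placeholder before the LLM call."""
    return EMAIL_RE.sub("[EMAIL]", text)

print(redact("Contact jane.doe+test@example.com for details."))
# -> Contact [EMAIL] for details.
```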
bennettdixon 2 days ago
Nice write-up. One thing that stood out is the V2 to V3 jump. One of my clients is integrating personal wellness data with AI, and we took a slightly different route: the health data and personal data live in separate DBs with an encrypted mapping layer between them, so the model only sees health context attached to a pseudonymous per-user session. Your problem seems harder, because the PII is the signal/context. One challenge we are facing is re-identification, e.g. rich health profiles being identifiable in themselves. Curious whether you've thought about that side of things with your V3 implementation?
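The mapping-layer idea above can be sketched roughly like this (the names and schema here are hypothetical, not the commenter's actual system): a keyed HMAC derives a stable pseudonym per user, so the model-facing store never holds the real identifier, and only the holder of the key can link the two databases.

```python
import hmac
import hashlib
import secrets

# Key held only by the mapping layer; neither DB stores it.
MAPPING_KEY = secrets.token_bytes(32)

def pseudonym(user_id: str) -> str:
    """Derive a stable, unlinkable-without-the-key session identifier."""
    return hmac.new(MAPPING_KEY, user_id.encode(), hashlib.sha256).hexdigest()[:16]

# The model-facing DB keys health context by pseudonym, not by user_id.
health_db = {pseudonym("user-42"): {"resting_hr": 61, "sleep_hours": 7.2}}
```

Note this only protects the identifier itself; as the comment points out, a rich enough health profile can still re-identify someone on its own.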
_zer0c00l_ 3 days ago
I have at least one fundamental concern about the approach. Say I'm building an anti-fraud system that uses an LLM via API, and I ask whether my user totally+fraud@gmail.com is a potential fraudster. By masking that email address I'm sabotaging my own prompt: the model can no longer reason from the facts that 1) it's a free public email provider and 2) the address literally says 'fraud'.
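One way to make the trade-off concrete (a sketch, not the OP's method): naive masking destroys both signals the commenter names, whereas a feature-preserving mask can hide the identity while keeping the provider and a suspicious-token flag visible to the model.

```python
import re

def naive_mask(email: str) -> str:
    # Loses both the provider and the 'fraud' token.
    return "[EMAIL]"

def feature_preserving_mask(email: str) -> str:
    # Hides the local part but keeps fraud-relevant features.
    local, domain = email.split("@", 1)
    suspicious = bool(re.search(r"fraud|scam|spam", local, re.I))
    return f"[EMAIL domain={domain} suspicious_local={suspicious}]"

print(feature_preserving_mask("totally+fraud@gmail.com"))
# -> [EMAIL domain=gmail.com suspicious_local=True]
```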
dwa3592 2 days ago
Ooh nice. I built something very similar last year.
n00pn00p 3 days ago
[dead]