fhub 9 hours ago

Our solve is to let it work against a local dev database, with its output being a script. That script gets checked into version control (auditable and reviewed), and only then is it run against production. Slower iteration, but worth the tradeoff for us.
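For illustration, here is a minimal sketch of what such a checked-in script could look like. It is not our actual tooling: the `users` table, the email-lowercasing change, and the `migrate` function are all hypothetical, and SQLite stands in for whatever database you really run. The point is that the reviewed artifact is a plain script with a dry-run default, so the LLM never touches production itself.

```python
import sqlite3


def migrate(conn: sqlite3.Connection, dry_run: bool = True) -> int:
    """Lowercase email addresses in a hypothetical `users` table.

    Returns the number of rows the UPDATE touched. With dry_run=True
    the change is rolled back, so reviewers can see the impact first.
    """
    # sqlite3 opens a transaction implicitly before the UPDATE
    # (default isolation settings), so rollback() undoes it.
    cur = conn.execute(
        "UPDATE users SET email = lower(email) WHERE email <> lower(email)"
    )
    changed = cur.rowcount
    if dry_run:
        conn.rollback()
    else:
        conn.commit()
    return changed
```

Run it with `dry_run=True` against a copy first, compare the reported row count with what the reviewer expected, and only then run it for real.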

Giving an LLM even read access to PII is a big "no" in my book.

On PII, if you need LLMs to work on data extracted from production, then https://github.com/microsoft/presidio is a pretty good tool to redact PII. It still needs a bit of an audit, but as a first pass it does a terrific job.

Volundr 8 hours ago

This. Everything your LLM reads from your database, server, or wherever else is sent to your LLM provider. Unless your LLM runs locally on your own systems, it shouldn't be given ANY access to production data without vetting it through legal with an eye to your privacy policy and compliance requirements.

maxkfranz 7 hours ago

The script method is great, and it's generalisable to things outside of DB access.

E.g. I used this method when I wanted to carry out a large (almost every source file) refactoring of Cytoscape.js. I fed the LLM a bunch of examples, and I told it to write a script to carry out the refactoring (largely using regex). I reviewed the script, ran the script, and then the code base was refactored.
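Roughly, the kind of script I mean looks like the sketch below. To be clear, this is not the script that was used on Cytoscape.js: the two rewrite rules, the function names, and the `*.js` glob are made-up placeholders, and Python is just used here for illustration. The value is that the rules are short enough to review line by line before anything is rewritten.

```python
import re
from pathlib import Path

# Hypothetical rewrite rules; a real refactor would have its own list.
RULES = [
    (re.compile(r"\butil\.extend\("), "Object.assign("),
    (re.compile(r"\bvar\s+(\w+)\s*=\s*require\("), r"const \1 = require("),
]


def refactor_source(source: str) -> str:
    """Apply every rule, in order, to one file's contents."""
    for pattern, replacement in RULES:
        source = pattern.sub(replacement, source)
    return source


def refactor_tree(root: Path, glob: str = "*.js") -> list[Path]:
    """Rewrite matching files under root; return the paths that changed."""
    changed = []
    for path in sorted(root.rglob(glob)):
        original = path.read_text(encoding="utf-8")
        updated = refactor_source(original)
        if updated != original:
            path.write_text(updated, encoding="utf-8")
            changed.append(path)
    return changed
```

Since the working tree is under version control anyway, the diff after running `refactor_tree` serves as a second review pass on top of reviewing the script itself.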

At the time, agents were not capable of doing large-scale refactors directly, as far as I was aware. And the script was probably much faster anyway.

hephaes7us 7 hours ago

Agreed - I run an entire second dev environment for LLMs.

Claude Code runs in a container, and I just connect that container to the right network.

It's nice to be able to keep mid-task state in that environment without stepping on my own toes. It's also easy to control what data is accessible in there, even if I have to work with real data in my own dev environment.