| ▲ | willwade 2 days ago | |
I wonder if this would have been useful: https://github.com/microsoft/presidio - it's heavy but looks really good. There is a lite version. | ||
| ▲ | shaoz 2 days ago | |
I've used it. Lots of false positives out of the box; you need to do a ton of tuning or pair it with a transformer/BERT model, but at that point it's basically the same thing as the OP's project. | ||
| ▲ | threecheese 2 days ago | |
Looks like it uses Google's LangExtract, which uses only LLMs for NLP, while OP is using a small NER model that runs locally. | ||
| ▲ | winchester6788 2 days ago | |
Full of false positives, though. But definitely good for some types of entities and regexes. | ||