Tsarp 3 days ago

I've worked on some enterprise NER systems (specifically privacy/redaction), and in almost all cases the cost of missing an entity that should have been masked was significantly higher than the cost of added latency (of course, in an ideal world you'd have both accuracy and speed).

And in all the research we did, the best solutions ended up passing text through a workflow of (1) NN-based NER, (2) regex, and (3) dictionary lookups to really clean the information. Using a single method worked well in customer demos but always broke down in prod on what we had thought were edge cases.
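
To make the layering concrete, here is a rough Python sketch of that kind of workflow. It is not what we actually ran: `model_ner` is a hypothetical stub standing in for whatever NN-based NER model you use, and the regex patterns and the `KNOWN_NAMES` dictionary are made-up placeholders. The only point is that each detector contributes spans independently and the union gets masked, so a miss by one layer can still be caught by another.

```python
import re
from typing import Callable, List, Sequence, Tuple

Span = Tuple[int, int, str]  # (start, end, label)


def model_ner(text: str) -> List[Span]:
    """Hypothetical stub for an NN-based NER model (assumption, finds nothing)."""
    return []


# Illustrative patterns only; real deployments need far more careful regexes.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")
PHONE_RE = re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b")


def regex_ner(text: str) -> List[Span]:
    """Catch well-structured identifiers that a model may miss."""
    spans = []
    for label, pattern in (("EMAIL", EMAIL_RE), ("PHONE", PHONE_RE)):
        for m in pattern.finditer(text):
            spans.append((m.start(), m.end(), label))
    return spans


# Placeholder dictionary, e.g. names pulled from a customer database.
KNOWN_NAMES = ("alice johnson", "bob smith")


def dictionary_ner(text: str) -> List[Span]:
    """Case-insensitive whole-word lookup of known sensitive terms."""
    spans = []
    for name in KNOWN_NAMES:
        pattern = r"\b" + re.escape(name) + r"\b"
        for m in re.finditer(pattern, text, re.IGNORECASE):
            spans.append((m.start(), m.end(), "NAME"))
    return spans


def merge_spans(spans: Sequence[Span]) -> List[Span]:
    """Merge overlapping spans, keeping the label of the earliest one."""
    merged: List[Span] = []
    for start, end, label in sorted(spans):
        if merged and start < merged[-1][1]:
            prev_start, prev_end, prev_label = merged[-1]
            merged[-1] = (prev_start, max(prev_end, end), prev_label)
        else:
            merged.append((start, end, label))
    return merged


def redact(
    text: str,
    detectors: Sequence[Callable[[str], List[Span]]] = (
        model_ner,
        regex_ner,
        dictionary_ner,
    ),
) -> str:
    """Run every detector, take the union of spans, and mask them."""
    spans: List[Span] = []
    for detect in detectors:
        spans.extend(detect(text))
    out = []
    cursor = 0
    for start, end, label in merge_spans(spans):
        out.append(text[cursor:start])
        out.append(f"[{label}]")
        cursor = end
    out.append(text[cursor:])
    return "".join(out)


print(redact("Contact Alice Johnson at alice@example.com or 555-123-4567."))
```

Taking the union of all detectors deliberately favors recall over precision, which matches the redaction setting where a missed mask costs more than an extra one.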

That being said, the latency argument makes sense. This might work great in conversational use cases: picking out intent and responding. Every millisecond helps in making things sound natural.