anuramat 5 days ago

That's how BERT is trained: masked language modeling.
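
For anyone unfamiliar, here is a rough sketch of the masking step as described in the BERT paper: about 15% of positions are selected for prediction, and of those roughly 80% become `[MASK]`, 10% become a random token, and 10% are left alone. This is plain Python with no real tokenizer or model; `mask_tokens` and the toy vocabulary are made up for illustration and are not BERT's actual code.

```python
import random

MASK = "[MASK]"

def mask_tokens(tokens, vocab, mask_prob=0.15, rng=None):
    """Corrupt a token list for masked language modeling.

    Returns (corrupted, labels). labels[i] holds the original token at
    positions selected for prediction and None everywhere else.
    Hypothetical helper, for illustration only.
    """
    rng = rng or random.Random()
    corrupted = list(tokens)
    labels = [None] * len(tokens)
    for i, tok in enumerate(tokens):
        if rng.random() >= mask_prob:
            continue  # position not selected, so no loss is computed here
        labels[i] = tok
        roll = rng.random()
        if roll < 0.8:
            corrupted[i] = MASK  # most selected positions get the mask token
        elif roll < 0.9:
            corrupted[i] = rng.choice(vocab)  # some get a random token
        # otherwise the original token is kept, but still predicted
    return corrupted, labels

tokens = "the cat sat on the mat".split()
vocab = sorted(set(tokens))  # toy vocabulary, an assumption for the demo
corrupted, labels = mask_tokens(tokens, vocab, rng=random.Random(0))
print(corrupted)
print(labels)
```

The model only ever sees `corrupted` and is trained to recover the non-None entries of `labels`, which is why the same setup can be reused at inference time to fill in a blank.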

dsign 5 days ago | parent [-]

I've used BERT to do that sort of thing. It was a prototype in PyTorch, and I'm not an expert on PyTorch performance. I also tried some of the models that succeeded BERT for masked-token prediction. My first issue with it was that it was slow :-( . My second issue was that it wasn't integrated into my favorite word editor. But definitely useful.

anuramat 3 days ago | parent [-]

Did you try any diffusion models? They should be quick enough.