jbarrow 4 hours ago
Training ML models for PDF forms. You can try out what I’ve got so far with this service that automatically detects where fields should go and makes PDFs fillable: https://detect.semanticdocs.org/

Code and models are at: https://github.com/jbarrow/commonforms

That’s built on a dataset and paper I wrote called CommonForms, where I scraped CommonCrawl for hundreds of thousands of fillable form pages and used that as a training set: https://arxiv.org/abs/2509.16506

Next step is training and releasing some DETRs, which I think will drive quality even higher. But the ultimate end goal is working on automatic form accessibility.