Remix.run Logo
nik736 4 hours ago

I have put together an internal benchmark on 1000s of business documents with weird tables, structure, etc. that I run on every relevant model release. Opus 4.8 performs very very well. But it is obviously overkill for the task (and expensive at doing so). I just wanted to respond to the OP.

Insanity 4 hours ago | parent | next [-]

I'm assuming that the reason I didn't have good success rate is because it was not scanned documents, but photographs, and lighting conditions weren't always ideal. I think scanned business documents are a happy-case scenario in a way. (obv, you seem to run it against some complex documents, so that's impressive)

apawloski 2 hours ago | parent | prev [-]

I’m curious what your findings are for the best model for your use case