r3gal08 21 hours ago

How are you handling the data extraction? Is it a multimodal VLM (OCR+LLM) or a standard OCR engine feeding a separate LLM? I’ve been hitting a wall trying to understand how this is viable. The compute overhead for real-time analysis at scale seems massive without a serious backend. How are you managing the frequency?