Remix clone Hacker News

	r3gal08 21 hours ago
		How are you handling the data extraction? Is it a multimodal VLM (OCR+LLM) or a standard OCR engine feeding a separate LLM? I’ve been hitting a wall trying to understand how this is viable. The compute overhead for real-time analysis at scale seems massive without a serious backend. How are you managing the frequency?