vunderba a day ago
There are actually a few capable VL models out there that can run on even modest hardware. If you want to keep things simple and process everything locally, I'd recommend something like Qwen3 VL [1]. It's not the fastest model, but you can just let it chew through the docs over a weekend. In my experience it takes about 15 to 30 seconds per image, but the quality of the results is quite good, if a bit verbose [2].
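For reference, here's roughly the kind of loop I mean: a sketch that walks a folder of scanned docs and sends each image to a locally served model via the `ollama` Python client. The model tag and prompt are my own placeholders, not anything official, so adjust to whatever VL model you've actually pulled.

```python
# Sketch: batch-transcribe a folder of document images with a local
# vision-language model. Assumes the `ollama` Python client and a
# locally pulled VL model (the model tag below is a placeholder).
from pathlib import Path

IMAGE_EXTS = {".png", ".jpg", ".jpeg", ".tiff", ".webp"}

def find_images(root: str) -> list[Path]:
    """Collect image files under `root`, sorted for a stable order."""
    return sorted(p for p in Path(root).rglob("*")
                  if p.suffix.lower() in IMAGE_EXTS)

def transcribe(path: Path, model: str = "qwen-vl") -> str:
    # Each call takes on the order of 15-30 s on modest hardware,
    # so a big archive is best left running unattended.
    import ollama  # local inference server; nothing leaves the machine
    resp = ollama.chat(
        model=model,
        messages=[{
            "role": "user",
            "content": "Transcribe all text in this document image.",
            "images": [str(path)],
        }],
    )
    return resp["message"]["content"]

if __name__ == "__main__":
    for img in find_images("./docs"):
        img.with_suffix(".txt").write_text(transcribe(img))
        print(f"done: {img.name}")
```

At 15-30 s per image, a weekend run (~48 h) works out to somewhere between ~5,700 and ~11,500 images, which is why leaving it unattended is the practical approach.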