Remix clone Hacker News

new | show | ask | jobs Github

	▲	sidmo 8 months ago
		If you are looking for the latest/greatest in file processing i'd recommend checking out vision language models. They generate embeddings of the images themselves (as a collection of patches) and you can see query matching displayed as a heatmap over the document. Picks up text that OCR misses. My company DataFog has an open-source demo if you want to try it out: https://github.com/DataFog/vlm-api If you're looking for an all-in-one solution, little plug for our new platform that does the above and also allows you to create custom 'patterns' that get picked up via semantic search. Uses open-source models by default, can deploy into your internal network. www.datafog.ai. In beta now and onboarding manually. Shoot me an email if you'd like to learn more!