Remix.run Logo
inexcf 7 days ago

Got excited about an open-source tool doing this.

Alas, i am let down. It is an open-source tool creating the prompt for the OpenAI API and i can't go and send customer data to them.

I'm aware of https://github.com/clovaai/donut so i hoped this would be more like that.

_joel 7 days ago | parent | next [-]

You can self host OpenAPI compatible models with lmstudio and the like. I've used it with https://anythingllm.com/

Tammilore 7 days ago | parent | prev | next [-]

Hi. I totally get the concern about sending data to OpenAI. Right now, Documind uses OpenAI's API just so people could quickly get started and see what it is like, but I’m open to adding options and contributions that would be better for privacy.

inexcf 6 days ago | parent [-]

That sounds great.

sidmo 5 days ago | parent | prev | next [-]

I'd recommend checking out vision language models. They generate embeddings of the images themselves (as a collection of patches) and you can see query matching displayed as a heatmap over the document. Picks up text that OCR misses. I built a simple API over it if you want to try it out: https://github.com/DataFog/vlm-api

turblety 7 days ago | parent | prev | next [-]

You might be able to use Ollama, which has a OpenAI compatible API.

Zambyte 7 days ago | parent [-]

Not without chaning the code (should be easy though)

https://github.com/DocumindHQ/documind/blob/d91121739df03867...

7 days ago | parent | prev [-]
[deleted]