Remix.run Logo
bobosha 2 days ago

I’m working on a new vision-language model architecture called Onida. Our aim is to match—or surpass—the performance of leading VLMs like LLavA and CogVLM, while operating at a fraction of the cost. Unlike most existing VLMs, which layer vision components onto a language model as an afterthought, Onida is designed from first principles with a truly integrated approach.

This document [1] outlines our key differentiators, and we’re now inviting beta participants to explore and test the technology.

[1] https://healthio.notion.site/Onida-Efficient-VLM-Architectur...