moralestapia 4 days ago

Wow, just amazing.

Is this model open? Open weights at least? Can you use it commercially?

SweetSoftPillow 4 days ago | parent | next [-]

This is Google's Gemini 2.5 Flash model with native image output capability. It's fast, relatively cheap, SOTA-quality, and available via API. I think it will take some time to get this kind of quality in open source models, probably first from Chinese models and then from Black Forest Labs or Google's open source (Gemma) team.

vunderba 4 days ago | parent | prev | next [-]

Outside of Google DeepMind open sourcing the code and weights of AlphaFold, I don't think they've released any of their GenAI stuff (Imagen, Gemini, Flash 2.5, etc.).

The best multimodal models that you can run locally right now are probably Qwen-Edit 20b, and Kontext.Dev.

https://qwenlm.github.io/blog/qwen-image-edit

https://bfl.ai/blog/flux-1-kontext-dev

SweetSoftPillow 4 days ago | parent [-]

Google also open sources Gemma LLMs and embedding models, which are quite good at the time of release (SOTA or near-SOTA in the open source field).

vunderba 4 days ago | parent [-]

Oh, very nice, I wasn't aware of that [1] [2]. Adding the links as well.

[1] https://deepmind.google/models/gemma

[2] https://huggingface.co/google/gemma-7b

minimaxir 4 days ago | parent | prev [-]

Flux Kontext has similar quality, is open weight, and the outputs can be used commercially; however, prompt adherence is good-but-not-as-good.