moralestapia 4 days ago:

Wow, just amazing. Is this model open? Open weights at least? Can you use it commercially?
SweetSoftPillow 4 days ago:

This is Google's Gemini 2.5 Flash model with native image output capability. It's fast, relatively cheap, SOTA quality, and available via API. I think getting this kind of quality in open-source models will take some time, probably first from Chinese models and then from Black Forest Labs or Google's open-source (Gemma) team.
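For reference, a minimal sketch of calling it through the Gemini API with the google-genai Python SDK. The model id ("gemini-2.5-flash-image-preview") and the GEMINI_API_KEY setup are my assumptions, not something stated in the thread:

    from google import genai  # pip install google-genai

    # The client picks up GEMINI_API_KEY from the environment.
    client = genai.Client()

    response = client.models.generate_content(
        model="gemini-2.5-flash-image-preview",  # assumed id for the image-output variant
        contents="A watercolor painting of a lighthouse at dusk",
    )

    # The response can mix text and image parts; save any returned image bytes.
    for i, part in enumerate(response.candidates[0].content.parts):
        if part.inline_data is not None:
            with open(f"out_{i}.png", "wb") as f:
                f.write(part.inline_data.data)
        elif part.text is not None:
            print(part.text)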
vunderba 4 days ago:

Outside of Google DeepMind open-sourcing the code and weights of AlphaFold, I don't think they've released any of their GenAI stuff (Imagen, Gemini, Flash 2.5, etc.). The best multimodal models you can run locally right now are probably Qwen-Edit 20B and Kontext.dev.
minimaxir 4 days ago:

Flux Kontext has similar quality, is open weight, and the outputs can be used commercially; however, prompt adherence is good but not as good.
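A rough local-inference sketch for FLUX.1 Kontext [dev] via Hugging Face diffusers. The pipeline class and checkpoint id are my assumptions about the current diffusers release; the weights are gated on Hugging Face and the dev license has its own terms:

    import torch
    from diffusers import FluxKontextPipeline
    from diffusers.utils import load_image

    # Assumed checkpoint id; the full model wants a high-VRAM GPU in bf16
    # unless you enable CPU offloading or quantization.
    pipe = FluxKontextPipeline.from_pretrained(
        "black-forest-labs/FLUX.1-Kontext-dev", torch_dtype=torch.bfloat16
    )
    pipe.to("cuda")

    source = load_image("input.png")
    edited = pipe(
        image=source,
        prompt="Replace the background with a rainy city street, keep the subject unchanged",
        guidance_scale=2.5,
    ).images[0]
    edited.save("edited.png")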