moralestapia 4 days ago:

Wow, just amazing. Is this model open? Open weights at least? Can you use it commercially?
SweetSoftPillow 4 days ago:

This is Google's Gemini 2.5 Flash model with native image output capability. It's fast, relatively cheap, SOTA quality, and available via API. I think getting this kind of quality in open-source models will take some time, probably first from Chinese models and then from Black Forest Labs or Google's open-source (Gemma) team.
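For reference, a minimal sketch of calling it through the Gemini API with the google-genai Python SDK. The model id ("gemini-2.5-flash-image-preview") and the GEMINI_API_KEY setup are my assumptions, not something stated in the thread:

    from google import genai  # pip install google-genai

    # The client picks up GEMINI_API_KEY from the environment.
    client = genai.Client()

    response = client.models.generate_content(
        model="gemini-2.5-flash-image-preview",  # assumed id for the image-output variant
        contents="A watercolor painting of a lighthouse at dusk",
    )

    # The response can mix text and image parts; save any returned image bytes.
    for i, part in enumerate(response.candidates[0].content.parts):
        if part.inline_data is not None:
            with open(f"out_{i}.png", "wb") as f:
                f.write(part.inline_data.data)
        elif part.text is not None:
            print(part.text)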
vunderba 4 days ago:

Outside of Google DeepMind open-sourcing the code and weights of AlphaFold, I don't think they've released any of their GenAI stuff (Imagen, Gemini, Flash 2.5, etc.). The best multimodal models you can run locally right now are probably Qwen-Edit 20B and Kontext.dev.
minimaxir 4 days ago:

Flux Kontext has similar quality, is open weight, and the outputs can be used commercially; however, prompt adherence is good but not as good.
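A rough local-inference sketch for FLUX.1 Kontext [dev] via Hugging Face diffusers. The pipeline class and checkpoint id are my assumptions about the current diffusers release; the weights are gated on Hugging Face and the dev license has its own terms:

    import torch
    from diffusers import FluxKontextPipeline
    from diffusers.utils import load_image

    # Assumed checkpoint id; the full model wants a high-VRAM GPU in bf16
    # unless you enable CPU offloading or quantization.
    pipe = FluxKontextPipeline.from_pretrained(
        "black-forest-labs/FLUX.1-Kontext-dev", torch_dtype=torch.bfloat16
    )
    pipe.to("cuda")

    source = load_image("input.png")
    edited = pipe(
        image=source,
        prompt="Replace the background with a rainy city street, keep the subject unchanged",
        guidance_scale=2.5,
    ).images[0]
    edited.save("edited.png")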