namibj 4 days ago

After looking at Cases 4, 9, 23, 33, and 61, I think it might be suited to take in several wide-angle pictures or photospheres or such from inside a residence, and output a corresponding floor plan schematic.

If anyone has examples, guides, or anything else that would save me from pouring unnecessary funds into API credits just to figure out how to feed it for this kind of task, I'd really appreciate you sharing them.

vunderba 4 days ago | parent [-]

I can't provide a definitive answer for this, but I will say that Google's SDK docs state that a single edit request is limited to a maximum of THREE images. So depending on how many you have, you might have to use the "Kontext Kludge", aka stitching many input images together into a single JPEG.

https://cloud.google.com/vertex-ai/generative-ai/docs/models...
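
For what it's worth, here is a rough sketch of the stitching step, not anything taken from Google's docs. It assumes Pillow is installed, and the names `grid_layout` and `stitch` as well as the 1024x768 cell size are made up for illustration. It tiles the photos into a near-square grid and writes one JPEG; whether the model reads such a grid well is untested.

```python
import math


def grid_layout(n_images, cell_w, cell_h, cols=None):
    """Return (canvas_size, offsets) for tiling n_images equal-sized cells.

    Hypothetical helper: picks a near-square grid unless cols is given.
    """
    if n_images < 1:
        raise ValueError("need at least one image")
    if cols is None:
        cols = math.ceil(math.sqrt(n_images))
    rows = math.ceil(n_images / cols)
    # Fill the grid row by row, left to right.
    offsets = [((i % cols) * cell_w, (i // cols) * cell_h)
               for i in range(n_images)]
    return (cols * cell_w, rows * cell_h), offsets


def stitch(paths, out_path, cell=(1024, 768)):
    """Paste every image in paths into one grid and save it as a JPEG."""
    from PIL import Image  # Pillow, third-party; assumed to be installed

    canvas_size, offsets = grid_layout(len(paths), *cell)
    canvas = Image.new("RGB", canvas_size, "white")
    for path, (x, y) in zip(paths, offsets):
        with Image.open(path) as img:
            tile = img.convert("RGB")
            tile.thumbnail(cell)  # shrinks in place, keeps aspect ratio
            canvas.paste(tile, (x, y))
    canvas.save(out_path, "JPEG", quality=90)
```

Something like `stitch(["kitchen.jpg", "hall.jpg", "bedroom.jpg", "bath.jpg"], "rooms.jpg")` (file names are placeholders) would then give you one input image instead of four, leaving room under the three-image limit.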

namibj 2 days ago | parent | next [-]

I can report that it doesn't really work well, or that it requires prompting beyond my skills.

If someone has an example that works, I'd love to see it.

namibj 2 days ago | parent | prev [-]

Thanks a lot! I hadn't realized before that the `-image-preview` variant of Gemini-2.5-flash had such an annoying limitation.