evrenesat 11 hours ago

I've tried to repaint the exterior of my house. More than 20 times with very detailed prompts. I even tried to optimize it with Claude. No matter what, every time it added one, two or three extra windows to the same wall.

cj 11 hours ago | parent | next [-]

I tried this in AI studio just now with nano banana.

Results: https://imgur.com/a/9II0Aip

The white house was the original (random photo from Google). The prompt was "What paint color would look nice? Paint the house."

swatcoder 11 hours ago | parent | next [-]

> (random photo from Google)

Careful with that kind of thing.

Here, it mostly poisons your test, because that exact photo probably exists in the underlying training data and the trained network will be more or less optimized for working with it. It's really the same consideration you'd have made when testing classifiers or other ML tech 10 years ago.

Most people taking on a task like this will be using an original photo -- missing entirely from any training data, poorly framed, unevenly lit, etc. -- and you need to be careful to capture as much of that as possible when trying to evaluate how a model will work in that kind of use case.

The failure and stress points for AI tools are generally kind of alien and unfamiliar because the way they operate is totally different from the way a human operates, and if you're not especially attentive to their weird failure shapes and biases when you test them, you'll easily get false positives (and false negatives) that lead you to misleading conclusions.

cj 11 hours ago | parent [-]

Yeah, the base image was the first Google image result for the search term "house". So it's definitely in the training set.

ceejayoz 9 hours ago | parent | prev | next [-]

> The prompt was "What paint color would look nice? Paint the house."

At some point, this is probably gonna result in you coming home to a painted house and a big bill, lol.

vunderba 11 hours ago | parent | prev [-]

Guess they ran out of paint - notice the upper window.

cj 11 hours ago | parent [-]

Oops. Original link wasn't using the Pro version. Edited the comment with an updated link.

fumeux_fume 11 hours ago | parent | prev | next [-]

I also tried that in the past with poor results. I just tried it this morning with nano banana pro and it nailed it with a very short prompt: "Repaint the house white with black trim. Do not paint over brick."

Workaccount2 10 hours ago | parent | prev | next [-]

I don't know what it is with Gemini (and even other models) but I swear they must be doing some kind of active load-dependent quantization or a/b/c/d testing behind the scenes, because sometimes the model is stellar and hitting everything, and other times it's tripping all over itself.

The most effective fix I have found is that when the model is acting dumb, just turn it off, come back in a few hours, and try again in a new chat.

jamil7 10 hours ago | parent [-]

Yeah I think they all shed under heavy load as part of some scaling strategy.

grantpitt 11 hours ago | parent | prev | next [-]

Huh, can you share a link? I tried here: https://gemini.google.com/share/e753745dfc5d

evrenesat 10 hours ago | parent [-]

https://gemini.google.com/share/79fe1a38e440

gandreani 10 hours ago | parent | next [-]

Maybe somewhere in the original comment it would have been fair to mention you can barely see the house in the original photo. This is actually a hilarious complaint

Jaxan 10 hours ago | parent | next [-]

Maybe. But this is not an edge case. I consider this genuine use of the marketed tool.

evrenesat 10 hours ago | parent | prev [-]

That cannot be a valid excuse. Other than adding extra windows to the clearly visible wall, it's obvious that the model is perfectly capable of "seeing" the house. It just cannot "believe" that there can be a big empty wall on a garden house.

WesleyJohnson 5 hours ago | parent | prev [-]

https://gemini.google.com/share/3b4d2cd55778

Nemi 6 hours ago | parent | prev | next [-]

I have this problem when I select Pro, but if I use 2.5 Flash it does a great job at these things. I am not sure why Pro does not work as well.

9 hours ago | parent | prev | next [-]
[deleted]
dyauspitr 7 hours ago | parent | prev [-]

Nano Banana Pro is a ChatGPT 3.5-to-4 tier leap.