| ▲ | aziis98 3 hours ago | |
> Pointing capability: Gemini 3 has the ability to point at specific locations in images by outputting pixel-precise coordinates. Sequences of 2D points can be strung together to perform complex tasks, such as estimating human poses or reflecting trajectories over time.
Does anybody know how to correctly prompt the model for these tasks or, even better, can point to some docs? The pictures with the pretty markers are appreciated, but that section is a bit vague and has no references. | ||
| ▲ | themanmaran 26 minutes ago | parent | next [-] | |
Simon Willison has some good blog posts on this: https://simonwillison.net/2024/Aug/26/gemini-bounding-box-vi... | ||
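For the post-processing side, here is a minimal sketch, assuming the convention described for earlier Gemini models: bounding boxes returned as `[ymin, xmin, ymax, xmax]` and points as `[y, x]`, both normalized to a 0-1000 range. Whether Gemini 3 pointing uses the same convention is an assumption to check against the official docs; the helper names `box_to_pixels` and `point_to_pixels` are made up for illustration.

```python
def box_to_pixels(box, width, height, scale=1000):
    """Convert a normalized [ymin, xmin, ymax, xmax] box to pixel (left, top, right, bottom).

    Assumes coordinates are normalized to 0..scale (0-1000 is an assumption here).
    """
    ymin, xmin, ymax, xmax = box
    left = round(xmin * width / scale)
    top = round(ymin * height / scale)
    right = round(xmax * width / scale)
    bottom = round(ymax * height / scale)
    return left, top, right, bottom


def point_to_pixels(point, width, height, scale=1000):
    """Convert a normalized [y, x] point to pixel (x, y).

    Note the y-first order in the input, which is easy to get backwards.
    """
    y, x = point
    return round(x * width / scale), round(y * height / scale)
```

The main thing to watch is the y-first ordering and the rescaling to the actual image size; drawing the results onto the image is the quickest way to confirm the convention for a given model.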
| ▲ | atonse 3 hours ago | parent | prev | next [-] | |
For my CMS I’d love to get an AI to nicely frame a picture in certain aspect ratios. Like if I provide an image, give me crop coordinates for widescreen, square, portrait, and 4:3, using a photographer's eye. Is there any model that can do that? I tried looking on Hugging Face but didn’t quite see anything. | ||
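One way to split the problem: have a model pick only the point of interest, then compute the crops deterministically. A minimal sketch, assuming a focal point `(focus_x, focus_y)` in pixels is already available from some model or a manual pick; the function name `crop_box` is hypothetical, and this only handles the geometry, not the "photographer's eye" part.

```python
def crop_box(width, height, aspect_w, aspect_h, focus_x, focus_y):
    """Return (left, top, right, bottom) for the largest crop with the
    given aspect ratio, centered on the focal point as far as the image
    bounds allow."""
    # Compare ratios with integer cross-multiplication to avoid float error.
    if width * aspect_h > height * aspect_w:
        # Image is wider than the target: keep full height, trim width.
        crop_h = height
        crop_w = round(height * aspect_w / aspect_h)
    else:
        # Image is taller than (or equal to) the target: keep full width.
        crop_w = width
        crop_h = round(width * aspect_h / aspect_w)

    # Center on the focal point, then clamp so the crop stays inside the image.
    left = min(max(focus_x - crop_w // 2, 0), width - crop_w)
    top = min(max(focus_y - crop_h // 2, 0), height - crop_h)
    return left, top, left + crop_w, top + crop_h


# Hypothetical usage: a 4000x3000 photo with the subject near the right edge.
ratios = {"widescreen": (16, 9), "square": (1, 1), "portrait": (3, 4), "4:3": (4, 3)}
crops = {name: crop_box(4000, 3000, w, h, 3500, 1500) for name, (w, h) in ratios.items()}
```

Using the largest crop that fits keeps as much of the picture as possible; a tighter framing would need a subject bounding box rather than a single point.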