| ▲ | simonw 4 hours ago |
| Pretty great pelican: https://simonwillison.net/2026/Feb/19/gemini-31-pro/ - it took over 5 minutes, though I think that's because they're having performance teething problems on launch day. |
|
| ▲ | embedding-shape 4 hours ago | parent | next [-] |
It's an excellent demonstration of the main issue I have with the Gemini family of models: they always go "above and beyond" and do a lot of extra stuff, even if I explicitly prompt against it. In this case, most of the SVG ends up consisting not of the bike and the pelican, but of clouds, a sun, a hat on the pelican, and so much more. Exactly the same thing happens when you code: it's almost impossible to get Gemini to not do "helpful" drive-by refactors, and it keeps adding code comments no matter what I say. Very frustrating experience overall. |
| |
| ▲ | mullingitover 4 hours ago | parent | next [-] | | > it's almost impossible to get Gemini to not do "helpful" drive-by-refactors Just asking "Explain what this service does?" turns into [No response for three minutes...] +729 -522 | | |
| ▲ | cowmoo728 3 hours ago | parent | next [-] | | it's also so aggressive about taking out debug log statements and in-progress code. I'll ask it to fill in a new function somewhere else and it will remove all of the half written code from the piece I'm currently working on. | | |
| ▲ | chankstein38 3 hours ago | parent [-] | | I ended up adding a "NEVER REMOVE LOGGING OR DEBUGGING INFO, OPT TO ADD MORE OF IT" to my user instructions, and that has _somewhat_ fixed the problem but introduced a new one where, no matter what I'm talking to it about, it tries to add logging. Even if it's not a code problem. I've had it explain that I could set up an ESP32 with a sensor so that I could get logging from it, then write me firmware for it. | | |
| ▲ | sd9 3 hours ago | parent | next [-] | | If it's adding too much logging now, have you tried softening the instruction about adding more? "NEVER REMOVE LOGGING OR DEBUGGING INFO. If unsure, bias towards introducing sensible logging." Or just "NEVER REMOVE LOGGING OR DEBUGGING INFO." | |
| ▲ | bratwurst3000 3 hours ago | parent | prev [-] | | "I've had it explain that I could setup an ESP32 with a sensor so that I could get logging from it then write me firmware for it."
lol
did you try it? This is so far from everything rational | | |
|
| |
| ▲ | BartShoot 3 hours ago | parent | prev | next [-] | | if you had to ask, it obviously needs to refactor the code for clarity so the next person does not need to ask | |
| ▲ | quotemstr 3 hours ago | parent | prev | next [-] | | What. You don't have yours ask for edit approval? | | |
| ▲ | embedding-shape 3 hours ago | parent | next [-] | | Who has time for that? This is how I run codex: `codex --sandbox danger-full-access --dangerously-bypass-approvals-and-sandbox --search exec "$PROMPT"`. Having to approve each change would effectively destroy the entire point of using an agent, at least for me. Edit: obviously this runs inside something so it doesn't have access to the rest of my system, but has enough access to be useful. | | |
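A minimal sketch of what that "inside something" can look like, assuming Docker and a hypothetical image with the codex CLI preinstalled (the image name and mount layout are illustrative, not the commenter's actual setup):

    # Hypothetical wrapper: full-access codex, scoped to a single project.
    # Only the current directory is mounted, so the "danger" flags
    # can't touch the rest of the filesystem.
    docker run --rm -it \
        --volume "$PWD:/workspace" \
        --workdir /workspace \
        codex-sandbox-image \
        codex --sandbox danger-full-access \
            --dangerously-bypass-approvals-and-sandbox \
            --search exec "$PROMPT"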
| ▲ | well_ackshually 35 minutes ago | parent | next [-] | | >Who has time for that? People that don't put out slop, mostly. | |
| ▲ | quotemstr 2 hours ago | parent | prev [-] | | I wouldn't even think of letting an agent work in that mode. Even the best of them produce garbage code unless I keep them on a tight leash. And no, not a skill issue. What I don't have time to do is debug obvious slop. | | |
| ▲ | kees99 2 hours ago | parent [-] | | I ended up running codex with all the "danger" flags, but in a throw-away VM with copy-on-write access to code folders. The built-in approval thing sounds like a good idea, but in practice it's unusable. A typical session for me went like: About to run "sed -n '1,100p' example.cpp", approve?
About to run "sed -n '100,200p' example.cpp", approve?
About to run "sed -n '200,300p' example.cpp", approve?
Could very well be a skill issue, but that was mighty annoying, and with no obvious fix (options "don't ask again for ...." were not helping). |
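One sketch of the copy-on-write part on Linux, assuming overlayfs; the directory names are illustrative, and the commenter's actual VM arrangement may well differ:

    # The real checkout is the read-only lower layer; the agent's writes
    # land in the "upper" directory instead of the original files.
    mkdir -p /tmp/agent/{upper,work,merged}
    sudo mount -t overlay overlay \
        -o lowerdir="$HOME/src/myproject",upperdir=/tmp/agent/upper,workdir=/tmp/agent/work \
        /tmp/agent/merged
    # Point the agent at /tmp/agent/merged; the original folder stays untouched.
    # If the changes look good: rsync -a /tmp/agent/upper/ "$HOME/src/myproject/"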
|
| |
| ▲ | mullingitover an hour ago | parent | prev [-] | | Ask mode exists; I think the models work on the assumption that if you're allowing edits, then of course you must want edits. |
| |
| ▲ | kylec 4 hours ago | parent | prev | next [-] | | "I don't know what did it, but here's what it does now" | |
| ▲ | SignalStackDev 3 hours ago | parent | prev [-] | | [dead] |
| |
| ▲ | h14h 2 hours ago | parent | prev | next [-] | | Would be really interesting to see an "Eager McBeaver" bench around this concept. When doing real work, a model's ability to stay within the bounds of a given task has become almost more important than its raw capabilities, now that every frontier model is so dang good. Every one of these models is so great at propelling the ship forward that I increasingly care more and more about which models are the easiest to steer in the direction I actually want to go. | |
| ▲ | cglan 2 hours ago | parent [-] | | being TOO steerable is another issue though. Codex is steerable to a fault, and will gladly "monkey's paw" your requests. Claude Opus will ignore your instructions, do what it thinks is "right", and just barrel forward. Both are bad, and both paper over the actual issue, which is that these models don't really have the ability to selectively choose their behavior per situation (ie ask for followup where needed, ignore users where needed, follow instructions where needed). Behavior is largely global. | |
| ▲ | kees99 2 hours ago | parent [-] | | In my experience, Claude gradually stops being opinionated as the task at hand becomes more arcane. I frequently add "treat the above as a suggestion, and don't hesitate to push back" to change requests, and it seems to help quite a bit. |
|
| |
| ▲ | enobrev 4 hours ago | parent | prev | next [-] | | I have the same issue. Even when I ask it to do code-reviews and very explicitly tell it not to change files, it will occasionally just start "fixing" things. | | |
| ▲ | mikepurvis 3 hours ago | parent [-] | | I find Copilot leans the other way. It'll myopically focus its work in the exact function I point it at, even when it's clear that adding a new helper would be a logical abstraction to share behaviour with the function right beside it. Overall, I think it's probably better that it stay focused, and allow me to prompt it with "hey, go ahead and refactor these two functions" rather than the other way around. At the same time, really the ideal would be to have it proactively ask, or even pitch the refactor as a colleague would, like "based on what I see of this function, it would make most sense to XYZ, do you think that makes sense? <sure go ahead> <no just keep it a minimal change>" Or perhaps even better, simply pursue both changes in parallel and present them as A/B options for the human reviewer to select between. |
| |
| ▲ | neya 3 hours ago | parent | prev | next [-] | | > it's almost impossible to get Gemini to not do "helpful" drive-by-refactors This has not been my experience. I do Elixir primarily, and Gemini has helped build some really cool products and handle massive refactors along the way. It would even pick up security issues and potential optimizations as it went. What HAS constantly been an issue, though, is that the model will randomly not respond at all and some random error will occur, which is embarrassing for a company like Google, with the infrastructure they own. | |
| ▲ | embedding-shape 3 hours ago | parent [-] | | Out of curiosity, do you have any public projects (with public source code) you've made exclusively with Gemini, so one could take a look? I've tried a bunch of times to use Gemini to at least finish something small, but I always end up sufficiently frustrated to abort, as the instruction-following seems so bad. |
| |
| ▲ | msteffen 2 hours ago | parent | prev | next [-] | | > it's almost impossible to get Gemini to not do "helpful" drive-by-refactors Not like human programmers. I would never do this and have never struggled with it in the past, no... | | |
| ▲ | embedding-shape an hour ago | parent [-] | | A fairer comparison would be against other models, which are typically better at instruction following. You say "don't change anything not explicitly mentioned" or "Don't add any new code comments" and they tend to follow that. |
| |
| ▲ | apitman 2 hours ago | parent | prev | next [-] | | This matches my experience using Gemini CLI to code. It would also frequently get stuck in loops. It was so bad compared to Codex that I feel like I must have been doing something fundamentally wrong. | |
| ▲ | tyfon 3 hours ago | parent | prev | next [-] | | I was using gemini antigravity in opencode a few weeks ago before they started banning everyone for that, and I got into the habit of writing "do x, then wait for instructions". That helped quite a bit, but it would still go off on its own from time to time. |
| ▲ | JLCarveth 3 hours ago | parent | prev | next [-] | | Every time I have tried using `gemini-cli` it just thinks endlessly and never actually gives a response. | |
| ▲ | gavinray 4 hours ago | parent | prev | next [-] | | Do you have Personalization Instructions set up for your LLM models? You can make their responses fairly dry/brief. | | |
| ▲ | embedding-shape 4 hours ago | parent | next [-] | | I'm mostly using them via my own harnesses, so I have full control of the system prompts and so on. And no matter what I try, Gemini keeps "helpfully" adding code comments every now and then. With every other model, "- Don't add code comments" tends to be enough, but with Gemini I'm not sure how I could stop the comments from eventually appearing. | | |
| ▲ | WarmWash 4 hours ago | parent | next [-] | | I'm pretty sure it writes comments for itself, not for the user. I always let the models comment as much as they want, because I feel it makes the context more robust, especially when cycling contexts often to keep them fresh. There is a tradeoff though, as comments do consume context. But I tend to pretty liberally discard instances and start with a fresh window. | |
| ▲ | embedding-shape 4 hours ago | parent [-] | | > I'm pretty sure it writes comments for itself, not for the user Yeah, that sounds worse than "trying to be helpful". Read the code instead; why add indirection in that way, just to be able to understand what other models understand without comments? |
| |
| ▲ | 4 hours ago | parent | prev [-] | | [deleted] |
| |
| ▲ | metal_am 4 hours ago | parent | prev [-] | | I'd love to hear some examples! | | |
| ▲ | gavinray 4 hours ago | parent | next [-] | | I use LLM's outside of work primarily for research on academic topics, so mine is: Be a proactive research partner: challenge flawed or unproven ideas with evidence; identify inefficiencies and suggest better alternatives with reasoning; question assumptions to deepen inquiry.
| |
| ▲ | 3 hours ago | parent | prev | next [-] | | [deleted] | |
| ▲ | ai4prezident 3 hours ago | parent | prev [-] | | [dead] |
|
| |
| ▲ | zengineer 4 hours ago | parent | prev [-] | | True: whenever I ask Gemini to help me with a prompt for generating an image of XYZ, it just generates the image. |
|
|
| ▲ | jasonjmcghee 2 hours ago | parent | prev | next [-] |
What's crazy is you've influenced them to spend real effort ensuring their model is good at generating animated SVGs of animals operating vehicles. The most absurd benchmaxxing. https://x.com/jeffdean/status/2024525132266688757?s=46&t=ZjF... |
| |
| ▲ | simonw an hour ago | parent | next [-] | | I like how they also did a frog on a penny-farthing and a giraffe driving a tiny car and an ostrich on roller skates and a turtle kickflipping a skateboard and a dachshund driving a stretch limousine. | | | |
| ▲ | threatofrain 2 hours ago | parent | prev | next [-] | | Animated SVG is huge. People in different professions are worried to different degrees about being replaced by ML, but this one matters enormously for digital art. |
| ▲ | eurekin 2 hours ago | parent | prev | next [-] | | Can't wait until they finally get to real world CAD | | | |
| ▲ | tantalor 2 hours ago | parent | prev | next [-] | | He's svg-mogging | |
| ▲ | gnatolf 2 hours ago | parent | prev | next [-] | | So let's put things we're interested in in the benchmarks. I'm not against pelicans! | | |
| ▲ | ghurtado an hour ago | parent [-] | | I think the reason the pelican example is great is that it's bizarre enough to be unlikely to appear in the training data as one unified picture. If we picked something more common, say a hot dog with toppings, then the training contamination would be much harder to control. | |
| ▲ | rvnx an hour ago | parent [-] | | It's the most common SVG test; it's the equivalent of Will Smith eating spaghetti, so obviously they benchmax toward it |
|
| |
| ▲ | casey2 an hour ago | parent | prev | next [-] | | You don't have to benchmax everything, just the benchmarks in the right social circles | |
| ▲ | UltraSane 2 hours ago | parent | prev [-] | | It is funny to think that Jeff Dean personally worked to optimize the pelican riding a bike benchmark. |
|
|
| ▲ | culi 11 minutes ago | parent | prev | next [-] |
Cost per task has increased 4.2x, but their ARC-AGI-2 score went from 33.6% to 77.1%. Cost per task is still significantly lower than Opus, even Opus 4.5. https://arcprize.org/leaderboard |
|
| ▲ | MrCheeze 3 hours ago | parent | prev | next [-] |
| Does anyone understand why LLMs have gotten so good at this? Their ability to generate accurate SVG shapes seems to greatly outshine what I would expect, given their mediocre spatial understanding in other contexts. |
| |
| ▲ | tedsanders 2 hours ago | parent | next [-] | | A few thoughts: - One thing to be aware of is that LLMs can be much smarter than their ability to articulate that intelligence in words. For example, GPT-3.5 Turbo was beastly at chess (1800 Elo?) when prompted to complete PGN transcripts, but if you asked it questions in chat, its knowledge was abysmal. LLMs don't generalize as well as humans, and sometimes they can have the ability to do tasks without the ability to articulate things that feel essential to the tasks (like answering whether the bicycle is facing left or right). - Secondly, what has made AI labs so bullish on future progress over the past few years is that they see how little work it takes to get their results. Often, if an LLM sucks at something, that's because no one worked on it (not always, of course). If you directly train a skill, you can see giant leaps in ability with fairly small effort. Big leaps in SVG creation could be coming from relatively small targeted efforts, where none existed before. | |
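For anyone unfamiliar, "completing PGN transcripts" means handing the model a chess game in standard notation, cut off mid-game, and letting it predict what comes next. An illustrative made-up fragment, not the actual eval setup:

    [Event "Example game"]
    [Result "*"]

    1. e4 e5 2. Nf3 Nc6 3. Bb5 a6 4. Ba4 Nf6 5. O-O
    ; the model plays Black simply by predicting the next tokens, e.g. "Be7"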
| ▲ | emp17344 33 minutes ago | parent [-] | | We’re literally at the point where trillions of dollars have been invested in these things and the surrounding harnesses and architecture, and they still can’t do economically useful work on their own. You’re way too bullish here. | | |
| ▲ | dbeardsl 10 minutes ago | parent [-] | | Neither could cars until very recently. A tool doesn't have to be unsupervised to be useful. |
|
| |
| ▲ | simonw 3 hours ago | parent | prev | next [-] | | My best guess is that the labs put a lot of work into HTML and CSS spatial stuff because web frontend is such an important application of the models, and those improvements leaked through to SVG as well. | |
| ▲ | mitkebes an hour ago | parent | prev | next [-] | | All models have improved, but from my understanding, Gemini is the main one that was specifically trained on photos/video/etc in addition to text. Other models like earlier chatgpt builds would use plugins to handle anything beyond text, such as using a plugin to convert an image into text so that chatgpt could "see" it. Gemini was multimodal from the start, and is naturally better at doing tasks that involve pictures/videos/3d spatial logic/etc. The newer chatgpt models are also now multimodal, which has probably helped with their svg art as well, but I think Gemini still has an edge here | |
| ▲ | pknerd 3 hours ago | parent | prev | next [-] | | > Does anyone understand why LLMs have gotten so good at this? Added more IF/THEN/ELSE conditions. | | | |
| ▲ | 2 hours ago | parent | prev [-] | | [deleted] |
|
|
| ▲ | sam_1421 4 hours ago | parent | prev | next [-] |
| Models are soon going to start benchmaxxing generating SVGs of pelicans on bikes |
| |
| ▲ | cbsks 3 hours ago | parent | next [-] | | That’s Simon’s goal. “All I’ve ever wanted from life is a genuinely great SVG vector illustration of a pelican riding a bicycle. My dastardly multi-year plan is to trick multiple AI labs into investing vast resources to cheat at my benchmark until I get one.” https://simonwillison.net/2025/Nov/13/training-for-pelicans-... | | |
| ▲ | travisgriggs 2 hours ago | parent [-] | | So once that's achieved, I wonder how well it deals with unsuspected variations. E.g. "Give me an illustration of a bicycle riding by a pelican" "Give me an illustration of a bicycle riding over a pelican" "Give me an illustration of a bicycle riding under a flying pelican" So on and so forth. Or will it start to look like the Studio C sketch about Lobster Bisque: https://youtu.be/A2KCGQhVRTE |
| |
| ▲ | embedding-shape 4 hours ago | parent | prev | next [-] | | Soon? I'd be willing to bet it's been included in the training set for at least 6 months by now. Not so obviously that it always generates perfect pelicans on bikes, but enough for the "minibench" to be less useful today than in the past. | |
| ▲ | Rudybega 6 minutes ago | parent | next [-] | | If only there were some way to test it, like swapping the two nouns in the sentence. Alas. | |
| ▲ | 4 hours ago | parent | prev [-] | | [deleted] |
| |
| ▲ | jsheard 4 hours ago | parent | prev | next [-] | | Simon's been doing this exact test for nearly 18 months now; if vendors want to benchmaxx it, they've had more than enough time to do so already. | |
| ▲ | stri8ted 4 hours ago | parent [-] | | Exactly. As far as I'm concerned, the benchmark is useless. It's way too easy and rewarding to train on it. | | |
| ▲ | bonoboTP an hour ago | parent | next [-] | | It's just an in-joke, he doesn't intend it as a serious benchmark anymore. I think it's funny. | |
| ▲ | Legend2440 3 hours ago | parent | prev | next [-] | | Y'all are way too skeptical, no matter what cool thing AI does you'll make up an excuse for how they must somehow be cheating. | | |
| ▲ | arcatech 2 hours ago | parent | next [-] | | Or maybe you’re too trusting of companies who have already proven to not be trustworthy? | |
| ▲ | toraway 2 hours ago | parent | prev [-] | | Jeff Dean literally featured it in a tweet announcing the model. Personally it feels absurd to believe they've put absolutely no thought into optimizing this type of SVG output given the disproportionate amount of attention devoted to a specific test for 1 yr+. I wouldn't really even call it "cheating" since it has improved models' ability to generate artistic SVG imagery more broadly but the days of this being an effective way to evaluate a model's "interdisciplinary" visual reasoning abilities have long since passed, IMO. It's become yet another example in the ever growing list of benchmaxxed targets whose original purpose was defeated by teaching to the test. https://x.com/jeffdean/status/2024525132266688757?s=46&t=ZjF... |
| |
| ▲ | pixl97 3 hours ago | parent | prev [-] | | I mean if you want to make your own benchmark, simply don't make it public and don't do it often. If your salamander on skis or whatever gets better with time it likely has nothing to do with being benchmaxxed. |
|
| |
| ▲ | ks2048 2 hours ago | parent | prev [-] | | Forget the paperclip maximizer - AGI will turn the whole world into pelicans on bikes. |
|
|
| ▲ | SoKamil 3 hours ago | parent | prev | next [-] |
It seems they trained the model to output good SVGs. In their blog post[1], the first use case they mention is SVG generation. Thus, it might not be any indicator at all anymore. [1] https://blog.google/innovation-and-ai/models-and-research/ge... |
|
| ▲ | Arcuru 4 hours ago | parent | prev | next [-] |
| Did you stop using the more detailed prompt? I think you described it here: https://simonwillison.net/2025/Nov/18/gemini-3/ |
| |
| ▲ | simonw 3 hours ago | parent [-] | | It seems to be having capacity problems right now but I'll run that as soon as I can get it to work. | | |
|
|
| ▲ | AmazingTurtle 3 hours ago | parent | prev | next [-] |
At this point, the pelican benchmark has become so widely used that there must be high-quality pelicans in the dataset, I presume. What about generating an okapi on a bicycle instead? |
| |
|
| ▲ | WarmWash 4 hours ago | parent | prev | next [-] |
| Less pretty and more practical, it's really good at outputting circuit designs as SVG schematics. https://www.svgviewer.dev/s/dEdbH8Sw |
| |
| ▲ | InitialLastName 3 hours ago | parent | next [-] | | I don't know how much of this is the prompt and how much was the output, but that's a pretty bad schematic (for both aesthetic and circuit-design reasons). | |
| ▲ | WarmWash 3 hours ago | parent | next [-] | | The prompts were doing the design (reference voltage, hysteresis, output stage, all the maths), and then the SVG came from asking the model to take all that plus the current BOM and make an SVG schematic of it. In the past, models would just output totally incoherent messes of lines and shapes. I did a larger circuit too, which this is part of, but it's not really for sharing online. |
| ▲ | svnt 3 hours ago | parent | prev [-] | | Yes but you concede it is a schematic. | | |
| |
| ▲ | 0_____0 4 hours ago | parent | prev [-] | | that's pretty amazing for an LLM, but as an EE, if my intern did this I would sigh inwardly and pull up some existing schematics for some brief guidance on symbol layout. |
|
|
| ▲ | steve_adams_86 4 hours ago | parent | prev | next [-] |
Ugh, the gears and chain don't mesh and there's no sprocket on the rear hub. But seriously, I can't believe LLMs are able to one-shot a pelican on a bicycle this well. I wouldn't have guessed this was going to emerge as a capability from LLMs 6 years ago. I see why it does now, but... It still amazes me that they're so good at some things. |
| |
| ▲ | emp17344 4 hours ago | parent | next [-] | | Is this capability “emergent”, or do AI firms specifically target SVG generation in order to improve it? How would we be able to tell? | | |
| ▲ | steve_adams_86 2 hours ago | parent | next [-] | | I asked myself the same thing as I typed that comment, and I'm not sure what the answer is. I don't think models are specifically trained on this (though of course they're trained on how to generate SVGs in general), but I'm prepared to be wrong. I have a feeling the most 'emergent' aspect was that LLMs have generally been able to produce coherent SVG for quite a while, likely without specific training at first. Since then I suspect there has been more tailored training because improvements have been so dramatic. Of course it makes sense that text-based images using very distinct structure and properties could be manipulated reasonably well by a text-based language model, but it's still fascinating to me just how well it can work. Perhaps what's most incredible about it is how versatile human language is, even when it lacks so many dimensions as bits on a machine. Yet it's still cool that we can resurrect those bits at rest and transmogrify them back into coherent projections of photons from a screen. I don't think LLMs are AGI or about to completely flip the world upside down or whatever, but it seems undeniably magical when you break it down. | |
| ▲ | simonw 3 hours ago | parent | prev [-] | | Google specifically boast about their SVG performance in the announcement post: https://blog.google/innovation-and-ai/models-and-research/ge... You can try any combination of animal on vehicle to confirm that they likely didn't target pelicans directly though. |
| |
| ▲ | 0_____0 3 hours ago | parent | prev | next [-] | | next time you host a party, have people try to draw a bicycle on your whiteboard (you have a whiteboard in your house right? you should, anyway...) human adults are generally quite bad at drawing them, unless they spend a lot of time actually thinking about bicycles as objects | | | |
| ▲ | HPsquared 4 hours ago | parent | prev [-] | | And the left leg is straight while the right leg is bent. EDIT: And the chain should pass behind the seat stay. |
|
|
| ▲ | bredren 4 hours ago | parent | prev | next [-] |
| What is that, a snack in the basket? |
| |
| ▲ | sigmar 4 hours ago | parent | next [-] | | "integrating a bicycle basket, complete with a fish for the pelican... also ensuring the basket is on top of the bike, and that the fish is correctly positioned with its head up... basket is orange, with a fish inside for fun." how thoughtful of the ai to include a snack. truly a "thanks for all the fish" | | |
| ▲ | defen 3 hours ago | parent [-] | | A pelican already has an integrated snack-holder, though. It wouldn't need to put it in the basket. | | |
| |
| ▲ | WarmWash 4 hours ago | parent | prev [-] | | A fish for the road |
|
|
| ▲ | tarr11 3 hours ago | parent | prev | next [-] |
| What do you think this particular prompt is evaluating for? The more popular these particular evals are, the more likely the model will be trained for them. |
| |
|
| ▲ | TZubiri an hour ago | parent | prev | next [-] |
| You think they are able to see their output and iterate on it? Or is it pure token generation? |
|
| ▲ | infthi 4 hours ago | parent | prev | next [-] |
Wonder when we'll get something other than a side view |
| |
| ▲ | mikepurvis 4 hours ago | parent [-] | | That would be especially challenging for vector output. I tried just now on ChatGPT 5.2 to jump straight to an image, with this prompt: "make me a cartoon image of a pelican riding a bicycle, but make it from a front 3/4 view, that is riding toward the viewer." The result was basically a head-on view, but I expect if you then put that back in and said, "take this image and vectorize it as an SVG" you'd have a much better time than trying to one-shot the SVG directly from a description. ... but of course, if that's so, then what's preventing the model from being smart enough to identify this workflow and follow it on its own to get the task completed? |
|
|
| ▲ | calny 4 hours ago | parent | prev | next [-] |
| Great pelican but what’s up with that fish in the basket? |
| |
| ▲ | coldtea 4 hours ago | parent | next [-] | | It's a pelican. What do you expect a pelican to have in his bike's basket? It's a pretty funny and coherent touch! | | |
| ▲ | embedding-shape 3 hours ago | parent [-] | | > What do you expect a pelican to have in his bike's basket? Probably stuff it cannot fit in the gullet, or doesn't want there (think trash). I wouldn't expect a pelican to stash fish there, that's for sure. | |
| ▲ | kridsdale3 2 hours ago | parent [-] | | You never travel with a snack fish for later on? He's going to be burning calories. |
|
| |
| ▲ | nicr_22 2 hours ago | parent | prev | next [-] | | Yeah, why only _one_ fish? It's obvious that pelican is riding long distance, no way a single fish is sufficiently energy dense for more than a few miles. Can't the model do basic math??? | |
| ▲ | gavinray 4 hours ago | parent | prev [-] | | Where else are cycling pelicans meant to keep their fish? | |
| ▲ | calny 3 hours ago | parent [-] | | I get it, I just meant the fish is poorly done, when I'd have guessed it would be a relatively simple part. Maybe the black dot eye is misplaced, idk. |
|
|
|
| ▲ | mohsen1 4 hours ago | parent | prev | next [-] |
is there something in your prompt about hats? why is the pelican always wearing a hat recently?! |
| |
| ▲ | bigfishrunning 4 hours ago | parent [-] | | At this point, i think maybe they're training on all of the previous pelicans, and one of them decided to put a hat on it? Disclaimer: This is an unsubstantiated claim that i made up |
|
|
| ▲ | xnx 4 hours ago | parent | prev | next [-] |
| Not even animated? This is 2026. |
| |
| ▲ | readitalready 4 hours ago | parent [-] | | Jeff Dean just posted an animated version: https://x.com/JeffDean/status/2024525132266688757 | | |
| ▲ | benbreen 3 hours ago | parent | next [-] | | One underrated thing about the recent frontier models, IMO, is that they are obviating the need for image gen as a standalone thing. Opus 4.6 (and apparently 3.1 Pro as well) doesn't have the ability to generate images but it is so good at making SVG that it basically doesn't matter at this point. And the benefit of SVG is that it can be animated and interactive. I find this fascinating because it literally just happened in the past few months. Up until ~summer of 2025, the SVG these models made was consistently buggy and crude. By December of 2026, I was able to get results like this from Opus 4.5 (Henry James: the RPG, made almost entirely with SVG): https://the-ambassadors.vercel.app And now it looks like Gemini 3.1 Pro has vaulted past it. | | |
| ▲ | embedding-shape 3 hours ago | parent | next [-] | | > doesn't have the ability to generate images but it is so good at making SVG that it basically doesn't matter at this point Yeah, since the invention of vector images, suddenly no one cares about raster images anymore. Obviously not true, but that's how your comment reads right now. A raster "image" is very different from a vector "image", and one doesn't automagically replace the other. | |
| ▲ | buu700 3 hours ago | parent | next [-] | | This reminds me of the time I printed a poster with a blown up version of some image for a high school history project. A classmate asked how I did it, so I started going on about how I used software to vectorize the image. Turned out he didn't care about any of that and just wanted the name of the print shop. | |
| ▲ | Der_Einzige 2 hours ago | parent | prev [-] | | You have no idea how badly I want to be teleported to the alternative world where VECTOR COMPUTING was the dominant form of computing. We had high-framerate (yes, it was variable), bright, beautiful displays in the 1980s with the Vectrex. |
| |
| ▲ | cachius 3 hours ago | parent | prev [-] | | 2025 that is |
| |
| ▲ | bigfishrunning 4 hours ago | parent | prev [-] | | That Ostrich Tho | | |
|
|
|
| ▲ | DonHopkins 2 hours ago | parent | prev | next [-] |
| How about STL files for 3d printing pelicans! |
| |
| ▲ | baq 2 hours ago | parent [-] | | Harder: the bike must work Hardest: the pelican must work |
|
|
| ▲ | benatkin 3 hours ago | parent | prev | next [-] |
| I used the AI studio link and tried running it with the temperature set to 1.75: https://jsbin.com/locodaqovu/edit?html,output |
|
| ▲ | saberience 4 hours ago | parent | prev [-] |
| I hope we keep beating this dead horse some more, I'm still not tired of it. |