Interesting, but frustratingly vague on details. How exactly are the models playing? Is it using some kind of PGN equivalent for Tetris that represents an ongoing game, passing an ASCII representation, encoding the state as a JSON structure, or just sending screenshots of the game directly to the various LLMs?

storystarling 14 hours ago | parent | next [-]

It has to be turn-based. Even with Flash's speed, the inference latency would kill you in a real-time loop. They're likely pausing the game state after every tick to wait for the API response before resuming.

ykhli 13 hours ago | parent | prev [-]

Answered this in a comment above! It's not turn-based or visual-layout-based, since LLMs are not trained that way. The representation is a JSON structure, but the LLMs plug in algorithms and keep optimizing them as the game state evolves.
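
To make that concrete, here is a rough sketch of what "JSON state plus a pluggable algorithm" could look like. This is not the project's actual code: the JSON field names, the heuristic (column heights, holes, bumpiness), and the starting weights are all assumptions for illustration. The idea is that the scoring function runs locally on every move, and the model only steps in now and then to rewrite or retune it.

```python
import json

# Hypothetical JSON game state; the field names are assumptions for illustration.
STATE_JSON = '{"board": ["....", "....", "X..X", "XX.X"], "piece": "I"}'


def column_heights(board):
    """Height of each column, measured from the floor to its topmost block."""
    rows = len(board)
    heights = []
    for col in range(len(board[0])):
        height = 0
        for row in range(rows):
            if board[row][col] != ".":
                height = rows - row
                break
        heights.append(height)
    return heights


def count_holes(board):
    """Empty cells that have at least one filled cell above them."""
    holes = 0
    for col in range(len(board[0])):
        seen_block = False
        for row in range(len(board)):
            if board[row][col] != ".":
                seen_block = True
            elif seen_block:
                holes += 1
    return holes


def score(board, weights):
    """Higher is better. The weights are what the model would keep retuning."""
    heights = column_heights(board)
    bumpiness = sum(abs(a - b) for a, b in zip(heights, heights[1:]))
    return -(
        weights["height"] * sum(heights)
        + weights["holes"] * count_holes(board)
        + weights["bumpiness"] * bumpiness
    )


state = json.loads(STATE_JSON)
weights = {"height": 0.5, "holes": 4.0, "bumpiness": 0.2}  # made-up starting values
print(score(state["board"], weights))
```

With a setup like this, no API call sits in the per-move path; the model would only be asked for new weights or a new `score` function when the current one starts performing badly.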

storystarling 2 hours ago | parent | next [-]

Curious how the token economics compare here to a standard agent loop. It seems like if you're using the LLM as a JIT to optimize the algorithm as the game evolves, the context accumulation would get expensive fast even with Flash pricing.
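
A back-of-the-envelope sketch of why I'd expect that, with made-up sizes (the token counts below are assumptions, not measurements, and I'm leaving prices out entirely): if every call resends the whole history, total input tokens grow roughly quadratically with the number of calls, while a stateless call that only sends the current state grows linearly.

```python
def total_input_tokens(turns, base_tokens, tokens_per_turn, keep_history):
    """Sum of input tokens sent across all calls in one game."""
    total = 0
    for turn in range(1, turns + 1):
        if keep_history:
            # Every earlier turn is resent along with the current one.
            total += base_tokens + tokens_per_turn * turn
        else:
            # Only the fixed prompt and the current state are sent.
            total += base_tokens + tokens_per_turn
    return total


# Hypothetical sizes: 100-token fixed prompt, 50 tokens of state per call.
print(total_input_tokens(100, 100, 50, keep_history=True))
print(total_input_tokens(100, 100, 50, keep_history=False))
```

Whether the real setup accumulates context like this, or resets it between optimization rounds, is exactly what I'm asking.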

vunderba 12 hours ago | parent | prev | next [-]

Thanks for the clarification! Kind of reminds me of Brian Moore's AI clocks, which use several LLMs to continuously generate HTML/CSS for an analog clock so the results can be compared.

https://clocks.brianmoore.com

ykhli 12 hours ago | parent [-]

Wow this is incredible!!

mhh__ 12 hours ago | parent | prev [-]

I suppose you could argue about whether it's an LLM at that point, but vision is a huge part of frontier models now, no?