astlouis44 8 hours ago

A playable 3D dungeon arena prototype built with Codex and GPT models. Codex handled the game architecture, TypeScript/Three.js implementation, combat systems, enemy encounters, HUD feedback, and GPT‑generated environment textures. Character models, character textures, and animations were created with third-party asset-generation tools.

The game that this prompt generated looks pretty decent visually. A big part of this is likely due to the fact that the meshes were created using a separate tool (probably Meshy, Tripo.ai, or similar) and not generated by 5.5 itself.

It really seems like we could be at the dawn of a new era similar to Flash, where any gamer or hobbyist can generate game concepts quickly and instantly publish them to the web. Three.js in particular is really picking up as the primary way to design games with AI, despite the fact that it's not even a game engine, just a web rendering library.

0x62 7 hours ago | parent | next [-]

FWIW I've been experimenting with Three.js and AI for the last ~3 years, and noticed a significant improvement in 5.4 - the biggest single generation leap for Three.js specifically. It was most evident in shaders (GLSL), but also apparent in structuring of Three.js scenes across multiple pages/components.

It still struggles to create shaders from scratch, but is now pretty adequate at editing existing shaders.

In 5.2 and below, GPT really struggled with "one canvas, multiple pages" experiences, where a single background canvas is kept rendered across routes. In 5.4 it still takes a bit of hand-holding and frequent refactor/optimisation prompts, but it is a lot more capable.

Excited to test 5.5 and see how it is in practice.

CSMastermind 7 hours ago | parent | next [-]

> It still struggles to create shaders from scratch

Oh just like a real developer

accrual 6 hours ago | parent [-]

Much respect for shader developers, it's a different way of thinking/programming

Pym 4 hours ago | parent | prev | next [-]

One struggle I'm having (with Claude) is that most of what it knows about Three.js is outdated. I haven't used GPT in a while, is the grass greener?

Have you tried any skills like cloudai-x/threejs-skills that help with that? Or built your own?

import 4 hours ago | parent | prev [-]

Using Claude in the same context and it's been doing really well with GLSL since around last September.

vunderba 7 hours ago | parent | prev | next [-]

I’ve had a lot of success using LLMs to help with my Three.js based games and projects. Many of my weird clock visualizations relied heavily on it.

It might not be a game engine, but it’s the de facto standard for doing WebGL 3D. And since it’s been around forever, there’s a massive amount of training data available for it.

Before LLMs were a thing, I relied more on Babylon.js, since it’s a bit higher level and gives you more batteries included for game development.

mindhunter 4 hours ago | parent | prev | next [-]

A friend is building Jamboree[1] (previously named "Spielwerk") for iOS, an app to build and share games. The games are all web-based, so they're easy to share.

[1] https://apps.apple.com/uz/app/jamboree-game-maker/id67473110...

dataviz1000 6 hours ago | parent | prev | next [-]

LLMs cannot do spatial reasoning. I haven't tried with GPT; however, Claude cannot solve a Rubik's Cube no matter how much I try with prompt engineering. I got Opus 4.6 to solve ~70% of the puzzle, but it got stuck. At $20 a run, it's prohibitively expensive.

The point is that if we can prompt an LLM to reason about three dimensions, we can likely apply that to math problems it currently isn't able to solve.

I should release my Rubik's Cube MCP server with the challenge to see if someone can write a prompt to solve a Rubik's Cube.
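For anyone tempted by the challenge before the server is released, here is a minimal sketch (in Python; all names hypothetical, not the actual MCP server) of the kind of cube state and move interface such a server might expose to a model:

```python
# Minimal face-turn model of a Rubik's Cube, as a sketch of the state an
# MCP server might hand to an LLM. Faces: U(p), D(own), F(ront), B(ack),
# L(eft), R(ight); each face is a list of 9 stickers in row-major order.

def solved_cube():
    """Return a solved cube: every sticker matches its face letter."""
    return {face: [face] * 9 for face in "UDFBLR"}

def turn_u(cube):
    """Clockwise quarter-turn of the U face (viewed from above)."""
    new = {face: stickers[:] for face, stickers in cube.items()}
    # Rotate the U face's own 3x3 stickers 90 degrees clockwise.
    new["U"] = [cube["U"][i] for i in (6, 3, 0, 7, 4, 1, 8, 5, 2)]
    # Cycle the top rows of the side faces: F <- R <- B <- L <- F.
    for dst, src in (("F", "R"), ("R", "B"), ("B", "L"), ("L", "F")):
        new[dst][0:3] = cube[src][0:3]
    return new
```

Even one move like this is enough to probe the spatial-reasoning question: hand the model the sticker dict, ask it to predict the state after a move sequence, and compare against the simulator.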

embedding-shape 4 hours ago | parent | next [-]

> I should release my Rubik's Cube MCP server with the challenge to see if someone can write a prompt to solve a Rubik's Cube.

Do it, I'm game! You nerdsniped me immediately and my brain went "That sounds easy, I'm sure I could do that in a night" so I'm surely not alone in being almost triggered by what you wrote. I bet I could even do it with a local model!

Melatonic 3 hours ago | parent | prev | next [-]

What about a model designed for robotics and vision? It seems like an LLM trained on text would inherently not be great at this.

DeepMind's other models, however, might do better?

snet0 5 hours ago | parent | prev | next [-]

How are you handing the cube state to the model?

dataviz1000 4 hours ago | parent | next [-]

Does this answer the question?

Opus 4.6 got the cross and started to get several pieces on the correct faces. It couldn't reason past this. You can see the prompts and all the turn messages.

https://gist.github.com/adam-s/b343a6077dd2f647020ccacea4140...

edit: I can't reply to the message below. The point isn't whether we can solve a Rubik's Cube with a Python script and tool calls; it's whether we can get an LLM to reason about moving things in three dimensions. The prompt is a puzzle in the way that a Rubik's Cube is a puzzle. A 7-year-old child can learn 6 moves and figure out how to solve a Rubik's Cube in a weekend; the LLM can't solve it. But given the correct prompt, can an LLM solve it? The prompt is the puzzle. That is why it is fun and interesting. Plus, it is a spatial problem, so if we solve that, we solve a massive class of problems, including huge swathes of mathematics the LLMs can't touch yet.

osti 4 hours ago | parent [-]

Can't they write a script to solve Rubik's Cubes?

4 hours ago | parent | prev [-]
[deleted]
Torkel 4 hours ago | parent | prev [-]

*yet

kingstnap 7 hours ago | parent | prev | next [-]

The meshes look interesting, but the gameplay is very basic. The tank one seems more sophisticated with the flying ships and whatnot.

What's strange is that this Pietro Schirano dude seems to write incredibly cargo-cult prompts.

  Game created by Pietro Schirano, CEO of MagicPath

  Prompt: Create a 3D game using three.js. It should be a UFO shooter where I control a tank and shoot down UFOs flying overhead.
  - Think step by step, take a deep breath. Repeat the question back before answering.
  - Imagine you're writing an instruction message for a junior developer who's going to go build this. Can you write something extremely clear and specific for them, including which files they should look at for the change and which ones need to be fixed?
  - Then write all the code. Make the game low-poly but beautiful.
  - Remember, you are an agent: please keep going until the user's query is completely resolved before ending your turn and yielding back to the user. Decompose the user's query into all required sub-requests and confirm that each one is completed. Do not stop after completing only part of the request. Only terminate your turn when you are sure the problem is solved. You must be prepared to answer multiple queries and only finish the call once the user has confirmed they're done.
  - You must plan extensively in accordance with the workflow steps before making subsequent function calls, and reflect extensively on the outcomes of each function call, ensuring the user's query and related sub-requests are completely resolved.

torginus 6 hours ago | parent | next [-]

It's weird how people pep-talk the AI - if my Jira tickets looked like this, I would throw a fit.

I guess these people think they have special prompt-engineering skills, and that doing it like this is better than giving the AI a dry list of requirements (fwiw, they might even be right).

mattgreenrocks 6 hours ago | parent | next [-]

It’s not surprising to me that the same crowd that cheers for the demise of software engineering skills invented its own notion of AI prompting skills.

Too bad they can veer sharply into cringe territory pretty fast: “as an accomplished Senior Principal Engineer at a FAANG with 22 years of experience, create a todo list app.” It’s like interactive fanfiction.

dr_kiszonka 9 minutes ago | parent | next [-]

That's quite similar to AI Studio's prompt: "You are a world-class frontend engineer..."

eiksjs 5 hours ago | parent | prev [-]

Indeed it is so utterly cringe.

eloisant 4 hours ago | parent | prev [-]

Yes, this is cargo cult.

This reminds me of so-called "optimization" hacks that people keep applying years after their languages were improved enough to make them unnecessary or even harmful.

Maybe at one point it helped to write prompts in this weird way, but with all the progress going on in both the models and the harnesses, if it's not obsolete yet it soon will be. Just cruft that consumes tokens and fills the context window for nothing.

skirano 6 hours ago | parent | prev | next [-]

Pietro here, I just published a video of it: https://x.com/skirano/status/2047403025094905964?s=20

irthomasthomas 7 hours ago | parent | prev | next [-]

> Think Step By Step

What is this, 2023?

I feel like this was generated by a model tapping into 2023 notions of prompt engineering.

retr0rocket 7 hours ago | parent [-]

[dead]

tantalor 7 hours ago | parent | prev | next [-]

It comes across as an elaborate, sparkly motivational cat poster.

*BELIEVE!* https://www.youtube.com/watch?v=D2CRtES2K3E

skolskoly 3 hours ago | parent [-]

https://m.media-amazon.com/images/I/71MTbRmLY8L._AC_UF894,10...

bredren 7 hours ago | parent | prev | next [-]

The prompt did not specify advanced gameplay.

I do not see instructions that help with task decomposition, and with agent ~"motivation" to stay aligned over long runs, as cargo-culting.

See the anecdotes upthread [1].

> Decompose the user's query into all required sub-requests and confirm that each one is completed. Do not stop after completing only part of the request. Only terminate your turn when you are sure the problem is solved.

I see this as showcasing a strength of 5.5, since it suggests the model can take on this clearly important role and ~one-shot requests like this.

I've been using a CLI-first AI task tool I wrote to process complex "parent" or "umbrella" tasks into decomposed subtasks and then execute on them.

This has allowed my workflows to float above the ups and downs of model performance.

That said, having the AI do the planning for a big request like this internally is not good outside a demo, because you want the AI's planning to be part of the historical context, available for forensics when there are stalls, unwound details, or other unexpected issues along the way.

[1] https://news.ycombinator.com/item?id=47879819

ahoka 6 hours ago | parent | prev [-]

"take a deep breath"

OMFG

peder 2 hours ago | parent | prev | next [-]

> It really seems like we could be at the dawn of a new era similar to Flash

We've been there for a while... creativity has been the primary bottleneck.

7 hours ago | parent | prev | next [-]
[deleted]
7 hours ago | parent | prev | next [-]
[deleted]
nemo44x 4 hours ago | parent | prev | next [-]

It’s like all these things, though: it’s not a real production-worthy product. It’s a super-demo. It looks amazing until you realize there are many months of work needed to make it something of quality and value.

I think people are starting to catch on to where we really are right now. Future models will be better, but we are entering a trough of disillusionment, and this attitude will be widespread in a few months.

ZeWaka 8 hours ago | parent | prev | next [-]

I personally don't think the gameplay itself is that impressive.

gregpred 8 hours ago | parent | prev [-]

[flagged]