zahlman 4 days ago

> The interesting question to me is: what happens when AI can not only implement but also playtest -- running thousands of iterations of your loop, surfacing which mechanics keep simulated players engaged?

How is AI supposed to simulate a player, and why should it be able to determine what real people would find engaging?

yonatan8070 4 days ago | parent | next [-]

Game companies already collect heaps of data about players, which mechanics they interact with, which mechanics they don't, retention, play time, etc.

I don't think it's much of a stretch to take this data over multiple games, versions, and genres, and train a model to take in a set of mechanics, stats, or even video and audio to rate the different aspects of a game prototype.
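Very roughly, such a model could be as simple as a regressor over per-game engagement features. The sketch below is purely illustrative; the feature columns, the retention target, and the choice of a gradient-boosted regressor are invented assumptions, not anything a studio has confirmed:

    # Hypothetical sketch: predict a prototype's retention from aggregated
    # mechanic/telemetry features gathered across past titles and versions.
    # Feature columns and the target are illustrative assumptions.
    import numpy as np
    from sklearn.ensemble import GradientBoostingRegressor
    from sklearn.model_selection import cross_val_score

    rng = np.random.default_rng(0)

    # Each row = one past game/version; columns = engagement features
    # (e.g. tutorial completion rate, avg session length, % using mechanic X).
    X = rng.random((500, 6))
    y = rng.random(500)                      # observed day-7 retention

    model = GradientBoostingRegressor()
    print("CV R^2:", cross_val_score(model, X, y, cv=5).mean())

    model.fit(X, y)
    prototype = rng.random((1, 6))           # same features for the new prototype
    print("predicted retention:", model.predict(prototype)[0])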

I wouldn't even be surprised if I heard this is already being done somewhere.

uncircle 3 days ago | parent | next [-]

> Game companies already collect heaps of data about players, which mechanics they interact with, which mechanics they don't, retention, play time, etc.

Yes, that's how games like Concord get made. Very successful approach to create art based on data about what's popular and focus groups.

georgeecollins 3 days ago | parent | next [-]

I think you are saying data is no substitute for vision in design. Completely agree! At Playdom (Disney) they tried to build a game once from the ground up based on A/B testing. Do you know what that game was? No, you don't, because it was never released, and it was terrible.

I think what the previous comment meant was that there is data on how players play, and that tends to be varied but more predictable.

mlyle 3 days ago | parent [-]

Yah. I think an AI playtester could go "hey... this itch that lots of players seem to have doesn't get scratched often in your main gameplay loop", or "there's a valley 1/3rd of the way into the game where progression slows way down", or "that third boss is way too hard".
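Even a crude version of the "valley" check could just run over progression telemetry. A toy sketch, with an invented data shape and an arbitrary threshold:

    # Sketch: flag stretches where progression slows way down, given
    # per-level median completion times from (real or simulated) playtests.
    # The 2x-median threshold is an arbitrary illustrative choice.
    median_minutes_per_level = [4, 5, 5, 6, 14, 15, 6, 5, 7, 6, 5, 20]

    baseline = sorted(median_minutes_per_level)[len(median_minutes_per_level) // 2]
    for level, minutes in enumerate(median_minutes_per_level, start=1):
        if minutes > 2 * baseline:
            print(f"possible progression valley at level {level}: "
                  f"{minutes} min vs ~{baseline} min typical")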

AI/fuzzers can't get far enough in games, yet, without a lot of help. But I think that's because we don't have models really well suited for them.

theshrike79 2 days ago | parent | prev | next [-]

Data is the lifeblood of mobile gaming, everything is data-driven.

Everything is measured and analysed and optimised for engagement and monetisation.

When you have 200 people making a game, "luck" or "art" doesn't factor in at all. You test, get data, and make decisions based on the data, not feelings.

Solo devs can still make artsy games and stumble upon success.

MangoToupe 3 days ago | parent | prev [-]

Isn't Concord massively unpopular? I'd think that's a terrible example

Edit: yup, it shut down nearly a year ago

SpecialistK 3 days ago | parent [-]

I think it was a sarcastic example - in other words, all the data and metrics and trend-chasing in the world is not a replacement for human vision, creativity, and risk-taking.

fluoridation 3 days ago | parent [-]

Was Concord made the way it was because of data? I got the impression that the designers were chasing misguided trends with the art direction, and on top of that the game part was just mediocre.

SpecialistK 3 days ago | parent [-]

I can't say for sure (never played it or followed it much, because it's not my type of game) but the impression I had is that it was a cookie-cutter attempt to be just another live service online shooter in the vein of Valorant, Overwatch, Apex Legends, etc etc. And people saw no need to play this new one when those games already exist.

Compare that to Helldivers 2 (online-only live service game, same platforms and publisher), which had a lot of personality (the heavy Starship Troopers movie vibe) and some unique gameplay elements like the stratagems.

Cthulhu_ 3 days ago | parent [-]

To add, Concord had been in development for eight years at that point, had gone through multiple leadership and direction changes, and then the studio was acquired by Sony because they wanted more big live service games and this game ticked all the boxes and was nearly done. So more money was pumped into it.

And sometimes it works; Apex Legends came out of nowhere and became one of the big live service titles. Fortnite did a battle royale mode out of nowhere and became huge.

sbarre 4 days ago | parent | prev | next [-]

Yeah there's no way Microsoft isn't already using all their aggregate metrics (trillions of data points I'm sure) from their first-party studios and making a "What good looks like" training set...

Whether that set is actually useful is a separate issue but someone is trying this over there for sure.

georgeecollins 3 days ago | parent | prev | next [-]

We did that on a game I worked on over ten years ago. It was a mobile game and we knew that it was very important to player retention (and interest in multiplayer) to have the first multiplayer interaction be "fun". So we would simulate the first person you played against as though they were another human. Based on play data of other humans. Because you only played them once you didn't think you were playing a bot.

Where we used AI (machine learning, not LLM) was in trying to figure out what kind of human you would want to play with. We also used machine learning to try to figure out what cohort of players you were in so we could tweak engagement.
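(The cohort part, at least, can be surprisingly mundane; something like clustering players on a handful of play-style features. A toy sketch, with invented feature columns:)

    # Toy sketch of the cohort idea: cluster players on play-style features
    # so matchmaking / engagement tweaks can target each group.
    # The feature columns are invented for illustration.
    import numpy as np
    from sklearn.cluster import KMeans

    rng = np.random.default_rng(1)
    # columns: sessions per week, avg match length, aggression score, spend
    players = rng.random((1000, 4))

    cohorts = KMeans(n_clusters=5, n_init=10, random_state=1).fit_predict(players)
    print("players per cohort:", np.bincount(cohorts))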

Where LLMs could really shine, in my opinion: Gamers love to play people, not AI (now). People are unpredictable, they communicate, they play well but in ways a human could (like they don't have superhuman reflexes or speed). You can play all kinds of games against AI (StarCraft, Civilization, training of all kinds of FPS) but it isn't fun for long because you see the robotic patterns. However, an LLM might be able to mix it up like humans, talk to you, and you could probably make it have imperfect reaction time, coordination, etc. That would really help a lot of games that have lulls in human player activity, or too much toxicity.

I would be shocked if some games aren't doing this now. It seems like it would still be hard to make a bot seem human, and it probably only works if you sprinkle it in.

ryoshu 3 days ago | parent | next [-]

Humans prefer humans over bots in multiplayer. Even if you dumb down LLM-powered bots, there's no sense of accomplishment in beating a bot that can be dialed up or down. And the social aspect... maybe some amount of gamers want to talk to bots instead of humans in a PvP match. Curious about the numbers there.

Mouvelie 3 days ago | parent | prev [-]

Could never prove it, but would bet money that Marvel Snap for example is doing it right now.

Edit: oh yeah. A quick google search proved it: https://marvelsnapzone.com/bots/

tialaramex 3 days ago | parent | prev [-]

Ah yes, the huge game companies, definitely outfits I would associate with producing fun games I haven't seen before and not churning out Existing Franchise N+1 every year with barely perceptible differences and higher prices each iteration.

yonatan8070 3 days ago | parent [-]

Maybe "fun" isn't the right word, "engaging" or "addicting" is probably what they use internally.

AlienRobot 4 days ago | parent | prev | next [-]

Game developers will try anything before they actually write automated tests for their games.

nine_k 4 days ago | parent | next [-]

When you tweak game mechanics several times every day, keeping the tests useful is a large task. Basics can be tested. Map integrity can be tested. Most "normal UX" is hard to test, and even main functional tests tend to drift. (Source: a short involvement in actual gamedev recently.)
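(The map-integrity case really is the easy one. A minimal sketch, with an invented tile format: every walkable tile should be reachable from spawn.)

    # Sketch: a map-integrity test that checks every walkable tile is
    # reachable from the spawn point. The grid format is an invented example.
    from collections import deque

    def reachable(grid, start):
        seen, queue = {start}, deque([start])
        while queue:
            x, y = queue.popleft()
            for nx, ny in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
                if (0 <= ny < len(grid) and 0 <= nx < len(grid[0])
                        and grid[ny][nx] == "." and (nx, ny) not in seen):
                    seen.add((nx, ny))
                    queue.append((nx, ny))
        return seen

    def test_all_walkable_tiles_reachable():
        grid = ["....",
                ".##.",
                "...."]
        walkable = {(x, y) for y, row in enumerate(grid)
                           for x, c in enumerate(row) if c == "."}
        assert reachable(grid, (0, 0)) == walkable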

greesil 4 days ago | parent | next [-]

One can still write unit tests. I have been told by a couple of different game devs that it's more because of release deadlines, and the cost of a bug is usually pretty small.

pton_xd 3 days ago | parent | next [-]

There are some game systems that lend themselves to unit testing, like say map generation to ensure that the expected landmarks are placed reasonably, or rooms are connected, or whatever. But most game interactions are just not easily "unit testable" since they happen across frames (eg over time). How would you unit test an enemy that spawns, moves towards the player, and attacks?

I'm sure you could conjure up any number of ways to do that, but they won't be trivial, and maintaining those tests while you iterate will only slow you down. And what's the point? Even if the unit-move-and-attack test passes, it's not going to tell you if it looks good, or if it's fun.

Ultimately you just have to play the game, constantly, to make sure the interactions are fun and working as you expect.

coderenegade 3 days ago | parent | next [-]

It would depend on how things are architected, but you could definitely test the components of your example in isolation (e.g. spawn test, get the movement vector in response to an enemy within a certain proximity, test that the state is set to attacking, whatever that looks like). I don't disagree that it's a hard problem. I run into similar issues with systems that use ML as some part of their core, and I've never come up with a satisfying solution. My strategy these days is to test the things that it makes sense to test, and accept that for some things (especially dynamic behavior of the system) you just have to use it and test it that way.
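Something like those isolated checks might look like the sketch below; EnemyBrain and its API are invented stand-ins, not any particular engine's structure:

    # Sketch of testing enemy behaviour as small isolated pieces.
    # EnemyBrain is an invented stand-in for however the real AI is structured.
    from dataclasses import dataclass

    @dataclass
    class EnemyBrain:
        attack_range: float = 1.5
        state: str = "idle"

        def step(self, enemy_pos, player_pos):
            dx, dy = player_pos[0] - enemy_pos[0], player_pos[1] - enemy_pos[1]
            dist = (dx * dx + dy * dy) ** 0.5
            if dist <= self.attack_range:
                self.state = "attacking"
                return (0.0, 0.0)
            self.state = "chasing"
            return (dx / dist, dy / dist)   # unit vector toward the player

    def test_moves_toward_player():
        move = EnemyBrain().step((0, 0), (10, 0))
        assert move == (1.0, 0.0)

    def test_attacks_in_range():
        brain = EnemyBrain()
        brain.step((0, 0), (1, 0))
        assert brain.state == "attacking"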

chaps 3 days ago | parent | prev | next [-]

> How would you unit test an enemy that spawns, moves towards the player, and attacks?

You use a second enemy that spawns, moves towards the "enemy", and attacks.

cherryteastain 3 days ago | parent | prev [-]

> How would you unit test an enemy that spawns, moves towards the player, and attacks?

You can easily write a 'simulation' version of your event loop and dependency inject that. Once time can be simulated, any deterministic interaction can be unit tested.
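A minimal sketch of that approach: drive the update loop from an injected clock (and pin the RNG seed if there's randomness), then step a whole chase-and-attack interaction deterministically. All the names here are invented:

    # Sketch: inject a simulated clock so a chase-and-attack interaction
    # can be stepped deterministically in a test. All names are invented.
    class FakeClock:
        def __init__(self, dt=1 / 60):
            self.dt, self.now = dt, 0.0

        def tick(self):
            self.now += self.dt
            return self.dt

    class Enemy:
        def __init__(self, pos, speed=2.0, attack_range=0.5):
            self.pos, self.speed, self.attack_range = pos, speed, attack_range
            self.attacked = False

        def update(self, dt, player_pos):
            d = player_pos - self.pos          # 1D world for brevity
            if abs(d) <= self.attack_range:
                self.attacked = True
            else:
                self.pos += self.speed * dt * (1 if d > 0 else -1)

    def test_enemy_reaches_and_attacks_player():
        clock, enemy, player_pos = FakeClock(), Enemy(pos=0.0), 5.0
        for _ in range(600):                   # 10 simulated seconds at 60 Hz
            enemy.update(clock.tick(), player_pos)
        assert enemy.attacked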

mac-mc 3 days ago | parent | next [-]

Others would quibble that those are integration tests, "UI" tests, or other higher-level tests, etc.

9rx 3 days ago | parent [-]

Which is all the same as what "unit test" originally meant.

You're right that "unit test" has taken on another, rather bizarre definition in the intervening years, one that doesn't reflect any kind of test anyone actually writes in the real world, except where people write "unit tests" specifically to satisfy that bizarre definition. But anyone concerned enough about definitional purity to quibble about it will use the original definition anyway...

ryoshu 3 days ago | parent | prev [-]

A lot of games aren't deterministic within a scope of reasonable test coverage.

greesil 3 days ago | parent [-]

Set the same seed for the test?

eru 3 days ago | parent | prev [-]

> and the cost of a bug is usually pretty small.

Like letting speed runners skip half your game. :)

snovv_crash 3 days ago | parent | prev [-]

I've heard the same excuses from ML engineers before introducing tests there, embedded engineers, robotics engineers, systems engineers, everyone has a reason.

The real reason? It's because writing tests is a different skill and they don't actually know how to do it.

peterashford 3 days ago | parent [-]

Oh that's crap. I've been a software engineer for over 30 years. I love tests - I preach testing at my current place of work. I've also worked in games for about a decade. Testing in games is... not useless, but very much less useful than it is in general software engineering.

peterashford 3 days ago | parent | prev | next [-]

The problem with tests for games is that a lot of game code is in constant flux. A test suite introduces a not insignificant amount of rigidity to your codebase. Pivot a few concepts and you have dozens of tests to fix - or just invalidate entirely. Very basic stuff that won't ever change can be tested - like whether the renderer is working properly - but that's never where the difficulty in game dev lies, and it's the stuff usually handled by a third party library or engine.

KronisLV 3 days ago | parent [-]

> The problem with tests for games is that a lot of game code is in constant flux. A test suite introduces a not insignificant amount of rigidity to your codebase. Pivot a few concepts and you have dozens of tests to fix - or just invalidate entirely.

Sounds very much like the description of a big ball of mud.

An interesting gamedev video I saw recently basically boiled down to: "Build systems, not games." It was aimed at indie devs to help with the issue of always chasing new projects and making code that's modular enough to be able to reuse it.

But taking a step back, that very much feels like it should apply to entire games, where you should have boundaries between the components so that the scope of any such pivot is managed well enough not to tank your velocity.

Other than that, it'd be just the regular growing pains of TDD or even just needing to manage good test coverage - saying that tests will eventually need changes isn't the best argument against them in webdev, nor should it be anywhere else.

bccdee 3 days ago | parent [-]

> Sounds very much like the description of a big ball of mud.

I mean, yeah, kinda.

For any given object in the game world, it's funnest for that object to be able to interact with as many other objects as possible in as many ways as possible. A game object's handles for interaction need to be globally available and can't impose many invariants—especially if you don't want level designers to have to be constantly re-architecting the engine code to punch new holes for themselves in the API. Thus, a lot of the logic in a given level tends to live inside the callback hooks of level objects, and tends to depend on the state of the rest of the level for correctness.

Modularity is a property of high cohesion and low coupling, which are themselves only possible when you can pin down your design and hide information behind abstraction boundaries. But games are a flexible and dynamic enough field that engines have to basically let designers do whatever they want, whenever they want in order for the engine to be able to build arbitrary games. So game design is naturally a highly-coupled, incohesive problem space that is poorly suited to unit testing.
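An invented illustration of that coupling: one level object's logic lives in a callback that reads and mutates global level state, so exercising it in a test means standing up most of the level around it.

    # Invented illustration of the coupling described above: a level object's
    # callback depends on boss state, a global flag, and its own state, and
    # mutates other parts of the level as a side effect.
    from types import SimpleNamespace as NS

    level = NS(
        flags={"flooded": True},
        boss=NS(is_alive=True),
        doors={"vault": NS(open=False)},
        spawned=[],
    )

    def on_lever_pulled(level, lever):
        if level.boss.is_alive and level.flags["flooded"] and not lever.rusted:
            level.doors["vault"].open = True
            level.spawned.append("mimic")

    on_lever_pulled(level, NS(rusted=False))
    print(level.doors["vault"].open, level.spawned)   # True ['mimic']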

KronisLV 2 days ago | parent [-]

> So game design is naturally a highly-coupled, incohesive problem space that is poorly suited to unit testing.

Poorly suited? Perhaps, but so are certain web system architectures, and neither is impossible to test.

I think Factorio is an example that it can be done if you care about it... it's just that most studios shipping games don't.

https://www.factorio.com/blog/post/fff-438

https://www.factorio.com/blog/post/fff-366

Of course, in their case it can actually be justified, because the game itself is very dependent on the logic working correctly, rather than your typical FPS game slop that just needs to look good.

bccdee 2 days ago | parent [-]

Yeah I suspect Factorio's "complex game logic + simple(ish) 2d engine + minimal team structure" situation meant that the usual tradeoffs didn't apply. It's really cool that they pulled it off, though—I can't imagine it was easy, even then.

somat 3 days ago | parent | prev | next [-]

As a counter-example, I found this video essay about fixing a Factorio bug fascinating. My main takeaway: I need better introspection hooks. I am not really a programmer, and I never really thought automated testing of user-interactive parts was possible.

https://www.youtube.com/watch?v=AmliviVGX8Q (kovarex - Factorio lets fix video #1)

skocznymroczny 3 days ago | parent | prev [-]

League of Legends does a lot of automated testing for their gameplay logic https://technology.riotgames.com/news/automated-testing-leag...

eru 3 days ago | parent | prev | next [-]

> How is AI supposed to simulate a player, and why should it be able to determine what real people would find engaging?

Games have goals, and players are prone to 'optimising the fun out of games' by doing some safe strategy over and over again to reach that goal, even if it's not fun. Think e.g. grinding in an RPG, instead of facing tough battles with strategy and wits and the risk of failure.

Even if AIs are terrible at determining what's engaging, you can probably at least use them to relatively quickly find the loopholes you accidentally opened that let players get in the way of their own fun.

mzl 3 days ago | parent | prev | next [-]

I've heard several talks about how some companies build AI systems that are designed to play as similarly to human players as possible. This has been crucial for letting them play-test levels and balance the game.

And note, this is not AI as in asking an LLM what to do, this is more classical machine learning and deep learning.
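One common shape for that kind of system is behavioural cloning: fit a policy that imitates recorded human actions, then let it drive the playtest bot. A toy sketch; the state features, action set, and model choice are all invented:

    # Toy behavioural-cloning sketch: fit a classifier that maps game-state
    # features to the action a human took in that state, then let it drive a
    # playtest bot. Features, actions and model choice are all illustrative.
    import numpy as np
    from sklearn.ensemble import RandomForestClassifier

    rng = np.random.default_rng(2)

    # Recorded human play: each row is a state snapshot
    # (e.g. health, distance to goal, enemies nearby); label is the action taken.
    states = rng.random((5000, 3))
    actions = rng.integers(0, 4, size=5000)   # 0=move, 1=attack, 2=heal, 3=flee

    policy = RandomForestClassifier(n_estimators=50, random_state=0).fit(states, actions)

    # The bot queries the policy each tick during an automated playtest.
    current_state = rng.random((1, 3))
    print("bot action:", policy.predict(current_state)[0])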

gmadsen 4 days ago | parent | prev | next [-]

because it has millions of examples of that in its training data?

bozhark 3 days ago | parent | prev [-]

Make the same engagement metric as people do; try to break it