Optimizing our way through Metroid (antithesis.com)
146 points by eatonphil 6 days ago | 33 comments
Dwedit 4 days ago | parent | next [-]

Did anyone else get reminded of the lexicographic ordering solver that played NES games? It was featured at SIGBOVIK 2013.

http://tom7.org/mario/

Video: https://www.youtube.com/watch?v=xOCurBYI_gY

IAmLiterallyAB 4 days ago | parent [-]

It did remind me of that. Tom7 is a treasure

o11c 5 days ago | parent | prev | next [-]

Hmm, scrolling lags even without JavaScript (Firefox ESR, Linux). Last time I saw this, I think the fix was something about gradients/blur?

There's also some kind of weird input-capture stopping keyboard scrolling at first, and the video player is some weird thing I can't see how to make work.

o11c 4 days ago | parent [-]

Adding 1 CSS rule gets rid of the slowness:

  background-color: black !important;

I'm not sure which specific one is to blame, but there is a lot of transparency in various colors, both foreground and background.

terpimost 3 days ago | parent [-]

Thank you very much for a solution. We will investigate the issue.

bumbledraven 5 days ago | parent | prev | next [-]

It would be neat if a fuzzer could help set a new tool-assisted speedrun (TAS) record.

wwilson 5 days ago | parent | next [-]

Yes, this is a really fun idea and something that we want to do. Though these days we’re setting our sights higher than Nintendo…

A funny story though: a regular conference gimmick we have is “Man vs. Machine” where we have attendees race our fuzzer to the end of Mario level 1-1. We did this at the final year of Strange Loop, and the fuzzer was winning handily until not one, not two, but three different professional speedrunners walked by and destroyed us.

NobodyNada 4 days ago | parent | prev [-]

There have definitely been some applications of this sort of thing to speedrunning -- though far less sophisticated than the approach here, and usually only testing against a very small subset of the game. I've heard of some of this kind of work being done before on e.g. SM64.

I've also done something along these lines myself in Super Metroid. Mother Brain's neck moves in a conceptually simple but very chaotic pattern influenced by Samus's vertical movement, and there's a cutscene during the fight where the positioning of her neck can make a difference of about 7 seconds. The TAS fight used complicated movement to manipulate her neck position developed through much trial-and-error, while the best known human-viable manips were several seconds slower.

I wrote a program to search the state space for optimal movement patterns, and working with some speedrunners we were able to come up with a new human-viable manipulation that matched the previous TAS fight, as well as a new TAS manipulation that saved an additional 41 frames.

https://youtu.be/7SHD9L_Jx5Q

https://github.com/NobodyNada/mbsim
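
For anyone curious what "searching the state space" can look like in general, here is a minimal sketch. It is not the mbsim code and not Super Metroid's actual neck logic: `simulate` is a made-up toy rule, and `shortest_manip`, the 0..15 neck range and the 0..7 height range are all assumptions for illustration. It runs a breadth-first search over per-frame vertical inputs and returns the shortest input sequence that puts the toy "neck" at a target value.

```python
from collections import deque

def simulate(state, dy):
    """One frame of a TOY simulation (hypothetical, not the real game logic).

    state = (neck, y): neck position in 0..15, player height in 0..7.
    dy is the vertical input for this frame: -1, 0 or +1.
    """
    neck, y = state
    y = max(0, min(7, y + dy))
    # Made-up rule: the new neck value depends on the old one and on player height.
    neck = (neck * 5 + y * 3 + 1) % 16
    return (neck, y)

def shortest_manip(start, target_neck, max_frames=20):
    """Breadth-first search for the fewest frames of input reaching target_neck."""
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        state, inputs = queue.popleft()
        if state[0] == target_neck:
            return inputs
        if len(inputs) == max_frames:
            continue
        for dy in (-1, 0, 1):
            nxt = simulate(state, dy)
            if nxt not in seen:  # each distinct state is expanded only once
                seen.add(nxt)
                queue.append((nxt, inputs + [dy]))
    return None

if __name__ == "__main__":
    print(shortest_manip((0, 3), target_neck=9))
```

Because the search is breadth-first and the toy simulation is deterministic, the first hit is also a shortest one. A real search would need an exact model of the game's update rule and a much larger state.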

cout 4 days ago | parent [-]

Very impressive! I had wondered where that MB manip came from. No surprise at all that it was you. :)

jboggan 5 days ago | parent | prev | next [-]

Fantastic read and a really interesting company I did not know about until just now.

I would love to see how it handles Castlevania II.

wwilson 5 days ago | parent | next [-]

Haven’t tried Castlevania II, but here’s the first one: https://antithesis.com/blog/castlevania/

AIPedant 5 days ago | parent [-]

This seems like a cool company and I don't want to nitpick too much, but gamers have no respect for history:

  Castlevania... [so] called because it is a Metroidvania game set in a Castle.

Ouch - this is precisely backwards. Metroidvanias are named after Metroid and Castlevania because those series practically defined the genre.

Also a bit frustrating because the first Castlevania itself isn't actually a metroidvania, it's a more conventional action-platformer. Castlevania II has non-linear exploration, lots of items to collect, and puzzle-solving, all like Metroid. So it's not too surprising Antithesis had to do a lot of work to adapt their system to Metroid - but I wonder if this work means it can now handle Castlevania II without much extra development.

wwilson 5 days ago | parent | next [-]

You were successfully trolled. :-)

houky 3 days ago | parent | prev [-]

This is correct. Also, Metroid is called Metroid because it is a Metroidvania set not in Romania, but on an alien world.

tyleo 5 days ago | parent | prev | next [-]

Yeah the company sounds interesting. I wish the main page had clearer info about what it does. There’s a lot of text but I want the simple, “here’s the little bit of example code to get going.”

tyleo 5 days ago | parent | next [-]

After a little more digging I found some very cool answers in the docs: https://antithesis.com/docs/

Aerbil313 5 days ago | parent | prev [-]

I assume they are intentionally not very vocal, probably still maturing/scaling their platform. Until recently they were a stealth startup. The stuff they are doing is truly revolutionary.

gblargg 5 days ago | parent | prev [-]

> I would love to see how it handles Castlevania II.

I assume you're thinking specifically of using the red crystal to spawn a tornado: https://youtu.be/Mx9PwRIK9Io

Taikonerd 4 days ago | parent | prev | next [-]

They've done a series of these NES-themed demos of their fuzzer.

What's neat is that they're not just mechanically applying the same techniques to new games! Each game has been harder to fuzz (larger state space, implicit constraints in gameplay, etc). So they keep inventing new techniques.

__s 5 days ago | parent | prev | next [-]

I would expect some route optimization: there are spots where it bomb-hops around a corridor before proceeding. It seems like it could see that running straight through would reach the same game state sooner.

But I'm probably viewing this from TAS perspective instead of fuzzer perspective

wwilson 5 days ago | parent [-]

The longer you run it, the cleaner the run gets. But Metroid is a very compute-intensive game to fuzz, and we were already nearing the limits of what BigQuery could do for us with that run.

throwaway77770 5 days ago | parent | prev | next [-]

For some reason, the embedded videos seem to break in Firefox Private Browsing (128esr). This had me stumped for a while until I tried it in a normal not-private window and it worked.

qrush 5 days ago | parent | next [-]

Curious - what OS? (I work at Wistia!)

TapamN 4 days ago | parent | next [-]

I'm using Firefox 139.0.4 canonical-002 Snap on Xubuntu, and the videos don't play for me. Even when not using private browsing, even when I disable uBlock Origin, even when I disable Privacy Badger (and, of course, I've set NoScript to enable JS for the tab.)

fleebee 4 days ago | parent [-]

Do you have tracking protection on "Strict"? The player only started working for me after changing it to "Standard".

TapamN 3 days ago | parent [-]

Oh, yeah, I am. I forgot about that setting. Switching to Standard allows the videos to work for me, while Privacy Badger and uBlock Origin are enabled.

throwaway77770 4 days ago | parent | prev [-]

Linux - Debian 12 to be precise.

fcubed 4 days ago | parent | prev | next [-]

this is amazing! is just 'simple' fuzzing (out)performing things like (deep)RL-agents?

wwilson 4 days ago | parent [-]

It’s not quite a fair comparison, since an RL agent is trying to learn a policy that wins fair and square, while a fuzzer is able to take back moves. But if you’re working in a domain (like anything that can be simulated) where “time travel” is possible, you’d have to be crazy not to use it!
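
To make "taking back moves" concrete, here is a minimal sketch, and only a sketch: it is not how the Antithesis platform works internally. The 1-D level, the pits, `GOAL`, `step` and `fuzz_with_snapshots` are all hypothetical. The fuzzer keeps a snapshot of every novel state it has seen, resumes from a random snapshot, tries a random input, and simply throws away any attempt that dies.

```python
import random

GOAL = 30  # hypothetical goal position in a toy 1-D level

def step(state, action):
    """Toy deterministic 'emulator'. state = (position, alive)."""
    pos, alive = state
    if not alive:
        return state
    pos = max(0, pos + action)  # action is -1, +1 or +2 (a "jump")
    if pos % 7 == 6:            # pits at 6, 13, 20, 27: landing in one is fatal
        return (pos, False)
    return (pos, True)

def fuzz_with_snapshots(seed=0, budget=10_000):
    """Random exploration that can resume from any previously saved state."""
    rng = random.Random(seed)
    start = (0, True)
    # Each novel state maps to the input sequence that first reached it.
    snapshots = {start: []}
    for _ in range(budget):
        state, inputs = rng.choice(list(snapshots.items()))  # "time travel"
        action = rng.choice([-1, 1, 2])
        new_state = step(state, action)
        if not new_state[1]:
            continue  # died: discard the attempt, the snapshot is untouched
        if new_state not in snapshots:
            snapshots[new_state] = inputs + [action]
            if new_state[0] >= GOAL:
                return snapshots[new_state]
    return None

if __name__ == "__main__":
    print(fuzz_with_snapshots())
```

An RL agent playing the same toy level from the start each time would have to survive every pit in a single attempt, whereas here a death costs only one discarded step.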

jonny_eh 5 days ago | parent | prev [-]

What’s antithesis? Consider that every blog article you write may be the reader’s first exposure to your company/project.

tyleo 5 days ago | parent [-]

I thought the same thing. They are quite verbose in explaining themselves but I found their docs to be useful.

https://antithesis.com/docs/

fuckaj 4 days ago | parent [-]

Property based fuzz testing in the cloud? (As an approximation?)