Animats 4 days ago

> Bindless is pretty much _the_ most important feature we need in WebGPU. Other stuff can be worked around to varying degrees of success, but lack of bindless makes our state changes extremely frequent, which heavily kills performance with how expensive WebGPU makes changing state.

Yes.

This has had a devastating effect on Rust 3D graphics. The main crate for doing 3D graphics in Rust is WGPU. WGPU supports not just WebGPU but also Vulkan, Metal, DirectX 12, and OpenGL, on platforms including Android. It makes them all look much like Vulkan. Bevy, Rend3, and Renderling, the next level up, all use WGPU. It's so convenient.

WGPU has lowest common denominator support. If WebGPU can't do something inside a browser, then WGPU probably can't do it on other platforms which could handle it. So WGPU makes your gamer PC perform like a browser or a phone. No bindless, no multiple queues, and somewhat inefficient binding and allocation.
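Concretely, the bindful path forces a pattern like this in the hot loop (a minimal sketch with hypothetical scene types, not any particular engine's code):

    // Sketch of the bindful render loop (hypothetical scene layout).
    // Every material needs its own bind group, so the hot loop re-binds
    // and re-validates state on every draw; bindless would replace this
    // with one large descriptor table indexed on the GPU.
    fn draw_scene<'a>(
        pass: &mut wgpu::RenderPass<'a>,
        scene: &'a [(wgpu::BindGroup, std::ops::Range<u32>)],
    ) {
        for (material_binds, indices) in scene {
            pass.set_bind_group(1, material_binds, &[]);
            pass.draw_indexed(indices.clone(), 0, 0..1);
        }
    }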

This is one reason we don't see high-performance games written in Rust.

After four years of development, WGPU performance has gone down, not up. When it dropped 21% recently and I pointed that out, some people were very annoyed.[1]

Google pushing bindless forward might help get this unstuck. Although notice that the target date on their whiteboard is December 2026. I'm not sure that game dev in Rust has that much runway left. Three major projects have been cancelled and the main site for Rust game dev stopped updating in June 2024.[2]

[1] https://github.com/gfx-rs/wgpu/issues/6434

[2] https://gamedev.rs/

jms55 4 days ago | parent | next [-]

> This is one reason we don't see high-performance games written in Rust.

Rendering is _hard_, and Rust is an uncommon toolchain in the gamedev industry. I don't think wgpu has much to do with it. Vulkan via ash and DirectX12 via windows-rs are both great options in Rust.

> After four years of development, WGPU performance has gone down, not up. When it dropped 21% recently and I pointed that out, some people were very annoyed.[1]

Performance isn't the top priority for most of the wgpu maintainers (the ones paid by Mozilla) at the moment. Fixing bugs and implementing missing features so that they can ship WebGPU support in Firefox is more important. The other maintainers are volunteers with no obligation beyond finding the work enjoyable. Performance can always be improved later, but getting working WebGPU support to users so that websites can start targeting it is crucial. The annoyance is that you were rude about it.

> Google pushing bindless forward might help get this unstuck. Although notice that the target date on their whiteboard is December 2026.

The bindless stuff is basically "developers requested it a ton when we asked for feedback on features they wanted (I was one of those people who gave them feedback), and we had some draft proposals from (iirc) 1-2 different people". It's wanted, but there are still major questions to answer. It's not like this is a set thing they've been developing and are preparing to release. All the features listed are just feedback from users and discussion that took place at the WebGPU face to face recently.

jblandy 4 days ago | parent | next [-]

WGPU dev here. I agree with everything JMS55 says here, but I want to avoid a potential misunderstanding. Performance is definitely a priority for WGPU, the open source project. Much of WGPU's audience is very concerned with performance.

My team at Mozilla are active contributors to WGPU. For the moment, when we Mozilla engineers are prioritizing our own work, we are focused on compatibility and safety, because that's what we need most urgently for our use case. Once we have shipped WebGPU in Firefox, we will start putting our efforts into other things like performance, developer experience, and so on.

But WGPU has other contributors with other priorities. For example, WGPU just merged some additions to its nascent ray tracing support. That's not a Mozilla priority, but WGPU took the PR. Similarly for some recent extensions to 64-bit atomics (which I think are used by Bevy for Nanite-like techniques?), and other areas.

WGPU is an open source project. We at Mozilla contribute to the features we need; other people contribute to what they care about; and the overall direction of the project is determined by what capable contributors put in the time to make happen.

jms55 4 days ago | parent [-]

> But WGPU has other contributors with other priorities. For example, WGPU just merged some additions to its nascent ray tracing support. That's not a Mozilla priority, but WGPU took the PR. Similarly for some recent extensions to 64-bit atomics (which I think are used by Bevy for Nanite-like techniques?), and other areas.

Yep! The 64-bit atomic stuff let me implement software rasterization for our Nanite-like renderer - it was a huge win. Same for raytracing, I'm using it to develop a RT DI/GI solution for Bevy. Both were really exciting additions.
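For the curious, the core of the software-raster trick is packing depth into the high bits of a 64-bit word so that a single atomicMax keeps the nearest fragment and its payload together. A minimal sketch of the idea, shown with a CPU atomic for clarity (the real thing is a WGSL atomicMax behind wgpu's native 64-bit-atomic features; this is not Bevy's actual code):

    use std::sync::atomic::{AtomicU64, Ordering};

    // Depth in the high 32 bits (encoded so that nearer is larger, e.g.
    // reversed-Z): one atomic max keeps the nearest fragment together
    // with its payload (e.g. a triangle id), with no separate
    // depth-buffer race. On the GPU this is `atomicMax` on a u64.
    fn write_pixel(pixel: &AtomicU64, depth_bits: u32, triangle_id: u32) {
        let packed = ((depth_bits as u64) << 32) | triangle_id as u64;
        pixel.fetch_max(packed, Ordering::Relaxed);
    }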

The question of how performant and featureful wgpu is comes down mostly to resources, in my view. Like with Bevy, it's up to contributors. The unfortunate reality is that if I'm busy working on Bevy, I don't have any time for wgpu. So I'm thankful for the people who _do_ put time into wgpu, so that I can continue to improve Bevy.

Animats 3 days ago | parent | prev | next [-]

> Rendering is _hard_, and Rust is an uncommon toolchain in the gamedev industry. I don't think wgpu has much to do with it. Vulkan via ash and DirectX12 via windows-rs are both great options in Rust.

Yes. I think I'm beginning to see what's gone wrong with the Rust crates. It's an architectural problem. Vulkano and WGPU try to create a Rust safety perimeter at an API that's basically a wrapper around Vulkan. This may be the wrong boundary for that safety perimeter.

Moving buffer allocation inside the safety perimeter may eliminate a level of locking and checking. Bindless really brings this out, because somebody has to keep the descriptor table and buffer allocation in sync. The GPU depends on that. So that has safety implications.
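Roughly, the invariant that has to live inside that perimeter looks like this (a hypothetical sketch, not any crate's actual API):

    // Hypothetical sketch: a bindless descriptor table where slot reuse
    // must be fenced against in-flight GPU work, or a shader can reach
    // a freed buffer through a stale descriptor.
    struct BindlessTable {
        free_slots: Vec<u32>,
        bound: Vec<Option<wgpu::Buffer>>, // slot -> live resource
    }

    impl BindlessTable {
        // Only safe once no submitted command buffer can still read
        // this slot; that fence is exactly where the safety perimeter
        // and the allocator have to agree.
        fn release(&mut self, slot: u32) {
            self.bound[slot as usize] = None;
            self.free_slots.push(slot);
        }
    }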

If this problem is partitioned differently, the locking problems for concurrent GPU content updating may become simpler. Right now, both Vulkano and WGPU force more serialization than Vulkan itself requires. The rendering thread is too often stalled on a lock, waiting for some content-updating operation that should not interfere with rendering.

Too much detail for this forum. I'll continue this elsewhere. This has been useful.

pjmlp 2 days ago | parent | next [-]

Back in the day I made a similar error, wrapping C graphics libraries directly 1:1 in improved C++ bindings, until I realised it was more ergonomic to think in higher-level C++ abstractions and expose those concepts instead, fully hiding the underlying unsafe C APIs.

ladyanita22 2 days ago | parent | prev [-]

I'm interested in reading more. Where will you continue this?

kookamamie 4 days ago | parent | prev [-]

> implementing missing features so that they can ship WebGPU support in Firefox

Sounds like WGPU, the project, should be detached from Firefox?

To me the priority of shipping WGPU on FF is kind of mind-boggling, as I consider the browser irrelevant at this point in time.

brookman64k 4 days ago | parent | next [-]

Just to avoid potential confusion: WebGPU and WGPU are different things.

slimsag 4 days ago | parent [-]

(a) WebGPU -> Specification or browser API

(b) WGPU, gfx-rs/wgpu, wgpu.rs -> Rust crate implementing WebGPU

(c) wgpu -> the prefix used by the C API for all WebGPU native implementations.

(d) 'wgpu' -> a cute shorthand used by everyone to describe either (a), (b), or (c), causing confusion.

littlestymaar 4 days ago | parent | prev | next [-]

The irrelevant browser is the one paying developers to build wgpu though…

pjmlp 4 days ago | parent | prev | next [-]

Outside of the browser the answer is middleware.

WGPU has to decide: either stay compatible with WebGPU, and thus be constrained by the design of a Web 3D API, or embrace native code and diverge from WebGPU.

slimsag 4 days ago | parent [-]

This is the right answer^

But even more, the level at which WebGPU sits (not too high level, not too low level) means that if a native graphics abstraction sticks with WebGPU's API design and only 'extends' it, you actually end up with three totally different ways to use the API:

* The one with your native 'extensions' -> your app will only run natively and never in the browser unless you implement two different WebGPU rendering backends. Also won't run on Chromebook-type devices that only have GLES-era hardware.

* The WebGPU browser API -> your app will run in the browser, but not on GLES-era hardware. Perish in the verbosity of not having bindless support.

* The new 'compatibility' mode in WebGPU -> your app runs everywhere, but perish in the verbosity of not having bindless, and suffer without reversed-Z depth buffers because the underlying API doesn't support them.

And if you want your app to run in all three as best as possible, you need to write three different webgpu backends for your app, effectively, as if they are different APIs and shading languages.

adrian17 3 days ago | parent [-]

> The WebGPU browser API -> your app will run in the browser, but not on GLES-era hardware. Perish in the verbosity of not having bindless support.

Note, regarding "GLES-era": WGPU does have a GLES/WebGL2 backend; missing WebGL1 is unfortunate, but at least it covers most recent browsers/hardware that happens to not have WebGPU supported yet.

(and there's necessarily some added overhead from having to adapt to a GLES-style API; it's especially silly if you consider that the browser might then convert the API calls and shaders _again_ to D3D11 via ANGLE)

slimsag 3 days ago | parent [-]

I am referring primarily to the fact that a restricted subset of WebGPU is needed ('compatibility mode') to support D3D11/GLES-era hardware.[0]

[0] https://github.com/gpuweb/gpuweb/issues/4266

flohofwoe 3 days ago | parent [-]

There's a *massive* difference in capabilities between GLES3.0 (i.e. WebGL2) and D3D11 though (GLES3.0 is more like 'late D3D9 era') ;)

And interestingly, WebGL2 in Chrome on Windows (which runs on top of D3D11) handily beats WebGPU in some of my tests (with setBindGroup being the bottleneck).

adastra22 4 days ago | parent | prev [-]

Is WGPU even a Mozilla project? I think he is just saying that those paid developers (paid by Mozilla) have that priority, and everyone else is volunteer. Not that WGPU is a Firefox project.

kookamamie 4 days ago | parent | next [-]

Thanks, I checked the WGPU project's roots and you're right - it's not Mozilla's project, per se.

jblandy 4 days ago | parent | prev [-]

Yes, this.

jblandy 4 days ago | parent | prev | next [-]

There have been a bunch of significant improvements to WGPU's performance over the last few years.

* Before the major rework called "arcanization", `wgpu_core` used a locking design that caused huge amounts of contention in any multi-threaded program. It took write locks so often I doubt you could get much parallelism at all out of it. That's all been ripped out, and we've been evolving steadily towards a more limited and reasonable locking discipline.

* `wgpu_core` used to have a complex system of "suspected resources" and deferred cleanup, apparently to try to reduce the amount of work that needed to be done when a command buffer finished executing on the GPU. This turned out not to actually save any work at all: it did exactly the same amount of bookkeeping, just at a different time. We ripped out this complexity and got big speedups on some test cases.

* `wgpu_core` used to use Rust generics to generate, essentially, a separate copy of its entire code for each backend (Vulkan, Metal, D3D12) that it used. The idea was that the code generator would be able to see exactly what backend types and functions `wgpu_core` was using, inline stuff, optimize, etc. It also put our build times through the roof. So, to see if we could do something about the build times, Wumpf experimented with making the `wgpu_hal` API use dynamic dispatch instead. For reasons that are not clear to me, switching from generics to dynamic dispatch made WGPU faster --- substantially so on some benchmarks.
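The shape of that last change, very roughly (illustrative made-up types, not wgpu's actual ones):

    // Illustrative sketch of the generics-to-dynamic-dispatch switch;
    // the names here are invented, not wgpu's real types.
    trait Api {
        type Device;
    }
    trait DynDevice {}

    // Before: the whole core was monomorphized once per backend
    // (Vulkan, Metal, D3D12), multiplying code size and build time.
    struct CoreDeviceGeneric<A: Api> {
        raw: A::Device,
    }

    // After: one copy of the core, calling backends through a vtable.
    struct CoreDeviceDyn {
        raw: Box<dyn DynDevice>,
    }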

Animats posts frequently about performance problems they're running into, but when they do, it's always a huge pile of unanalyzed data. It's almost as if they run into a performance problem with their code, and then, rather than figuring out what's going on themselves, they throw their whole app over the wall and ask WGPU to debug the problem. That is just not a service we offer.

ossobuco 3 days ago | parent | next [-]

He's reporting a 23% drop in performance and seems to have invested quite some time in pinning down what's causing it, plus he's provided a repro repository with benchmarks.

I honestly don't get your annoyed response; any OSS project wishes they had such detailed bug reports, and such a performance regression would concern me very much if it happened in a project I maintain.

Animats 4 days ago | parent | prev | next [-]

This is in reference to [1].

[1] https://github.com/gfx-rs/wgpu/issues/6434

jillyboel 3 days ago | parent | prev [-]

What? They even provided a benchmarking tool. You should be ecstatic at users providing such detailed reports. Most projects just attract reports that go like "its slow, fix it!!111"

diggan 3 days ago | parent | prev | next [-]

> When it dropped 21% recently and I pointed that out, some people were very annoyed.[1]

Someone was seemingly "annoyed" by an impatient end-user asking for a status update ("It's now next week. Waiting.") and nothing more. They didn't seem annoyed that you pointed out a performance issue; instead they explained that their current focus is elsewhere.

adastra22 4 days ago | parent | prev | next [-]

Tbh I was annoyed reading it too as an open source developer. The people you are talking to are volunteering their time, and you weren’t very considerate of that. Open source software isn’t the same support structure as paid software. You don’t file tickets and expect them to be promptly fixed, unless you do the legwork yourself.

flohofwoe 4 days ago | parent | prev | next [-]

Tbf, tons of games have been created and are still being created without bindless resource binding. While WebGPU does have some surprising performance bottlenecks around setBindGroup(), details like that hardly make or break a game. If the devs are somewhat competent they'll come up with ways to work around 3D API limitations - that's how it's always been and always will be. The old batching tricks from the D3D9 era still sometimes make sense; I wonder if people simply forgot about those, or don't know them in the first place because it was before their time.
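For those who weren't around back then: the basic batching move is to sort draws by the state they need, then pay for one state change per run instead of one per object. A minimal sketch with hypothetical types:

    // Classic batching sketch (hypothetical draw-list layout): sorting
    // by material turns N bind-group changes into one per material run.
    fn draw_batched<'a>(
        pass: &mut wgpu::RenderPass<'a>,
        draws: &mut [(u32, &'a wgpu::BindGroup, std::ops::Range<u32>)],
    ) {
        draws.sort_by_key(|(material_id, _, _)| *material_id);
        let mut bound: Option<u32> = None;
        for (material_id, binds, indices) in draws.iter() {
            if bound != Some(*material_id) {
                pass.set_bind_group(1, *binds, &[]);
                bound = Some(*material_id);
            }
            pass.draw_indexed(indices.clone(), 0, 0..1);
        }
    }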

MindSpunk 3 days ago | parent [-]

Nobody forgot about batching. It's a foundational strategy in any efficient realtime renderer. The bar has simply moved, and even the cheaper binding logic you get from Vulkan or D3D12 is getting too expensive for the object counts we're trying to push in modern games.

Bindless lets you reduce the amount of bookkeeping you have to do per-object on the CPU, but much more importantly it opens the door to GPU-driven rendering.

The problem with WebGPU is that there's no bindless, and the 'bindful' path is quite expensive because it has to meet the safety requirements of a browser API. There's no way around the slow path, and the slow path is quite slow. In this case the workaround is to cut features, because the API simply imposes too much overhead.
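Concretely, bindless means binding one big descriptor array once and letting per-draw data select the resource on the GPU. Through wgpu's native extensions (not the browser API) the layout side looks roughly like this (a sketch, assuming the TEXTURE_BINDING_ARRAY feature is enabled on the device):

    // Sketch of a bindless-style layout using wgpu's native
    // binding-array extension (not available through the browser API).
    use std::num::NonZeroU32;

    fn bindless_layout(device: &wgpu::Device) -> wgpu::BindGroupLayout {
        device.create_bind_group_layout(&wgpu::BindGroupLayoutDescriptor {
            label: Some("all scene textures, bound once"),
            entries: &[wgpu::BindGroupLayoutEntry {
                binding: 0,
                visibility: wgpu::ShaderStages::FRAGMENT,
                ty: wgpu::BindingType::Texture {
                    sample_type: wgpu::TextureSampleType::Float { filterable: true },
                    view_dimension: wgpu::TextureViewDimension::D2,
                    multisampled: false,
                },
                // The array count is what makes this "bindless": the
                // shader indexes into a `binding_array` with per-instance
                // data instead of the CPU rebinding between draws.
                count: NonZeroU32::new(1024),
            }],
        })
    }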

flohofwoe 3 days ago | parent [-]

BindGroups being a hard-to-fix design wart is true indeed (which I have been complaining about pretty much from the beginning, not because of the performance problems - which surprised me too - but because of their inflexibility compared to a traditional bind-slot-based model like in Metal 1 or D3D11).

But I would prefer to first bring the performance of the slot-based binding model to a point where it is similar to D3D11 or Metal, instead of ignoring that part of the API and 'skipping ahead' to bindless (which will probably have to sit behind an extension anyway). Otherwise WebGPU will become a cemetery of abandoned attempts, like OpenGL.

raincole 4 days ago | parent | prev | next [-]

As far as I know, Unity doesn't support bindless either. However, thousands of Unity games are released on Steam every year. So it's safe to say performance isn't the main (or even a major) reason why Rust gamedev isn't getting much traction.

pjmlp 2 days ago | parent | next [-]

The lack of traction is mostly because Rust game development, with the exception of Bevy's efforts, is still pretty much in the dark ages of everything-is-code.

The industry has moved beyond that, with teams where programmers play only a minor role (quite important nonetheless) in the overall game design, and with plenty of tooling for designers and other non-programmer folks to do their tasks.

Eventually, with more graphical tooling or scripting systems, it will start to gain more steam.

Note that Tiny Glade also created most of their tooling in-house; they only partially depend on Bevy.

Animats 4 days ago | parent | prev [-]

That limits Unity's scene size. See [1].

[1] https://discussions.unity.com/t/gpu-bindless-resources-suppo...

0x457 3 days ago | parent [-]

Yes, it introduces limits (duh), but doesn't change the fact that there are plenty of Unity3d games that not only sold well, but also perform well.

z3phyr 3 days ago | parent | prev | next [-]

Another reason is that exploratory programming is hard by design in Rust. Rust is great if you already have a spec and know what needs to be done.

Most of gamedev, in my opinion, is extremely exploratory and demands constant experimentation with design. C/C++ offers fluidity, a very good and mature debugging toolchain, a solid performance ceiling, and support from other people.

It will be really hard to replace C++ in performance/simulation contexts. Security takes a backseat there.

efnx 4 days ago | parent | prev | next [-]

Author of Renderling here. Thanks for the shout out Animats!

Bindless is a game changer - pun intended. It can’t happen soon enough.

Just curious, what are the three major projects that were cancelled?

I also want to mention that folks are shipping high-performance games in Rust - the first title that comes to mind is "Tiny Glade", which is breathtakingly gorgeous, though it is a casual game. It does not run on wgpu though, to my knowledge. I may have a different definition of high performance, with lower expectations.

Animats 4 days ago | parent [-]

> What are the three major projects that were cancelled?

Here are some:

- LogLog Games [1]. Not happy with Bevy. Not too unhappy about performance, although it's mentioned.

- Moonlight Coffee [2]. Not a major project, but he got as far as loading glTF and displaying the results, then quit. That's a common place to give up.

- Hexops. [3] Found Rust "too hard", switched to Zig.

Tiny Glade is very well done. But, of course, it's a tiny glade. This avoids the scaling problems.

[1] https://news.ycombinator.com/item?id=40172033

[2] https://www.gamedev.net/blogs/entry/2294178-abandoning-the-r...

[3] https://devlog.hexops.com/2021/increasing-my-contribution-to...

slimsag 4 days ago | parent | next [-]

It's crazy you've cited Hexops as an example:

1. It's a game studio not a project (CEO here :))

2. It's very much still alive and well today, not 'cancelled'

3. We never even used WebGPU in Rust, this was before WebGPU was really a thing.

It is true that we looked elsewhere for a language with different tradeoffs that suited us better, and have since fully embraced Zig. It's also true that we were big proponents of WebGPU early on, and have in recent years abandoned WebGPU in favor of something which is better for graphics outside the browser (that's its own worthwhile story).

But we've never played /any/ role in the Rust gamedev ecosystem, really.

z3phyr 3 days ago | parent [-]

I think the future is getting rid of all the APIs and driver overhead: compiling directly to GPU compute and writing your own software renderers in a language targeting GPUs (could be Zig).

LegNeato an hour ago | parent | next [-]

Check out https://renderling.xyz/

jblandy 2 days ago | parent | prev [-]

A better way to think about the problem is to recognize that the APIs and drivers are providing various services that pretty much every user is going to need, and which you will now need to reimplement yourself.

Nobody needs all of Vulkan, but everyone needs quite a bit of it. Buffer allocation? Command encoding? Scheduling? Synchronization? Abstracting GPU architecture differences (and GPUs vary a lot)? Render pipeline fixed-function stages like primitive assembly, tiling, and blending? You're signing up to implement all of that - good luck!

In this view, your idea is the assertion, "I could do a better job at all that stuff than the driver developers." Maybe so! They're only human. Drivers do have bugs. But you're only human too.

z3phyr 2 days ago | parent [-]

I agree; however, engine developers are already dedicated to building the massive behemoth of software that is a game engine, and they constantly collaborate with driver devs, essentially sharing much of the same skill set.

Also, the onus is actually on the GPU manufacturers (not game engine devs) to simplify the programmability of GPUs to the level we have for CPUs (we do not write microcode for CPUs either, yet their programmability is much, much simpler thanks to good compiler toolchains). This will massively help non-game-engine developers who need GPUs for other kinds of compute.

littlestymaar 4 days ago | parent | prev | next [-]

None of those are "major projects" by any definition of the term, though. And none of the three has anything to do with wgpu's performance.

Rust for game engines has always been a highly risky endeavor, since the ecosystem is much less mature than everything else, and even though things have improved a ton over the past few years, it's still light-years away from the mainstream tools.

Building a complete game ecosystem is very hard and it's not surprising to see that Rust is still struggling.

adastra22 4 days ago | parent | prev [-]

Tiny Glade isn't tiny on the rendering side. It does gorgeous, detailed landscapes.

pjmlp 4 days ago | parent [-]

Indeed, they also do most of the stuff custom.

ladyanita22 4 days ago | parent | prev | next [-]

I thought WGPU only supported WebGPU, and then there were translation libraries (akin to Proton) to run WebGPU over Vulkan.

Does it directly, internally, support Vulkan instead of on-the-fly translation from WebGPU to VK?

flohofwoe 4 days ago | parent [-]

WGPU (https://wgpu.rs/) is one of currently three implementations of the WebGPU specification (the other two being Google's Dawn library used in Chrome, and the implementation in WebKit used in Safari).

The main purpose of WebGPU is to specify a 3D API over the common subset of Metal/D3D12/Vulkan features (i.e. doing an 'on-the-fly translation' of WebGPU API calls to Metal/D3D12/Vulkan API calls, very similar to how (a part of) Proton does an on-the-fly translation of the various D3D API versions to Vulkan).

ladyanita22 4 days ago | parent | next [-]

You're describing the WebGPU spec and its different implementations.

OP claimed WGPU had native support for VK, DX and others. But as far as I know, WGPU just supports WebGPU being translated on the fly to those other backends, with the obvious performance hit. If I'm wrong, I'd be interested to know, as this would make WGPU a more interesting choice for many if, in reality, the code was native instead of translation.

Edit: https://docs.rs/wgpu/latest/wgpu/#backends it seems they indeed support native code in almost every backend?

flohofwoe 4 days ago | parent [-]

I don't understand the question...

Those three WebGPU implementation libraries are compiled to native code (they are written in Rust or C/C++), and at least WGPU and Dawn are usable as native libraries outside the browser in regular native apps that want to use WebGPU as a cross-platform 3D API.

Yet still, those native libraries do a runtime translation of WebGPU API calls to DX/Vk/Metal API calls (and also a runtime translation of either WGSL or SPIRV to the respective 3D backend API shading languages) - in that sense, quite similar to what Proton does, just for a different 'frontend API'.

ladyanita22 2 days ago | parent [-]

Then WGPU's performance will always be problematic: below that of the native APIs (DX, Vulkan, and Metal), and constrained within the limits of the WebGPU spec.

pjmlp 4 days ago | parent | prev [-]

Within the constraints of the browser sandbox and 10-year-old hardware, which is when its design started.

flohofwoe 4 days ago | parent [-]

I think it's more about low-end mobile GPUs which WebGPU needs to support too (which is also the main reason why Vulkan is such a mess). The feature gap between the low- and high-end is bigger than ever before and will most likely continue to grow.

pjmlp 4 days ago | parent [-]

I have yet to see anyone deliver in WebGL something at the level of Infinity Blade, which Apple used to demo the OpenGL ES 3.0 capabilities of their 2011 iPhone model, a mobile-phone GPU from almost 15 years ago.

Unless we are talking about cool shadertoy examples.

flohofwoe 4 days ago | parent | next [-]

That's more a business problem than a technical problem. Web games are in a local maximum of minimal production cost (via 2D assets) versus maximized profits (via free-2-play), and as long as this works well there won't be an Infinity Blade because it's too expensive to produce.

pjmlp 4 days ago | parent [-]

Yeah, but then what do we want this technology for, besides visualisations and shadertoy demos?

Streaming solves the business case, with native APIs using server-side rendering.

flohofwoe 3 days ago | parent [-]

At least it breaks up a chicken-egg problem, and the most interesting use cases are the ones that nobody was expecting anyway.

> Streaming solves the business case, with native APIs using server side rendering.

And yet history is littered with the dead husks of game streaming services ;)

pjmlp 3 days ago | parent [-]

The chicken-and-egg problem caused by killing Flash games? Nowadays no one cares about the browser, because everyone doing Flash moved to mobile phones or Steam, with better APIs.

Game Pass and GeForce Now are doing alright.

Stadia failed because Google doesn't get the games industry.

miloignis 4 days ago | parent | prev | next [-]

Tiny Glade? https://store.steampowered.com/app/2198150/Tiny_Glade/

ladyanita22 6 hours ago | parent | next [-]

Tiny Glade is Vulkan, not WebGPU. Not sure which library, though.

pjmlp 4 days ago | parent | prev [-]

Where is the URL for the game on the browser?

astlouis44 4 days ago | parent | prev [-]

Try this Unreal Engine 5 WebGPU demo: https://play.spacelancers.com/

pjmlp 3 days ago | parent [-]

Demo, not game.

Shadertoy is full of impressive demos.

klysm 4 days ago | parent | prev [-]

The tone of the thread was perfectly fine until you made a passive-aggressive comment.