timschmidt a day ago

And? Software is getting more sophisticated and capable too. The first time I switched an iter to a par_iter in Rust and saw the loop spawn as many threads as I have logical cores, it felt like magic. Writing multi-threaded code used to be challenging.

pjmlp a day ago | parent [-]

Now make that multi-threaded code exhaust a 32 core desktop system, all the time, not only at peak execution.

As brownie points, keep the GPU busy as well, beyond twirling its fingers while keeping the GUI desktop going.

Even more points if the CPU happens to have a NPU or integrated FPGA, and you manage to also keep them going alongside those 32 cores, and GPU.

timschmidt a day ago | parent [-]

> Now make that multi-threaded code exhaust a 32 core desktop system

Switching an iter to par_iter does this. So long as there are enough iterations to work through, it'll exhaust 1024 cores or more.

> all the time, not only at peak execution.

What are you doing that keeps a desktop or phone at 100% utilization? That kind of workload exists in datacenters, but end user devices are inherently bursty. Idle when not in use, race to idle while in use.

> As brownie points, keep the GPU busy as well... Even more points if the CPU happens to have a NPU or integrated FPGA

In a recent project, I serve a WASM binary from an ESP32 over Wi-Fi/HTTP; it uses the GPU via WebGL to draw the GUI, performs CSG, calculates toolpaths, and drip-feeds motion-control commands back to the ESP32. This took about 12k lines of Rust, including the multithreaded CAD library I wrote for the project, only a couple hundred lines of which are gated behind the "parallel" feature flag. It was far less work than the inferior C++ version I wrote as part of the RepRap project 20 years ago. Hence my stance that software has become increasingly sophisticated.

https://github.com/timschmidt/alumina-firmware

https://github.com/timschmidt/alumina-ui

https://github.com/timschmidt/csgrs
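A hypothetical sketch of the feature-gating pattern described above: the parallel path compiles only when the "parallel" Cargo feature is enabled, so the sequential build carries no rayon dependency. The names here (Triangle, total_area) are illustrative, not taken from csgrs:

```rust
// Sketch of gating a parallel code path behind a Cargo feature.
// Cargo.toml (assumed):
//   [features]
//   parallel = ["dep:rayon"]
//   [dependencies]
//   rayon = { version = "1", optional = true }

#[derive(Clone, Copy)]
struct Triangle {
    area: f64,
}

#[cfg(feature = "parallel")]
fn total_area(tris: &[Triangle]) -> f64 {
    use rayon::prelude::*;
    tris.par_iter().map(|t| t.area).sum()
}

#[cfg(not(feature = "parallel"))]
fn total_area(tris: &[Triangle]) -> f64 {
    tris.iter().map(|t| t.area).sum()
}

fn main() {
    let tris = vec![Triangle { area: 0.5 }; 4];
    println!("total area: {}", total_area(&tris));
}
```

Both builds expose the same API, so callers never see the difference; only the couple hundred gated lines change.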

What's your point?

pjmlp a day ago | parent [-]

The point being that those are very niche cases, which still don't keep the hardware busy as it should 24h around the clock.

Most consumer software even less, which is why you will hardly see a computer at the shopping mall with more than a 16 core count; on average, most shops stock something between 4 and 8.

That is also a reason why systems with built-in FPGAs failed in the consumer market: specialised tools without consumer software to help sell them.

timschmidt a day ago | parent [-]

> don't keep the hardware busy as it should 24h around the clock.

If your workload demands 24/7 100% CPU usage, Epyc and Xeon are for you. There you can have multiple sockets with 256 or more cores each.

> Most consumer software even less

And yet, even in consumer gear, which is built to a minimum-spec budget, core counts, memory capacity, PCIe lanes, bus bandwidth, IPC, cache sizes, GPU shader counts, and NPU TOPS are all increasing year over year.

> systems with built-in FPGAs failed in the consumer market

Talk about niche. I've never met an end user with a use for an FPGA or the willingness to learn what one is. I'd say that has more to do with it. Write a killer app that regular folks want to use that requires one, and they'll become popular. Rooting for you.

pjmlp 15 hours ago | parent [-]

You have to root for those hardware designers to have software devs in quantity actually using what they produce, at scale.