| ▲ | kmaitreys 2 hours ago | |||||||
Right. I have seen Zig where one needs to specify allocators as well. I'm sorry I'm not well versed enough to know how it makes things better for HPC though? For now my plan is to write fairly similar style code as one may write in C++/Fortran through MPI bindings in Rust. | ||||||||
| ▲ | convolvatron an hour ago | parent [-] | |||||||
if you're using thread level parallelism, there is always a benefit to having a per-thread allocator so that you don't have to take global locks to get memory, they become highly contended. if you take that one step further and only use those objects on a single core, now your default model is lock-free non-shared objects. at large scale that becomes kind of mandatory. some large shared memory machines even forgo cache consistency because you really can't do it effectively at large scale anyways. but all of this is highly platform dependent, and I wouldn't get too wrapped up around it to begin with. I would encourage you though to worry first about expressing your domain semantics, with the understanding that some refactoring for performance will likely be necessary. if you have the patience and personally and within the project, it can be a lot of fun to really get in there and think about the necessary dependencies and how they can be expressed on the hardware. there's a lot of cool tricks, for example trading off redundant computation to reduce the frequency of communication. | ||||||||
| ||||||||