Remix.run Logo
tanelpoder 5 days ago

Yup, for best results you wouldn't just dump your existing pointer-chasing and linked-list data structures to CXL (like the Optane's transparent mode was, whatever it was called).

But CXL-backed memory can use your CPU caches as usual and the PCIe 5.0 lane throughput is still good, assuming that the CXL controller/DRAM side doesn't become a bottleneck. So you could design your engines and data structures to account for these tradeoffs. Like fetching/scanning columnar data structures, prefetching to hide latency etc. You probably don't want to have global shared locks and frequent atomic operations on CXL-backed shared memory (once that becomes possible in theory with CXL3.0).

Edit: I'll plug my own article here - if you've wondered whether there were actual large-scale commercial products that used Intel's Optane as intended then Oracle database took good advantage of it (both the Exadata and plain database engines). One use was to have low latency durable (local) commits on Optane:

https://tanelpoder.com/posts/testing-oracles-use-of-optane-p...

VMware supports it as well, but using it as a simpler layer for tiered memory.

packetlost 5 days ago | parent | next [-]

> You probably don't want to have global shared locks and frequent atomic operations on CXL-backed shared memory (once that becomes possible in theory with CXL3.0).

I'd bet contested locks spend more time in cache than most other lines of memory so in practice a global lock might not be too bad.

tanelpoder 5 days ago | parent [-]

Yep agreed, for single-host with CXL scenarios. I wrote this comment thinking about a hypothetical future CXL3.x+ scenario with multi-host fabric coherence where one could in theory put locks and control structures that protect shared access to CXL memory pools into the same shared CXL memory (so, no need for coordination over regular network at least).

samus 4 days ago | parent | prev [-]

DBMSs have been managing storage with different access times for decades and it should be pretty easy to adapt an existing engine. Or you could use it as a gigantic swap space. No clue whether additional kernel patches would be required for that.