Remix.run Logo
huslage 2 days ago

I don't believe the standard supports such a thing. But I wonder if TB6 will.

kmeisthax 2 days ago | parent [-]

RDMA is a networking standard, it's supposed to be switched. The reason why it's being done over Thunderbolt is that it's the only cheap/prosumer I/O standard with enough bandwidth to make this work. Like, 100Gbit Ethernet cards are several hundred dollars minimum, for two ports, and you have to deal with SFP+ cabling. Thunderbolt is just way nicer[0].

The way this capability is exposed in the OS is that the computers negotiate an Ethernet bridge on top of the TB link. I suspect they're actually exposing PCIe Ethernet NICs to each other, but I'm not sure. But either way, a "Thunderbolt router" would just be a computer with a shitton of USB-C ports (in the same way that an "Ethernet router" is just a computer with a shitton of Ethernet ports). I suspect the biggest hurdle would actually just be sourcing an SoC with a lot of switching fabric but not a lot of compute. Like, you'd need Threadripper levels of connectivity but with like, one or two actual CPU cores.

[0] Like, last time I had to swap work laptops, I just plugged a TB cable between them and did an `rsync`.

bleepblap 2 days ago | parent [-]

I think you might be swapping RDMA with RoCE - RDMA can happen entirely within a single node. For example between an NVME and a GPU.

wmf 2 days ago | parent [-]

Within a single node it's just called DMA. RDMA is DMA over a network and RoCE is RDMA over Ethernet.

bleepblap 2 days ago | parent [-]

Sorry, but it certainly isn't--

https://docs.nvidia.com/cuda/gpudirect-rdma/index.html

The "R" in RDMA means there are multiple DMA controllers who can "transparently" share address spaces. You can certainly share address spaces across nodes with RoCE or Infiniband, but thats a layer on top

wtallis 2 days ago | parent | next [-]

I don't know why that NVIDIA document is wrong, but the established term for doing DMA from eg. an NVMe SSD to a GPU within a single system without the CPU initiating the transfer is peer to peer DMA. RDMA is when your data leaves the local machine's PCIe fabric.

wmf 2 days ago | parent | prev [-]

I'm going to agree to disagree with Nvidia here.