antiterra | 4 hours ago
I'm not a hardware expert, but this strikes me as inaccurate, though actual performance is scenario-dependent. The Jetson line is targeted at low-power robotics. The Jetson Orin is currently marketed as a prototyping platform, and I believe it does not generally challenge recent Apple Silicon for inference performance, even considering prefill.

In the latest Blackwell-based Jetson Thor, the key advantage over Apple Silicon is its capable FP4 tensor cores, which do indeed help with prefill. However, it also has roughly half the memory bandwidth of an M4 Max, which puts a big bottleneck on token generation with large context. If your use case did some kind of RAG lookup with very short responses, you might come out ahead using an optimized model, but for straightforward inference you are likely to lag behind Apple Silicon.

At this stage, professional inference solutions ideally use discrete GPUs that are far more capable than either, but those are in a different class of monetary expense.
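To see why the bandwidth gap dominates decode, here's a back-of-the-envelope sketch. The model size and the "half bandwidth" device are illustrative assumptions (the 546 GB/s figure is the top M4 Max configuration); this is a roofline-style approximation, not a benchmark.

```python
# Rough estimate of memory-bandwidth-bound token generation speed.
# During decode, each token requires streaming roughly the full set of
# model weights (plus KV cache, ignored here) from memory once, so:
#   tokens/sec ≈ memory bandwidth / bytes read per token

def decode_tokens_per_sec(bandwidth_gb_s: float, model_size_gb: float) -> float:
    """Upper bound on decode throughput when bandwidth-bound."""
    return bandwidth_gb_s / model_size_gb

# Hypothetical workload: a ~70B-parameter model at 4-bit quantization (~35 GB).
model_gb = 35.0
full_bw = decode_tokens_per_sec(546.0, model_gb)  # M4 Max (top config): ~546 GB/s
half_bw = decode_tokens_per_sec(273.0, model_gb)  # a device with half that bandwidth

print(f"~{full_bw:.1f} tok/s vs ~{half_bw:.1f} tok/s")  # ~15.6 vs ~7.8
```

Prefill is compute-bound rather than bandwidth-bound, which is why FP4 tensor cores help there but don't close the decode gap.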
aegis_camera | 4 hours ago | parent
You clearly have a deep understanding of the AI hardware landscape. Thanks for your analysis.