SNC/NPS Tuning For Ryzen Threadripper 7000 Series To Further Boost Performance
Database software tended to perform the best in the default (disabled) mode.
The Graph500 HPC benchmark had some stellar improvements in SNC2 and then SNC4 modes.
PyTorch and TensorFlow didn't benefit from the NUMA topology adjustments...
But the OpenVINO AI toolkit did benefit from the Sub-NUMA Clustering controls with this HP workstation BIOS. Throughput gains for these AI benchmarks were minor, but the reduction in latency during the inference tests was dramatic.
Meanwhile for software like PETSc there was no measurable difference.
Whether adjusting the Sub-NUMA Clustering / Nodes Per Socket default is a wise idea comes down to the particular software you run on your AMD Ryzen Threadripper workstation. For NUMA-aware software this can mean some very nice performance gains, as shown in cases like OpenVINO, OpenFOAM, Graph500, LULESH, code compilation workloads, etc. Those upgrading to an AMD Ryzen Threadripper 7000 series system and wanting to see all 196 benchmarks I ran in full for this SNC2/SNC4 comparison can find the data via this result page. NPS adjustments are a common consideration in the EPYC server/HPC space, but for Threadripper processors as well this can be a very beneficial setting worth proper consideration.
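For anyone experimenting with these BIOS modes, it can help to confirm what NUMA topology Linux actually sees after a reboot. The following is a minimal sketch rather than anything used for the testing in this article: it assumes a Linux system exposing the standard sysfs directory /sys/devices/system/node, and the helper names parse_cpulist and numa_nodes are made up for illustration. Presumably the default mode shows a single node while SNC2 and SNC4 show two and four, but that depends on the BIOS.

```python
import os
import re


def parse_cpulist(text):
    """Expand a kernel cpulist string such as '0-3,8-11' into a list of CPU ids."""
    cpus = []
    for part in text.strip().split(","):
        if not part:
            continue  # an empty cpulist means a node without CPUs
        if "-" in part:
            lo, hi = part.split("-")
            cpus.extend(range(int(lo), int(hi) + 1))
        else:
            cpus.append(int(part))
    return cpus


def numa_nodes(base="/sys/devices/system/node"):
    """Return {node_id: [cpu ids]} for every nodeN directory found under base."""
    nodes = {}
    if not os.path.isdir(base):
        return nodes  # not Linux, or sysfs is not mounted
    for name in os.listdir(base):
        match = re.fullmatch(r"node(\d+)", name)
        if not match:
            continue  # skip entries such as 'possible' or 'online'
        try:
            with open(os.path.join(base, name, "cpulist")) as f:
                nodes[int(match.group(1))] = parse_cpulist(f.read())
        except OSError:
            continue
    return dict(sorted(nodes.items()))


if __name__ == "__main__":
    for node, cpus in numa_nodes().items():
        print(f"node{node}: {len(cpus)} CPUs")
```

Running the script before and after changing the BIOS setting shows whether the node count and the CPUs assigned to each node changed as expected.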
Thanks to HP for supplying the HP Z6 G5 A workstation for review on Phoronix that has made all of this Ryzen Threadripper PRO 7995WX testing possible.