Extensive Benchmarks Looking At AMD Znver1 GCC 9 Performance, EPYC Compiler Tuning

Written by Michael Larabel in Software on 20 February 2019 at 11:26 AM EST. Page 4 of 6. 13 Comments.

Rather than just comparing -march=x86-64 and -march=znver1, this second part of the AMD EPYC compiler testing looks at the various optimization levels when using the GCC 9 snapshot on this AMD EPYC 7601 2P server.

GCC 9.0 Znver1 x86-64 Linux Compiler Benchmarks

The FFTW benchmark shows the common case where reaching at least -O1, or even -Og, yields much of the performance there is to gain from GCC's compiler optimizations. But it also shows that link-time optimizations can pay off, delivering an 11% increase in performance over the plain "-O3 -march=znver1" run.

The HMMer sequence analysis program, meanwhile, shows one of the cases where the (potentially unsafe) -Ofast optimization level pays off, but aside from that there is not much of a difference between -O1 and -O3.
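As a brief illustration of why -Ofast is considered potentially unsafe: in GCC, -Ofast enables -ffast-math, which among other things lets the compiler regroup floating-point arithmetic as if it were associative. The minimal C sketch below (the function names are made up for this example, and it is not taken from HMMer or any of the benchmarks) shows that the two groupings of the same three values can give different results under ordinary IEEE double arithmetic. The volatile temporaries are only there to pin down the evaluation order so the difference is visible regardless of the flags used.

```c
#include <stdio.h>

/* Hypothetical helper: evaluates (a + b) + c in exactly that order.
 * The volatile temporary stops the compiler from regrouping the sum. */
double sum_left(double a, double b, double c)
{
    volatile double t = a + b;
    return t + c;
}

/* Hypothetical helper: evaluates a + (b + c), the regrouping that
 * -ffast-math is allowed to apply to the expression above. */
double sum_right(double a, double b, double c)
{
    volatile double t = b + c;
    return a + t;
}

/* Prints both groupings for one input where they disagree:
 * 1.0 is too small to survive being added to -1e20. */
void show_reassociation(void)
{
    double a = 1e20, b = -1e20, c = 1.0;
    printf("(a + b) + c = %g\n", sum_left(a, b, c));   /* 1 */
    printf("a + (b + c) = %g\n", sum_right(a, b, c));  /* 0 */
}
```

Calling show_reassociation() from a main() prints 1 for the first grouping and 0 for the second. Whether a regrouping like this matters depends entirely on the program, which is why the speed-up seen with -Ofast needs to be weighed against validating the output.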

The SciMark2 micro-benchmarks nicely show the progression of compiler optimization levels and their impact on performance, but in this case -Ofast was slower than -O3. Link-time optimizations don't pay off here since SciMark2 is a single source file anyhow.

If you give John The Ripper any level of optimizations, it's happy enough.

That's a similar story with x264, whose performance-sensitive paths are hand-tuned assembly.

