We were benchmarking our networks on Intel vs AMD processors using out-of-the-box official build.
And Intel mostly is better (with the same number of threads and roughly the same core speed and lack of overclocking). I was wondering why this is, and then I found this thread.
To be honest I have little motivation to invest time in redoing our environment builds from scratch with OpenBLAS + CUDA (and most likely it is not worth the time since in production most likely there will be Intel CPUs).
But I wonder, does anyone in the community have dockerized dev environment builds based around CUDA + OpenBLAS? Because looks like out of the box PyTorch ships with MKL by Intel.