MLPerf Training v2.1 Results
New NVIDIA press releases and MLPerf benchmarks seem more and more hypocritical each time.
Notable:
- No consumer GPUs at all
- Highly sublinear scaling for many tasks (4-8 GPUs seem most efficient)
- No proper apples-to-apples H100 test
- AMD EPYC vendor lock-in is in place
- "Open" tests are barely there
- As predicted, this is a playground for hardware vendors
- On RNN-T and ImageNet the H100 is 2x faster; on BERT, 3-4x
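
To make the sublinear scaling point concrete, here is a rough sketch of how scaling efficiency can be estimated from time-to-train numbers. The function name `scaling_efficiency` and all timings are made up for illustration and are not taken from the MLPerf results; plug in real figures from the results table to check a specific task.

```python
def scaling_efficiency(base_gpus, base_minutes, gpus, minutes):
    """Fraction of ideal linear speedup achieved when going from base_gpus to gpus."""
    speedup = base_minutes / minutes   # measured speedup over the baseline run
    ideal = gpus / base_gpus           # speedup under perfect linear scaling
    return speedup / ideal


# Hypothetical time-to-train figures in minutes (NOT real MLPerf data).
hypothetical_runs = {8: 30.0, 64: 5.0, 512: 1.25}

base_gpus = min(hypothetical_runs)
for gpus, minutes in sorted(hypothetical_runs.items()):
    eff = scaling_efficiency(base_gpus, hypothetical_runs[base_gpus], gpus, minutes)
    print(f"{gpus:4d} GPUs: {minutes:6.2f} min, efficiency {eff:.2f}")
```

An efficiency of 1.0 means perfect linear scaling; anything well below that means each extra GPU buys less and less training speed, which is why small configurations tend to look best per GPU.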