My First Test With Torch 2.0
So I tried running a small model and a large model.
First of all, it failed on an STFT module when the model was a plain nn.Module. No problem, let's try a JIT model instead, though that case is not described in the tutorial.
On an AMD CPU it works, but there is absolutely no difference in speed. And the first run does NOT take ages. Then I remembered that AMD was not listed as an optimized backend. Looks like it just does not do anything there.
Then I tried an Ampere GPU: still no difference and no "very long" first run. I also tried dynamic shapes with real random inputs and got some cryptic errors.
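For anyone who wants to repeat the "is the first run slow, and is anything faster afterwards" check, here is a minimal sketch of the kind of timing harness I mean. It is not my exact script: `time_calls`, `toy_model` and `make_input` are made-up names for illustration, and the toy workload is plain Python so the snippet runs without PyTorch installed.

```python
import statistics
import time


def time_calls(fn, make_input, n_runs=10):
    """Time n_runs calls of fn; return (first_call_s, median_of_later_calls_s)."""
    if n_runs < 2:
        raise ValueError("need at least 2 runs to separate first call from the rest")
    timings = []
    for _ in range(n_runs):
        x = make_input()  # build the input outside the timed region
        start = time.perf_counter()
        fn(x)
        # On a GPU the call returns before the kernels finish, so one would
        # need torch.cuda.synchronize() here before reading the clock.
        timings.append(time.perf_counter() - start)
    return timings[0], statistics.median(timings[1:])


# Hypothetical stand-in workload (assumption: replace with a real model call).
def toy_model(x):
    return sum(v * v for v in x)


first, steady = time_calls(toy_model, lambda: list(range(10_000)))
print(f"first call: {first:.6f}s, steady state: {steady:.6f}s")
```

With a real model the idea would be to call this twice, once with the original model and once with the result of `torch.compile(model)`, and compare both numbers. If compilation is actually happening, the first call of the compiled version should be noticeably slower than its later calls; if both numbers match the uncompiled model, the compile step is probably falling through without doing anything.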
No conclusions yet. Need more testing on other models / hardware.