Article:https://www.microsoft.com/en-us/research/blog/zero-deepspeed-new-system-optimizations-enable-training-models-with-over-100-billion-parameters/Github:https://github.com/microsoft/DeepSpeed