Frameworks
??? code "DeepSpeed ZeRO++: a framework for accelerating model pre-training, fine-tuning, and RLHF updating."
    ZeRO++ speeds up training by minimizing communication overhead. Likely an essential concept to be very familiar with.
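One way ZeRO++ cuts communication is by quantizing weights before they are exchanged between devices. The sketch below is only a rough illustration of that idea in JAX, not DeepSpeed's implementation: the helper names `quantize_blockwise` and `dequantize_blockwise`, the block size of 64, and the tensor shape are all assumptions made for the example. It compares the bytes that would be sent for an fp16 tensor against an int8 tensor plus one scale per block.

```python
import jax
import jax.numpy as jnp


def quantize_blockwise(x, block_size=64):
    """Symmetric int8 quantization with one float32 scale per block (hypothetical helper)."""
    blocks = x.reshape(-1, block_size)
    # The per-block scale maps the largest magnitude in the block to 127.
    scale = jnp.max(jnp.abs(blocks), axis=1, keepdims=True) / 127.0
    # Guard against all-zero blocks to avoid dividing by zero.
    scale = jnp.where(scale == 0, 1.0, scale)
    q = jnp.clip(jnp.round(blocks / scale), -127, 127).astype(jnp.int8)
    return q, scale


def dequantize_blockwise(q, scale, shape):
    """Recover an approximate float32 tensor from int8 values and per-block scales."""
    return (q.astype(jnp.float32) * scale).reshape(shape)


key = jax.random.PRNGKey(0)
weights = jax.random.normal(key, (1024, 256), dtype=jnp.float32)

# Payload if the weights were communicated in fp16.
fp16_weights = weights.astype(jnp.float16)
bytes_fp16 = fp16_weights.size * fp16_weights.dtype.itemsize

# Payload if the weights were communicated as int8 values plus per-block scales.
q, scale = quantize_blockwise(weights)
bytes_int8 = q.size * q.dtype.itemsize + scale.size * scale.dtype.itemsize

# Reconstruction error introduced by the quantization round trip.
restored = dequantize_blockwise(q, scale, weights.shape)
max_err = float(jnp.max(jnp.abs(restored - weights)))

print(f"fp16 payload: {bytes_fp16} bytes")
print(f"int8 payload: {bytes_int8} bytes")
print(f"max abs error: {max_err:.4f}")
```

The trade-off is a smaller payload per exchange in return for a small, bounded rounding error per block.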
Levanter (not just LLMs): a codebase for training foundation models (FMs) with JAX.
Release: uses Haliax to refer to tensor axes by name (for example Batch, Feature) instead of by index. Supports full sharding and is distributable/parallelizable.
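To make the named-axis idea concrete, here is a minimal sketch in plain JAX. The `NamedTensor` class is a hypothetical stand-in written for this example and is not Haliax's API. It only shows why reducing over "Feature" by name is more robust than reducing over a positional index.

```python
from dataclasses import dataclass

import jax.numpy as jnp


@dataclass(frozen=True)
class NamedTensor:
    """Hypothetical minimal named-axis wrapper (illustration only, not Haliax)."""
    array: jnp.ndarray
    axes: tuple  # one axis name per array dimension

    def sum(self, axis_name):
        # Look up the positional index from the name, then reduce over it.
        idx = self.axes.index(axis_name)
        remaining = tuple(a for a in self.axes if a != axis_name)
        return NamedTensor(jnp.sum(self.array, axis=idx), remaining)


# A batch of 8 examples with 16 features each.
x = NamedTensor(jnp.ones((8, 16)), ("Batch", "Feature"))
per_example = x.sum("Feature")

# The same data stored transposed: the call site does not change,
# whereas a positional axis=1 would now reduce over the wrong dimension.
xt = NamedTensor(x.array.T, ("Feature", "Batch"))
per_example_t = xt.sum("Feature")

print(per_example.axes, per_example.array.shape)
```

Because the reduction is expressed by name, the result is the same regardless of how the underlying array happens to be laid out.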