09.18
Instructor: Yaodong Yang
Topics Covered
- Transformer Architecture
- Attention Mechanisms
- Positional Encoding
- Cross-entropy Loss
- Training and Inference in Transformers
- GPT-2 and LLaMA Architectures
- Mixture-of-Experts Architecture