Large Language Models and Alignment

Introduction

Aligning large language models (LLMs) means ensuring that these models behave in accordance with human intentions and values. This process draws on techniques such as reinforcement learning (RL), supervised fine-tuning, in-context learning, and socio-technical alignment. The course covers topics ranging from the foundational theory of LLMs to the practical application of alignment methods.

Course Structure: follows the mainstream LLM development pathway, emphasizing pre-training, supervised fine-tuning, and reinforcement learning from human feedback (RLHF). It systematically examines the key algorithms behind LLMs, including widely used methods such as DPO (previewed briefly after this overview);

Hardware: covers NVIDIA hardware architecture and the programming tools used in modern AI systems, focusing on how these technologies accelerate AI computation, optimize performance, and enable efficient neural network training;

Safety and Value Alignment: explains the importance of safety and value alignment in LLMs and covers advanced topics such as model evaluation and governance, supporting the practical deployment of LLMs.
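
For readers unfamiliar with DPO, one standard way its objective is written is shown below, purely as a preview; the notation here is the commonly used one and is not taken from the course materials:

\[
\mathcal{L}_{\mathrm{DPO}}(\pi_\theta;\, \pi_{\mathrm{ref}})
  = -\,\mathbb{E}_{(x,\, y_w,\, y_l) \sim \mathcal{D}}
    \left[ \log \sigma\!\left(
      \beta \log \frac{\pi_\theta(y_w \mid x)}{\pi_{\mathrm{ref}}(y_w \mid x)}
      - \beta \log \frac{\pi_\theta(y_l \mid x)}{\pi_{\mathrm{ref}}(y_l \mid x)}
    \right) \right]
\]

where \(\pi_\theta\) is the policy being trained, \(\pi_{\mathrm{ref}}\) is a frozen reference model, \((x, y_w, y_l)\) is a prompt paired with a preferred and a dispreferred response, \(\sigma\) is the logistic function, and \(\beta\) is a temperature hyperparameter.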

Location and Time

  • Location: 理教407
  • Time: Tuesday 15:10-18:00, Weeks 1-16

Schedule and Plan

  • 02/18: Introduction to LLMs
  • 02/25: Basic Architecture of LLMs
  • 03/04: Pre-training of LLMs
  • 03/11: Inference and Chain of Thought in LLMs
  • 03/18: Inference and Fine-tuning of LLMs
  • 03/25: Efficient Fine-tuning Methods for LLMs
  • 04/01: Essentials of Reinforcement Learning
  • 04/08: Policy Optimization Methods
  • 04/15: RLHF Alignment Methods
  • 04/22: Direct Alignment Methods
  • 04/29: Embodied Multimodal Model Alignment
  • 05/13: Nvidia Modern AI Training Architecture
  • 05/20: GPU/CUDA Programming Practice
  • 05/27: DPU Programming Practice (I)
  • 06/03: DPU Programming Practice (II)


Contact Info