| 1 | Introduction to Reinforcement Learning | Agent-environment interaction, MDP basics | Concept 01: Introduction to Reinforcement Learning | Activity 01: Introduction to Reinforcement Learning | - | ET-1 | Submit Activity |
| 2 | Q-Learning and Value Functions | Tabular Q-Learning, epsilon-greedy exploration | Concept 02: Q-Learning and Value Functions | Activity 02: Q-Learning and Value Functions | - | ET-2 | Submit Activity |
| 3 | Deep Q-Networks (DQN) | Function approximation, experience replay | Concept 03: Deep Q-Networks (DQN) | Activity 03: Deep Q-Networks (DQN) | - | ET-3 | Submit Activity |
| 4 | Project 1: DQN Game Master | Train a DQN agent to master an Atari game | - | - | Project 1: DQN Game Master | - | Submit Project |
| 5 | Policy Gradient Methods | Policy-based RL, REINFORCE algorithm | Concept 04: Policy Gradient Methods | Activity 04: Policy Gradient Methods | - | ET-4 | Submit Activity |
| 6 | Actor-Critic Methods | Actor-critic architecture, advantage estimation | Concept 05: Actor-Critic Methods | Activity 05: Actor-Critic Methods | - | ET-5 | Submit Activity |
| 7 | Proximal Policy Optimization (PPO) | Trust region methods, PPO clipping | Concept 06: Proximal Policy Optimization (PPO) | Activity 06: Proximal Policy Optimization (PPO) | - | ET-6 | Submit Activity |
| 8 | Project 2: Autonomous Robot Navigation | PPO agent for continuous control | - | - | Project 2: Autonomous Robot Navigation | - | Submit Project |
| 9 | Multi-Armed Bandits and Exploration | Bandit problems, UCB, contextual bandits | Concept 07: Multi-Armed Bandits and Exploration | Activity 07: Multi-Armed Bandits and Exploration | - | ET-7 | Submit Activity |
| 10 | RL in Practice - Debugging and Deployment | Debug RL failures, reward shaping | Concept 08: RL in Practice - Debugging and Deployment | Activity 08: RL in Practice - Debugging and Deployment | - | ET-8 | Submit Activity |
| 11 | Introduction to Generative Models | Generative vs discriminative, latent spaces | Concept 09: Introduction to Generative Models | Activity 09: Introduction to Generative Models | - | ET-9 | Submit Activity |
| 12 | Variational Autoencoders (VAEs) | Encoder-decoder, reparameterization trick | Concept 10: Variational Autoencoders (VAEs) | Activity 10: Variational Autoencoders (VAEs) | - | ET-10 | Submit Activity |
| 13 | Projects 3-4: Generative Models Workshop | GAN art generation and VAE latent space exploration | - | - | Project 3: GAN Art Studio + Project 4: Latent Space Explorer | - | Submit Project |
| 14 | Generative Adversarial Networks (GANs) | Adversarial training, minimax objective | Concept 11: Generative Adversarial Networks (GANs) | Activity 11: Generative Adversarial Networks (GANs) | - | ET-11 | Submit Activity |
| 15 | Advanced GAN Architectures | StyleGAN features, conditional GANs, WGAN-GP | Concept 12: Advanced GAN Architectures | Activity 12: Advanced GAN Architectures | - | ET-12 | Submit Activity |
| 16 | Diffusion Models | Denoising diffusion, U-Net architecture | Concept 13: Diffusion Models | Activity 13: Diffusion Models | - | ET-13 | Submit Activity |
| 17 | Transformer Architectures for Generation | Self-attention, autoregressive generation | Concept 14: Transformer Architectures for Generation | Activity 14: Transformer Architectures for Generation | - | ET-14 | Submit Activity |
| 18 | Large Language Models (LLMs) Fundamentals | LLM architecture, prompt engineering | Concept 15: Large Language Models (LLMs) Fundamentals | Activity 15: Large Language Models (LLMs) Fundamentals | - | ET-15 | Submit Activity |
| 19 | Reinforcement Learning from Human Feedback (RLHF) | 3-stage RLHF pipeline, reward model training | Concept 16: Reinforcement Learning from Human Feedback (RLHF) | Activity 16: Reinforcement Learning from Human Feedback (RLHF) | - | ET-16 | Submit Activity |
| 20 | Project 5: Text Generation with RLHF | Align LLM with human preferences | - | - | Project 5: Text Generation with RLHF | - | Submit Project |
| 21 | Multi-Modal AI - Vision and Language | Cross-modal learning, CLIP, text-to-image | Concept 17: Multi-Modal AI - Vision and Language | Activity 17: Multi-Modal AI - Vision and Language | - | ET-17 | Submit Activity |
| 22 | Project 6: Multi-Modal Content Generator | Text-to-image and image-to-text pipelines | - | - | Project 6: Multi-Modal Content Generator | - | Submit Project |
| 23 | The Future of AI - Integration and Ethics | RL + GenAI integration, AI safety | Concept 18: The Future of AI - Integration and Ethics | Activity 18: The Future of AI - Integration and Ethics | - | ET-18 | Submit Activity |
| 24 | Project 7: Capstone - AI Agent Ecosystem | Integrated AI system (student-designed) | - | - | Project 7: Capstone - AI Agent Ecosystem | - | Submit Project |