Practice and reinforce the concepts from Lesson 11
In this activity, you'll build a complete GAN from scratch to generate realistic images. You'll implement the generator, discriminator, adversarial training loop, and stabilization techniques. Then you'll train on CIFAR-10 and explore the unique challenges of GAN training (mode collapse, convergence).
By completing this activity, you will:
Download the activity template from the Templates folder:
AI25-Template-activity-11-generative-adversarial-networks.zip
Templates/AI25-Template-activity-11-generative-adversarial-networks.zip
Upload activity-11-generative-adversarial-networks.ipynb to Google Colab
Execute the first few cells to:
TODO 1: Implement DCGAN generator with transposed convolutions
class Generator(nn.Module):
    def __init__(self, latent_dim=100, ngf=64):
        """
        DCGAN Generator
        Args:
            latent_dim: Size of latent vector z
            ngf: Number of generator filters (base)
        """
        super().__init__()
        # TODO 1: Implement generator architecture
        # Input: (batch, latent_dim, 1, 1)
        # Output: (batch, 3, 32, 32) - RGB image
        #
        # Architecture (DCGAN paper):
        #   ConvTranspose2d: latent_dim → ngf*4 (4×4)
        #   BatchNorm + ReLU
        #   ConvTranspose2d: ngf*4 → ngf*2 (8×8)
        #   BatchNorm + ReLU
        #   ConvTranspose2d: ngf*2 → ngf (16×16)
        #   BatchNorm + ReLU
        #   ConvTranspose2d: ngf → 3 (32×32)
        #   Tanh (output in [-1, 1])
        # Your code here
        pass

    def forward(self, z):
        # TODO 1: Implement forward pass
        # z: (batch, latent_dim, 1, 1)
        # Returns: Generated image (batch, 3, 32, 32)
        # Your code here
        pass
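The spatial sizes listed in the architecture comment (4×4, 8×8, 16×16, 32×32) follow from the output-size formula for `nn.ConvTranspose2d`. The short check below is a sketch of that arithmetic only. The kernel size, stride, and padding values are assumptions (the usual DCGAN choices), since the template lists only channel counts and target sizes, and the helper name `conv_transpose2d_out` is made up for this illustration.

```python
def conv_transpose2d_out(size, kernel_size, stride, padding,
                         output_padding=0, dilation=1):
    """Output height/width of nn.ConvTranspose2d for a square input."""
    return ((size - 1) * stride - 2 * padding
            + dilation * (kernel_size - 1) + output_padding + 1)


# Assumed (kernel_size, stride, padding) for the four generator layers.
layers = [(4, 1, 0), (4, 2, 1), (4, 2, 1), (4, 2, 1)]

size = 1  # the latent vector enters as a 1×1 "image"
sizes = []
for kernel_size, stride, padding in layers:
    size = conv_transpose2d_out(size, kernel_size, stride, padding)
    sizes.append(size)

print(sizes)  # [4, 8, 16, 32]
```

With these settings, the first layer expands 1×1 to 4×4 and each later layer doubles the resolution.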
TODO 2: Implement DCGAN discriminator with strided convolutions
class Discriminator(nn.Module):
    def __init__(self, ndf=64):
        """
        DCGAN Discriminator
        Args:
            ndf: Number of discriminator filters (base)
        """
        super().__init__()
        # TODO 2: Implement discriminator architecture
        # Input: (batch, 3, 32, 32) - RGB image
        # Output: (batch, 1) - Real/Fake score
        #
        # Architecture (DCGAN paper):
        #   Conv2d: 3 → ndf (16×16) [NO BatchNorm on first layer]
        #   LeakyReLU(0.2)
        #   Conv2d: ndf → ndf*2 (8×8)
        #   BatchNorm + LeakyReLU(0.2)
        #   Conv2d: ndf*2 → ndf*4 (4×4)
        #   BatchNorm + LeakyReLU(0.2)
        #   Conv2d: ndf*4 → 1 (1×1)
        #   Sigmoid (probability real)
        # Your code here
        pass

    def forward(self, x):
        # TODO 2: Implement forward pass
        # x: (batch, 3, 32, 32)
        # Returns: Real/fake score (batch, 1)
        # Your code here
        pass
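As a reference point for the layer list in TODO 2, here is one possible layout, not the notebook's official solution. The class name `SketchDiscriminator` is hypothetical, and the kernel size 4, stride 2, padding 1 settings are assumptions chosen so that each strided convolution halves the resolution as the comments require.

```python
import torch
import torch.nn as nn


class SketchDiscriminator(nn.Module):
    """Illustrative DCGAN-style discriminator for 32×32 RGB images."""

    def __init__(self, ndf=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, ndf, 4, 2, 1, bias=False),            # 32 -> 16, no BatchNorm
            nn.LeakyReLU(0.2, inplace=True),
            nn.Conv2d(ndf, ndf * 2, 4, 2, 1, bias=False),      # 16 -> 8
            nn.BatchNorm2d(ndf * 2),
            nn.LeakyReLU(0.2, inplace=True),
            nn.Conv2d(ndf * 2, ndf * 4, 4, 2, 1, bias=False),  # 8 -> 4
            nn.BatchNorm2d(ndf * 4),
            nn.LeakyReLU(0.2, inplace=True),
            nn.Conv2d(ndf * 4, 1, 4, 1, 0, bias=False),        # 4 -> 1
            nn.Sigmoid(),                                      # probability "real"
        )

    def forward(self, x):
        # (batch, 1, 1, 1) -> (batch, 1)
        return self.net(x).view(-1, 1)


# Quick shape check on random input (small ndf keeps it fast).
scores = SketchDiscriminator(ndf=8)(torch.randn(5, 3, 32, 32))
print(scores.shape)
```

Running a random batch through the network like this is a cheap way to confirm the output shape before starting a training run.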
TODO 3: Implement minimax loss for generator and discriminator
def discriminator_loss(d_real_output, d_fake_output):
    """
    Discriminator loss: maximize log(D(x)) + log(1 - D(G(z)))
    (implemented as minimizing the matching BCE terms)
    Args:
        d_real_output: Discriminator scores on real images
        d_fake_output: Discriminator scores on fake images
    Returns:
        d_loss: Discriminator loss
    """
    # TODO 3a: Implement discriminator loss
    # Real loss: BCE(d_real_output, ones) - want D(real) = 1
    # Fake loss: BCE(d_fake_output, zeros) - want D(fake) = 0
    # Total: real_loss + fake_loss
    # Your code here
    pass

def generator_loss(d_fake_output):
    """
    Generator loss
    Original minimax form: minimize log(1 - D(G(z)))
    Non-saturating version (used here): minimize -log(D(G(z))),
    i.e. maximize log(D(G(z)))
    Both push D(G(z)) toward 1, but they are different objectives: the
    non-saturating form gives stronger gradients when D(G(z)) is near 0.
    Args:
        d_fake_output: Discriminator scores on fake images
    Returns:
        g_loss: Generator loss
    """
    # TODO 3b: Implement generator loss
    # Non-saturating GAN: BCE(d_fake_output, ones) - want D(G(z)) = 1
    # Your code here
    pass
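To see why the non-saturating form is preferred, it helps to compare the gradient each generator objective sends back through the discriminator's sigmoid. Writing D(G(z)) = sigmoid(a) for a logit a, the derivative of log(1 - sigmoid(a)) is -sigmoid(a), while the derivative of -log(sigmoid(a)) is -(1 - sigmoid(a)). The sketch below just evaluates those two expressions; the function names are made up for this illustration.

```python
import math


def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))


def saturating_grad(a):
    """d/da of log(1 - sigmoid(a)), the original minimax generator loss."""
    return -sigmoid(a)


def non_saturating_grad(a):
    """d/da of -log(sigmoid(a)), the non-saturating generator loss."""
    return -(1.0 - sigmoid(a))


for d in (0.01, 0.5, 0.99):
    a = math.log(d / (1.0 - d))  # logit that produces D(G(z)) = d
    print(f"D(G(z))={d:.2f}  saturating={saturating_grad(a):+.2f}  "
          f"non-saturating={non_saturating_grad(a):+.2f}")
```

Early in training the discriminator rejects fakes confidently, so D(G(z)) is close to 0. There the saturating gradient is close to 0 as well, while the non-saturating gradient has magnitude close to 1.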
TODO 4: Implement adversarial training loop
def train_gan(generator, discriminator, dataloader, num_epochs=25):
    """
    Train GAN with adversarial training
    Training procedure:
    1. Update Discriminator:
        - Compute D loss on real images
        - Compute D loss on fake images (from G)
        - Backprop and update D
    2. Update Generator:
        - Generate fake images
        - Compute G loss (fool D)
        - Backprop and update G
    """
    # Optimizers
    g_optimizer = torch.optim.Adam(generator.parameters(), lr=0.0002, betas=(0.5, 0.999))
    d_optimizer = torch.optim.Adam(discriminator.parameters(), lr=0.0002, betas=(0.5, 0.999))
    for epoch in range(num_epochs):
        for i, (real_images, _) in enumerate(dataloader):
            batch_size = real_images.size(0)
            real_images = real_images.to(device)
            # TODO 4a: Update Discriminator
            # Step 1: Discriminator loss on real images
            #   - Forward pass: d_real = discriminator(real_images)
            #   - Real loss
            # Step 2: Discriminator loss on fake images
            #   - Generate fakes: z = randn, fake_images = generator(z)
            #   - Forward pass: d_fake = discriminator(fake_images.detach())
            #   - Fake loss
            # Step 3: Total D loss and update
            #   - d_loss = real_loss + fake_loss
            #   - Backprop and step d_optimizer
            # Your code here for discriminator update
            # TODO 4b: Update Generator
            # Step 1: Generate fake images
            #   - z = randn, fake_images = generator(z)
            # Step 2: Generator loss
            #   - Forward pass: d_fake = discriminator(fake_images)
            #   - g_loss = generator_loss(d_fake)
            # Step 3: Update
            #   - Backprop and step g_optimizer
            # Your code here for generator update
            # Log progress
            if i % 100 == 0:
                print(f"[{epoch}/{num_epochs}][{i}/{len(dataloader)}] "
                      f"D_loss: {d_loss.item():.4f} G_loss: {g_loss.item():.4f}")
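The alternating update described in TODO 4 can be sketched on a toy problem. This is a minimal illustration, not the notebook's solution: the `train_step` helper, the tiny MLP models, and the 2-D "real" data are all assumptions made so the example runs in a fraction of a second. It also reuses the same fake batch for both updates, which is a common shortcut; the template instead draws fresh noise for the generator step.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


def train_step(generator, discriminator, g_optimizer, d_optimizer, real, latent_dim):
    """One discriminator update followed by one generator update."""
    batch_size = real.size(0)
    ones = torch.ones(batch_size, 1, device=real.device)
    zeros = torch.zeros(batch_size, 1, device=real.device)

    # Discriminator update: real -> 1, fake -> 0.
    d_optimizer.zero_grad()
    z = torch.randn(batch_size, latent_dim, device=real.device)
    fake = generator(z)
    d_loss = (F.binary_cross_entropy(discriminator(real), ones)
              + F.binary_cross_entropy(discriminator(fake.detach()), zeros))
    d_loss.backward()  # detach() keeps these gradients out of the generator
    d_optimizer.step()

    # Generator update (non-saturating loss): make D label fakes as real.
    g_optimizer.zero_grad()
    g_loss = F.binary_cross_entropy(discriminator(fake), ones)
    g_loss.backward()
    g_optimizer.step()
    return d_loss.item(), g_loss.item()


torch.manual_seed(0)
latent_dim = 4
toy_g = nn.Sequential(nn.Linear(latent_dim, 16), nn.ReLU(), nn.Linear(16, 2))
toy_d = nn.Sequential(nn.Linear(2, 16), nn.LeakyReLU(0.2), nn.Linear(16, 1), nn.Sigmoid())
g_opt = torch.optim.Adam(toy_g.parameters(), lr=0.0002, betas=(0.5, 0.999))
d_opt = torch.optim.Adam(toy_d.parameters(), lr=0.0002, betas=(0.5, 0.999))

real_batch = torch.randn(32, 2) + 3.0  # stand-in for a batch of real images
d_loss_value, g_loss_value = train_step(toy_g, toy_d, g_opt, d_opt, real_batch, latent_dim)
print(f"D_loss: {d_loss_value:.4f} G_loss: {g_loss_value:.4f}")
```

For the DCGAN models in this activity, the noise would have shape `(batch_size, latent_dim, 1, 1)` instead of `(batch_size, latent_dim)`.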
TODO 5: Implement stabilization techniques
def train_gan_stable(generator, discriminator, dataloader, num_epochs=25):
    """
    Train GAN with stabilization techniques
    """
    # TODO 5: Add stabilization techniques
    # 1. Label smoothing: real labels = 0.9 instead of 1.0
    # 2. Noisy labels: flip labels occasionally (5%)
    # 3. Gradient penalty: Add to discriminator loss
    # 4. Feature matching: Match discriminator features (optional)
    # Example: Label smoothing
    real_label = 0.9
    fake_label = 0.0
    for epoch in range(num_epochs):
        for i, (real_images, _) in enumerate(dataloader):
            # TODO 5: Implement stabilized training
            # Same structure as TODO 4, but with:
            #   - Smoothed labels
            #   - Optional label flipping
            #   - Gradient clipping
            # Your code here
            pass
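Label smoothing and occasional label flipping both come down to how the discriminator's BCE targets are built. The helper below is one way to do it, under assumptions: the name `make_labels` and its parameters are hypothetical, and flipping is applied per sample with probability `flip_prob`.

```python
import torch


def make_labels(batch_size, is_real, smooth=0.9, flip_prob=0.05, rng=None):
    """Discriminator targets with one-sided smoothing and random flips.

    Real targets are `smooth` (0.9) instead of 1.0; fake targets stay at 0.0.
    Each target is swapped to the opposite class with probability `flip_prob`.
    """
    real_value, fake_value = smooth, 0.0
    target = torch.full((batch_size, 1), real_value if is_real else fake_value)
    flipped = torch.full((batch_size, 1), fake_value if is_real else real_value)
    flip_mask = torch.rand(batch_size, 1, generator=rng) < flip_prob
    return torch.where(flip_mask, flipped, target)


real_targets = make_labels(8, is_real=True)
fake_targets = make_labels(8, is_real=False)
print(real_targets.squeeze(1))
print(fake_targets.squeeze(1))
```

These tensors would replace the all-ones and all-zeros targets in the discriminator's BCE loss. The generator's loss normally keeps the unsmoothed target of 1.0.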
Pre-built utilities to detect and visualize mode collapse:
Features:
<50% classes represented
TODO 6: Implement conditional GAN for class-specific generation
class ConditionalGenerator(nn.Module):
    def __init__(self, num_classes=10, latent_dim=100, ngf=64):
        super().__init__()
        # TODO 6a: Modify generator to accept class labels
        # Approach 1: Concatenate z and one-hot label
        # Approach 2: Conditional batch normalization
        # Your code here
        pass

    def forward(self, z, labels):
        # TODO 6a: Forward pass with conditioning
        # z: (batch, latent_dim)
        # labels: (batch,) - class indices
        # Your code here
        pass

class ConditionalDiscriminator(nn.Module):
    def __init__(self, num_classes=10, ndf=64):
        super().__init__()
        # TODO 6b: Modify discriminator to accept class labels
        # Approach: Concatenate image and class embedding
        # Your code here
        pass

    def forward(self, x, labels):
        # TODO 6b: Forward pass with conditioning
        # Your code here
        pass
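The concatenation approach from TODO 6 mostly involves reshaping tensors. The sketch below shows only that conditioning step, not a full conditional GAN. The helper names `condition_latent` and `condition_image` are hypothetical, and using one-hot label maps for the discriminator is an assumption; a learned `nn.Embedding` reshaped to the image size is a common alternative.

```python
import torch
import torch.nn.functional as F


def condition_latent(z, labels, num_classes=10):
    """Concatenate z (batch, latent_dim) with one-hot labels for the generator."""
    one_hot = F.one_hot(labels, num_classes=num_classes).float()
    zc = torch.cat([z, one_hot], dim=1)   # (batch, latent_dim + num_classes)
    return zc.view(zc.size(0), -1, 1, 1)  # ready for ConvTranspose2d


def condition_image(x, labels, num_classes=10):
    """Append one constant channel per class to the image for the discriminator."""
    one_hot = F.one_hot(labels, num_classes=num_classes).float()
    maps = one_hot[:, :, None, None].expand(-1, -1, x.size(2), x.size(3))
    return torch.cat([x, maps], dim=1)    # (batch, 3 + num_classes, H, W)


labels = torch.tensor([0, 3, 5, 9])
g_input = condition_latent(torch.randn(4, 100), labels)
d_input = condition_image(torch.randn(4, 3, 32, 32), labels)
print(g_input.shape, d_input.shape)
```

With this scheme, the generator's first layer takes `latent_dim + num_classes` input channels and the discriminator's first layer takes `3 + num_classes`.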
Training Progress (CIFAR-10, 25 epochs):
[0/25][0/782] D_loss: 1.42 G_loss: 0.89
[5/25][0/782] D_loss: 0.68 G_loss: 1.35
[10/25][0/782] D_loss: 0.52 G_loss: 1.78
[20/25][0/782] D_loss: 0.47 G_loss: 1.92
✓ Discriminator loss settles around 0.5 (D stays ahead of G without overpowering it)
✓ Generator loss increases (harder to fool D)
✓ Training stable (no divergence)
Generated Samples (Epoch 25):
✓ Recognizable objects (cars, planes, animals)
✓ Realistic colors and textures
✓ Sharper than VAE outputs
✓ Some artifacts but generally good quality
With Stabilization Techniques:
✓ Smoother loss curves
✓ Fewer training failures
✓ Better sample diversity
✓ Faster convergence
Comparison:
Standard GAN: 30% training runs diverge
Stabilized GAN: 5% training runs diverge
✓ 6× fewer diverging runs
Healthy Training:
Generated 1000 samples
Class distribution:
- Airplane: 105 (10.5%)
- Automobile: 98 (9.8%)
- Bird: 102 (10.2%)
- ...
- Truck: 95 (9.5%)
✓ All 10 classes represented
✓ Balanced distribution
✓ No mode collapse detected
Mode Collapse Example:
Generated 1000 samples
Class distribution:
- Airplane: 0 (0%)
- Automobile: 823 (82.3%) ← Mode collapse!
- Bird: 0 (0%)
- ...
✗ Only 2/10 classes generated
✗ Mode collapse detected!
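The class-distribution reports above can be approximated with a simple count over predicted classes. This sketch is not the notebook's pre-built utility. It assumes generated samples have already been labeled by some CIFAR-10 classifier, and the function names, the 1% minimum share, and the sample lists are made up for illustration. The 50% coverage threshold mirrors the "<50% classes represented" criterion.

```python
from collections import Counter


def class_coverage(predicted_classes, num_classes=10, min_share=0.01):
    """Return per-class counts and the classes holding at least `min_share` of samples."""
    counts = Counter(predicted_classes)
    total = len(predicted_classes)
    represented = [c for c in range(num_classes) if counts[c] / total >= min_share]
    return counts, represented


def is_mode_collapsed(predicted_classes, num_classes=10, coverage_threshold=0.5):
    """Flag collapse when fewer than `coverage_threshold` of the classes are represented."""
    _, represented = class_coverage(predicted_classes, num_classes)
    return len(represented) / num_classes < coverage_threshold


# Synthetic predictions, for illustration only.
healthy = [i % 10 for i in range(1000)]  # 100 samples per class
collapsed = [1] * 823 + [9] * 177        # almost everything lands in one class

print(is_mode_collapsed(healthy))    # False
print(is_mode_collapsed(collapsed))  # True
```

Because the check depends on a classifier's predictions, it is only as reliable as that classifier is on generated images.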
Class-Specific Generation:
Generate 10 "airplane" images:
✓ All 10 are recognizable airplanes
✓ Different poses/colors (diversity)
Generate 10 "dog" images:
✓ All 10 are recognizable dogs
✓ Different breeds/positions
✓ Conditional generation works!
Your implementation is complete when:
Common Failures:
1. Generator Collapse (outputs noise):
2. Mode Collapse:
3. Training Instability:
4. Vanishing Gradients:
DO:
DON'T:
| Parameter | Recommended | Effect |
|---|---|---|
| Latent dim | 100 | Standard for DCGAN |
| Learning rate | 0.0002 | Both G and D |
| Beta1 (Adam) | 0.5 | Lower than default (0.9) |
| Batch size | 128 | Larger = more stable |
| Epochs | 25-50 | More = better quality |
Implement progressive training (start 4x4, grow to 32x32):
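# Start with low resolution
# (g_*, d_*, the grow_* helpers, and dataloader_4x4 / dataloader_8x8 are
# placeholders; each loader must yield images at the current resolution)
train_gan(g_4x4, d_4x4, dataloader_4x4, num_epochs=10)
# Upsample and add layers
g_8x8 = grow_generator(g_4x4)
d_8x8 = grow_discriminator(d_4x4)
train_gan(g_8x8, d_8x8, dataloader_8x8, num_epochs=10)
# Continue to 32×32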
Benefit: More stable training, higher quality
Replace BCE loss with Wasserstein distance:
def wgan_d_loss(d_real, d_fake):
    return -torch.mean(d_real) + torch.mean(d_fake)

def wgan_g_loss(d_fake):
    return -torch.mean(d_fake)

def gradient_penalty(discriminator, real, fake):
    """Compute gradient penalty for WGAN-GP"""
    pass
Benefit: More stable, better convergence
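The `gradient_penalty` stub above could be filled in along the following lines. This is a sketch of the standard WGAN-GP penalty, assuming image tensors of shape `(batch, C, H, W)` and a critic that returns one unbounded score per sample (no sigmoid). The linear `toy_critic` exists only to make the result easy to check by hand.

```python
import torch


def gradient_penalty(discriminator, real, fake):
    """Penalize the critic when its gradient norm on interpolated samples is not 1."""
    batch_size = real.size(0)
    # One random mixing coefficient per sample.
    eps = torch.rand(batch_size, 1, 1, 1, device=real.device)
    interpolated = (eps * real + (1 - eps) * fake.detach()).requires_grad_(True)
    scores = discriminator(interpolated)
    grads, = torch.autograd.grad(
        outputs=scores,
        inputs=interpolated,
        grad_outputs=torch.ones_like(scores),
        create_graph=True,  # so the penalty itself can be backpropagated
    )
    grad_norm = grads.reshape(batch_size, -1).norm(2, dim=1)
    return ((grad_norm - 1) ** 2).mean()


# Toy linear critic whose gradient norm is exactly 2 everywhere,
# so the penalty should come out at (2 - 1)^2 = 1.
num_elements = 3 * 4 * 4
scale = 2.0 / num_elements ** 0.5


def toy_critic(x):
    return x.flatten(1).sum(dim=1, keepdim=True) * scale


gp = gradient_penalty(toy_critic, torch.randn(6, 3, 4, 4), torch.randn(6, 3, 4, 4))
print(f"{gp.item():.4f}")
```

In training, the penalty is added to the critic loss with a weight, for example `wgan_d_loss(d_real, d_fake) + 10.0 * gradient_penalty(D, real, fake)`; the weight 10 is the value used in the WGAN-GP paper and can be tuned. Critics trained this way usually drop the final sigmoid and avoid BatchNorm.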
Add self-attention layers to G and D:
class SelfAttention(nn.Module):
    def __init__(self, in_channels):
        super().__init__()
        # Compute attention maps
        # Apply to feature maps
        pass
Benefit: Better global coherence
Compute Fréchet Inception Distance:
from scipy.linalg import sqrtm

def compute_fid(real_features, fake_features):
    """
    FID = ||mu_real - mu_fake||^2 + Tr(Sigma_real + Sigma_fake - 2*sqrt(Sigma_real * Sigma_fake))
    """
    pass
Lower FID = better quality
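The FID formula in the docstring translates fairly directly into NumPy and SciPy. The version below is a minimal sketch that assumes the feature arrays have shape `(num_samples, feature_dim)` and have already been extracted. In practice they come from an Inception network; here random vectors stand in for them.

```python
import numpy as np
from scipy.linalg import sqrtm


def compute_fid(real_features, fake_features):
    """Fréchet distance between Gaussians fitted to two feature sets."""
    mu_real = real_features.mean(axis=0)
    mu_fake = fake_features.mean(axis=0)
    sigma_real = np.cov(real_features, rowvar=False)
    sigma_fake = np.cov(fake_features, rowvar=False)

    # Matrix square root of the covariance product; small imaginary parts
    # can appear from numerical error and are discarded.
    covmean = sqrtm(sigma_real @ sigma_fake)
    if np.iscomplexobj(covmean):
        covmean = covmean.real

    diff = mu_real - mu_fake
    return float(diff @ diff + np.trace(sigma_real + sigma_fake - 2.0 * covmean))


rng = np.random.default_rng(0)
features = rng.normal(size=(500, 8))  # stand-in for Inception features

fid_same = compute_fid(features, features)
fid_shifted = compute_fid(features, features + 2.0)
print(f"identical sets: {fid_same:.4f}")
print(f"mean shifted by 2 in each of 8 dims: {fid_shifted:.4f}")
```

Shifting every feature by 2 leaves the covariances unchanged, so the score reduces to the squared mean difference, 8 × 2² = 32. That makes a convenient sanity check for an implementation.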
Completed Notebook: activity-11-generative-adversarial-networks.ipynb
Generated Samples:
Training Curves:
Comparison:
Analysis (5-7 sentences):
soumith/ganhacks
Next Activity: Activity 12 - Advanced GAN Architectures
This activity is graded on:
Passing Grade: 70% or higher
Congratulations on building your first GAN! 🎉🎨