Activity 11 of 18

Activity 11: Generative Adversarial Networks (GANs)

Practice and reinforce the concepts from Lesson 11

Overview

In this activity, you'll build a complete GAN from scratch to generate realistic images. You'll implement the generator, the discriminator, the adversarial training loop, and several stabilization techniques. Then you'll train on CIFAR-10 and explore the challenges specific to GAN training, such as mode collapse and non-convergence.

Learning Objectives

By completing this activity, you will:

Implement a DCGAN (Deep Convolutional GAN) architecture
Train the generator and discriminator in an adversarial setup
Apply training stabilization techniques (label smoothing, noisy labels)
Debug common GAN failures (mode collapse, vanishing gradients)
Generate high-quality images from random noise
Compare GAN and VAE sample quality
Implement a conditional GAN for class-specific generation

Prerequisites

Completed Concept 11: Generative Adversarial Networks (GANs)
Completed Activity 10: Variational Autoencoders (VAEs)
Strong understanding of PyTorch and convolutional neural networks

Getting Started

Step 1: Access the Template

Download the activity template from the Templates folder:

Template: AI25-Template-activity-11-generative-adversarial-networks.zip
Location: Templates/AI25-Template-activity-11-generative-adversarial-networks.zip

Step 2: Open in Google Colab

Extract the ZIP file
Upload activity-11-generative-adversarial-networks.ipynb to Google Colab
Set Runtime to GPU: Runtime -> Change runtime type -> GPU (T4 required)

Step 3: Run Initial Cells

Execute the first few cells to:

Install PyTorch, torchvision
Import libraries
Load CIFAR-10 dataset
Set up visualization and logging

What You'll Build

Part 1: Generator Architecture (YOU COMPLETE)

TODO 1: Implement DCGAN generator with transposed convolutions

python

class Generator(nn.Module):
    def __init__(self, latent_dim=100, ngf=64):
        """
        DCGAN Generator

        Args:
            latent_dim: Size of latent vector z
            ngf: Number of generator filters (base)
        """
        super().__init__()

        # TODO 1: Implement generator architecture
        # Input: (batch, latent_dim, 1, 1)
        # Output: (batch, 3, 32, 32) - RGB image
        #
        # Architecture (DCGAN paper):
        # ConvTranspose2d: latent_dim → ngf*4 (4×4)
        # BatchNorm + ReLU
        # ConvTranspose2d: ngf*4 → ngf*2 (8×8)
        # BatchNorm + ReLU
        # ConvTranspose2d: ngf*2 → ngf (16×16)
        # BatchNorm + ReLU
        # ConvTranspose2d: ngf → 3 (32×32)
        # Tanh (output in [-1, 1])

        # Your code here
        pass

    def forward(self, z):
        # TODO 1: Implement forward pass
        # z: (batch, latent_dim, 1, 1)
        # Returns: Generated image (batch, 3, 32, 32)

        # Your code here
        pass
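
The spatial sizes listed in the TODO 1 comments (1×1 → 4×4 → 8×8 → 16×16 → 32×32) follow from the transposed-convolution size formula. The sketch below checks them with plain arithmetic, before any network is built. The kernel, stride, and padding values are an assumption (the common DCGAN choice of kernel 4, stride 1 / padding 0 on the first layer, then stride 2 / padding 1); the template does not state them, and `generator_sizes` is a hypothetical helper name.

```python
def conv_transpose_out(size, kernel, stride, padding):
    """Output height/width of nn.ConvTranspose2d with dilation=1, output_padding=0."""
    return (size - 1) * stride - 2 * padding + kernel

# (kernel, stride, padding) per layer: assumed values, not taken from the template
ASSUMED_G_LAYERS = [(4, 1, 0), (4, 2, 1), (4, 2, 1), (4, 2, 1)]

def generator_sizes(layers=ASSUMED_G_LAYERS, start=1):
    """Trace the spatial size from the 1x1 latent input through every layer."""
    sizes = [start]
    for kernel, stride, padding in layers:
        sizes.append(conv_transpose_out(sizes[-1], kernel, stride, padding))
    return sizes

print(generator_sizes())  # [1, 4, 8, 16, 32]
```

If you pick different kernel or padding values, rerun the check to confirm the final size is still 32.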

Part 2: Discriminator Architecture (YOU COMPLETE)

TODO 2: Implement DCGAN discriminator with strided convolutions

python

class Discriminator(nn.Module):
    def __init__(self, ndf=64):
        """
        DCGAN Discriminator

        Args:
            ndf: Number of discriminator filters (base)
        """
        super().__init__()

        # TODO 2: Implement discriminator architecture
        # Input: (batch, 3, 32, 32) - RGB image
        # Output: (batch, 1) - Real/Fake score
        #
        # Architecture (DCGAN paper):
        # Conv2d: 3 → ndf (16×16) [NO BatchNorm on first layer]
        # LeakyReLU(0.2)
        # Conv2d: ndf → ndf*2 (8×8)
        # BatchNorm + LeakyReLU(0.2)
        # Conv2d: ndf*2 → ndf*4 (4×4)
        # BatchNorm + LeakyReLU(0.2)
        # Conv2d: ndf*4 → 1 (1×1)
        # Sigmoid (probability real)

        # Your code here
        pass

    def forward(self, x):
        # TODO 2: Implement forward pass
        # x: (batch, 3, 32, 32)
        # Returns: Real/fake score (batch, 1)

        # Your code here
        pass
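
The discriminator mirrors the generator, so the same kind of arithmetic check applies to the strided convolutions in TODO 2. This is a minimal sketch that assumes kernel 4, stride 2, padding 1 for the three downsampling layers and kernel 4, stride 1, padding 0 for the last layer. Those values and the name `discriminator_sizes` are assumptions, not part of the template.

```python
def conv_out(size, kernel, stride, padding):
    """Output height/width of nn.Conv2d with dilation=1."""
    return (size + 2 * padding - kernel) // stride + 1

# (kernel, stride, padding) per layer: assumed values, not taken from the template
ASSUMED_D_LAYERS = [(4, 2, 1), (4, 2, 1), (4, 2, 1), (4, 1, 0)]

def discriminator_sizes(layers=ASSUMED_D_LAYERS, start=32):
    """Trace the spatial size from the 32x32 image down to the 1x1 score map."""
    sizes = [start]
    for kernel, stride, padding in layers:
        sizes.append(conv_out(sizes[-1], kernel, stride, padding))
    return sizes

print(discriminator_sizes())  # [32, 16, 8, 4, 1]
```

The last convolution leaves a tensor of shape (batch, 1, 1, 1), so the forward pass still has to flatten it to the (batch, 1) shape the TODO asks for.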

Part 3: GAN Loss Functions (YOU COMPLETE)

TODO 3: Implement minimax loss for generator and discriminator

python

def discriminator_loss(d_real_output, d_fake_output):
    """
    Discriminator loss: maximize log(D(x)) + log(1 - D(G(z)))

    Args:
        d_real_output: Discriminator scores on real images
        d_fake_output: Discriminator scores on fake images

    Returns:
        d_loss: Discriminator loss
    """
    # TODO 3a: Implement discriminator loss
    # Real loss: BCE(d_real_output, ones)  - want D(real) = 1
    # Fake loss: BCE(d_fake_output, zeros) - want D(fake) = 0
    # Total: real_loss + fake_loss

    # Your code here
    pass

def generator_loss(d_fake_output):
    """
    Generator loss.
    Minimax (saturating) form: minimize log(1 - D(G(z)))
    Non-saturating form (used in TODO 3b): minimize -log(D(G(z))),
    which is the same as maximizing log(D(G(z)))

    Args:
        d_fake_output: Discriminator scores on fake images

    Returns:
        g_loss: Generator loss
    """
    # TODO 3b: Implement generator loss
    # Non-saturating GAN: BCE(d_fake_output, ones) - want D(G(z)) = 1

    # Your code here
    pass
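
Two numbers help when reading these losses, and both can be worked out without PyTorch. First, a discriminator that outputs 0.5 for every image has a total loss of 2·ln 2 ≈ 1.386. Second, the two generator losses behave very differently when D confidently rejects a fake. The sketch below works with a single scalar logit `a`, where D(G(z)) = sigmoid(a). The helper names are hypothetical and the functions are for illustration only, not a replacement for TODO 3.

```python
import math

def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))

def bce(p, target):
    """Binary cross-entropy for one probability p and a 0/1 target."""
    return -(target * math.log(p) + (1 - target) * math.log(1 - p))

# D at chance level: real_loss + fake_loss = 2 * ln(2)
d_loss_at_chance = bce(0.5, 1) + bce(0.5, 0)

def saturating_grad(a):
    """Derivative of log(1 - sigmoid(a)) with respect to the logit a."""
    return -sigmoid(a)

def non_saturating_grad(a):
    """Derivative of -log(sigmoid(a)) with respect to the logit a."""
    return -(1.0 - sigmoid(a))

a = -6.0  # D is confident the sample is fake
print(f"D loss at chance: {d_loss_at_chance:.3f}")
print(f"saturating grad: {saturating_grad(a):.4f}")
print(f"non-saturating grad: {non_saturating_grad(a):.4f}")
```

With a strongly negative logit, the saturating gradient is close to zero while the non-saturating gradient stays close to -1. That difference is why TODO 3b uses the non-saturating form.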

Part 4: Training Loop (YOU COMPLETE)

TODO 4: Implement adversarial training loop

python

def train_gan(generator, discriminator, dataloader, num_epochs=25):
    """
    Train GAN with adversarial training

    Training procedure:
    1. Update Discriminator:
       - Compute D loss on real images
       - Compute D loss on fake images (from G)
       - Backprop and update D
    2. Update Generator:
       - Generate fake images
       - Compute G loss (fool D)
       - Backprop and update G
    """
    # Optimizers
    g_optimizer = torch.optim.Adam(generator.parameters(), lr=0.0002, betas=(0.5, 0.999))
    d_optimizer = torch.optim.Adam(discriminator.parameters(), lr=0.0002, betas=(0.5, 0.999))

    for epoch in range(num_epochs):
        for i, (real_images, _) in enumerate(dataloader):
            batch_size = real_images.size(0)
            real_images = real_images.to(device)

            # TODO 4a: Update Discriminator
            # Step 1: Discriminator loss on real images
            #   - Forward pass: d_real = discriminator(real_images)
            #   - Real loss
            # Step 2: Discriminator loss on fake images
            #   - Generate fakes: z = randn, fake_images = generator(z)
            #   - Forward pass: d_fake = discriminator(fake_images.detach())
            #   - Fake loss
            # Step 3: Total D loss and update
            #   - d_loss = real_loss + fake_loss
            #   - Backprop and step d_optimizer

            # Your code here for discriminator update

            # TODO 4b: Update Generator
            # Step 1: Generate fake images
            #   - z = randn, fake_images = generator(z)
            # Step 2: Generator loss
            #   - Forward pass: d_fake = discriminator(fake_images)
            #   - g_loss = generator_loss(d_fake)
            # Step 3: Update
            #   - Backprop and step g_optimizer

            # Your code here for generator update

            # Log progress
            if i % 100 == 0:
                print(f"[{epoch}/{num_epochs}][{i}/{len(dataloader)}] "
                      f"D_loss: {d_loss.item():.4f} G_loss: {g_loss.item():.4f}")
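
The `.detach()` call in TODO 4a is easy to overlook. The sketch below uses two tiny stand-in networks (`toy_g` and `toy_d` are hypothetical, not the DCGAN models). It shows that the discriminator step leaves the generator's gradients untouched, while the generator step sends gradients through D back into G.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Tiny stand-ins for the real networks (illustration only)
toy_g = nn.Linear(4, 8)
toy_d = nn.Sequential(nn.Linear(8, 1), nn.Sigmoid())
bce = nn.BCELoss()

z = torch.randn(16, 4)
fake = toy_g(z)

# Discriminator step: detach() cuts the graph, so backward() stops at `fake`
d_fake = toy_d(fake.detach())
d_loss_fake = bce(d_fake, torch.zeros_like(d_fake))
d_loss_fake.backward()
g_grads_after_d_step = [p.grad for p in toy_g.parameters()]

# Generator step: no detach, so gradients flow through D into G
toy_d.zero_grad()
g_loss = bce(toy_d(fake), torch.ones_like(d_fake))
g_loss.backward()
g_grads_after_g_step = [p.grad for p in toy_g.parameters()]

print(all(g is None for g in g_grads_after_d_step))      # G untouched by the D step
print(all(g is not None for g in g_grads_after_g_step))  # G receives gradients
```

The generator step also fills D's gradients, which is one reason to zero the discriminator's gradients before each D update.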

Part 5: Training Stabilization (YOU COMPLETE)

TODO 5: Implement stabilization techniques

python

def train_gan_stable(generator, discriminator, dataloader, num_epochs=25):
    """
    Train GAN with stabilization techniques
    """
    # TODO 5: Add stabilization techniques
    # 1. Label smoothing: real labels = 0.9 instead of 1.0
    # 2. Noisy labels: flip labels occasionally (5%)
    # 3. Gradient penalty: Add to discriminator loss
    # 4. Feature matching: Match discriminator features (optional)

    # Example: Label smoothing
    real_label = 0.9
    fake_label = 0.0

    for epoch in range(num_epochs):
        for i, (real_images, _) in enumerate(dataloader):
            # TODO 5: Implement stabilized training
            # Same structure as TODO 4, but with:
            # - Smoothed labels
            # - Optional label flipping
            # - Gradient penalty added to the D loss (technique 3 above)

            # Your code here
            pass
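
Label smoothing and occasional label flipping both come down to how the BCE target tensor is built. A minimal sketch follows, using the 0.9 / 0.0 labels and the 5% flip rate from the TODO comments. `make_targets` is a hypothetical helper name, and the (batch, 1) target shape is an assumption chosen to match the discriminator output.

```python
import torch

def make_targets(batch_size, is_real, real_label=0.9, fake_label=0.0,
                 flip_prob=0.05, generator=None):
    """Build BCE targets with one-sided label smoothing and random flips."""
    real = torch.full((batch_size, 1), real_label)
    fake = torch.full((batch_size, 1), fake_label)
    targets, other = (real, fake) if is_real else (fake, real)
    # Each target is swapped to the opposite label with probability flip_prob
    flip = torch.rand(batch_size, 1, generator=generator) < flip_prob
    return torch.where(flip, other, targets)

rng = torch.Generator().manual_seed(0)
real_targets = make_targets(128, is_real=True, generator=rng)
print(real_targets.shape, real_targets.unique())
```

One reasonable design choice, stated here as an assumption, is to use these noisy targets only in the discriminator update and keep the generator's targets at 1.0.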

Part 6: Mode Collapse Detection (PRE-BUILT)

Pre-built utilities to detect and visualize mode collapse:

Sample diversity metrics
Class distribution analysis (for CIFAR-10)
Inception Score computation
Visual inspection tools

Features:

Generate 1000 samples
Classify with pre-trained model
Plot class distribution
Alert if <50% classes represented
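
The class-distribution check described above can be sketched in a few lines. The pre-built utilities may work differently. This version assumes you already have a list of predicted class indices (0-9) for the generated samples, and `class_coverage` is a hypothetical name.

```python
from collections import Counter

CIFAR10_CLASSES = ["airplane", "automobile", "bird", "cat", "deer",
                   "dog", "frog", "horse", "ship", "truck"]

def class_coverage(predicted_labels, num_classes=10, min_fraction=0.5):
    """Return (classes represented, per-class fractions, collapse flag)."""
    counts = Counter(predicted_labels)
    total = len(predicted_labels)
    represented = sum(1 for c in range(num_classes) if counts[c] > 0)
    distribution = {c: counts[c] / total for c in range(num_classes)}
    # Flag collapse when fewer than min_fraction of the classes appear at all
    collapsed = represented < min_fraction * num_classes
    return represented, distribution, collapsed

# Made-up predictions: 823 automobiles, the rest in one arbitrary class
labels = [1] * 823 + [9] * 177
represented, distribution, collapsed = class_coverage(labels)
print(represented, collapsed)
print(f"{CIFAR10_CLASSES[1]}: {distribution[1]:.1%}")
```

Counting represented classes catches only severe collapse. A generator can cover all 10 classes and still produce near-identical images within each class, so visual inspection remains useful.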

Part 7: Conditional GAN (YOU COMPLETE)

TODO 6: Implement conditional GAN for class-specific generation

python

class ConditionalGenerator(nn.Module):
    def __init__(self, num_classes=10, latent_dim=100, ngf=64):
        super().__init__()

        # TODO 6a: Modify generator to accept class labels
        # Approach 1: Concatenate z and one-hot label
        # Approach 2: Conditional batch normalization

        # Your code here
        pass

    def forward(self, z, labels):
        # TODO 6a: Forward pass with conditioning
        # z: (batch, latent_dim)
        # labels: (batch,) - class indices

        # Your code here
        pass

class ConditionalDiscriminator(nn.Module):
    def __init__(self, num_classes=10, ndf=64):
        super().__init__()

        # TODO 6b: Modify discriminator to accept class labels
        # Approach: Concatenate image and class embedding

        # Your code here
        pass

    def forward(self, x, labels):
        # TODO 6b: Forward pass with conditioning
        # Your code here
        pass
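
Approach 1 (concatenating z with a one-hot label) is mostly a matter of getting tensor shapes right. The sketch below shows the shape handling only, not the full conditional networks. `concat_label` and `label_planes` are hypothetical helper names. Broadcasting one-hot planes onto the image is used here as a simpler stand-in for the learned class embedding mentioned in TODO 6b.

```python
import torch
import torch.nn.functional as F

def concat_label(z, labels, num_classes=10):
    """Append a one-hot class vector to z and reshape for a conv generator."""
    one_hot = F.one_hot(labels, num_classes=num_classes).float()
    zc = torch.cat([z, one_hot], dim=1)      # (batch, latent_dim + num_classes)
    return zc.unsqueeze(-1).unsqueeze(-1)    # (batch, latent_dim + num_classes, 1, 1)

def label_planes(labels, num_classes=10, size=32):
    """Broadcast one-hot labels to (batch, num_classes, size, size) planes."""
    one_hot = F.one_hot(labels, num_classes=num_classes).float()
    return one_hot[:, :, None, None].expand(-1, -1, size, size)

z = torch.randn(5, 100)
labels = torch.tensor([0, 3, 9, 3, 1])
images = torch.randn(5, 3, 32, 32)

g_input = concat_label(z, labels)
d_input = torch.cat([images, label_planes(labels)], dim=1)
print(g_input.shape)  # torch.Size([5, 110, 1, 1])
print(d_input.shape)  # torch.Size([5, 13, 32, 32])
```

With this approach the first layer of the generator takes latent_dim + num_classes input channels, and the first layer of the discriminator takes 3 + num_classes.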

Expected Results

Part 1-4: Basic GAN Training

Training Progress (CIFAR-10, 25 epochs):

[0/25][0/782] D_loss: 1.42 G_loss: 0.89
[5/25][0/782] D_loss: 0.68 G_loss: 1.35
[10/25][0/782] D_loss: 0.52 G_loss: 1.78
[20/25][0/782] D_loss: 0.47 G_loss: 1.92

✓ Discriminator loss settles near 0.5 rather than collapsing to 0 (neither network overpowers the other)
✓ Generator loss increases (harder to fool D)
✓ Training stable (no divergence)

Generated Samples (Epoch 25):

✓ Recognizable objects (cars, planes, animals)
✓ Realistic colors and textures
✓ Sharper than VAE outputs
✓ Some artifacts but generally good quality

Part 5: Stabilized Training

With Stabilization Techniques:

✓ Smoother loss curves
✓ Fewer training failures
✓ Better sample diversity
✓ Faster convergence

Comparison:

Standard GAN: 30% training runs diverge
Stabilized GAN: 5% training runs diverge

✓ 6× fewer diverged runs (30% → 5%)

Part 6: Mode Collapse Analysis

Healthy Training:

Generated 1000 samples
Class distribution:
- Airplane: 105 (10.5%)
- Automobile: 98 (9.8%)
- Bird: 102 (10.2%)
- ...
- Truck: 95 (9.5%)

✓ All 10 classes represented
✓ Balanced distribution
✓ No mode collapse detected

Mode Collapse Example:

Generated 1000 samples
Class distribution:
- Airplane: 0 (0%)
- Automobile: 823 (82.3%)  ← Mode collapse!
- Bird: 0 (0%)
- ...

✗ Only 2/10 classes generated
✗ Mode collapse detected!

Part 7: Conditional GAN

Class-Specific Generation:

Generate 10 "airplane" images:
✓ All 10 are recognizable airplanes
✓ Different poses/colors (diversity)

Generate 10 "dog" images:
✓ All 10 are recognizable dogs
✓ Different breeds/positions

✓ Conditional generation works!

Success Criteria

Your implementation is complete when:

Generator produces recognizable 32x32 CIFAR-10 images
Discriminator loss stabilizes around 0.5-0.7
Training completes 25 epochs without diverging
Generated samples are sharper than VAE outputs (visual comparison)
No mode collapse (all 10 CIFAR-10 classes represented)
Conditional GAN generates correct class-specific images

Tips for Success

GAN Training Debugging

Common Failures:

1. Generator Collapse (outputs noise):

Symptom: G loss stuck at high value, D loss -> 0
Cause: Discriminator too strong, generator can't learn
Fix:
- Train D less frequently (1 D update per 2 G updates)
- Use weaker discriminator architecture
- Add noise to D inputs

2. Mode Collapse:

Symptom: Generator outputs same image repeatedly
Cause: Generator finds "easy" solution to fool D
Fix:
- Minibatch discrimination
- Feature matching
- Use diverse training data

3. Training Instability:

Symptom: Losses oscillate wildly
Cause: Non-convergent adversarial dynamics
Fix:
- Label smoothing
- Lower learning rates
- Gradient penalty (WGAN-GP)

4. Vanishing Gradients:

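Symptom: Saturating G loss log(1 - D(G(z))) -> 0 and stops changing, but samples are bad
Cause: D rejects fakes with high confidence, so its sigmoid saturates and passes almost no gradient to G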
Fix:
- Use non-saturating loss: -log(D(G(z)))
- Wasserstein GAN (WGAN)

Architecture Guidelines (DCGAN)

DO:

✅ Use strided convolutions (no pooling)
✅ Use BatchNorm (except D first layer, G output layer)
✅ Use ReLU in G, LeakyReLU(0.2) in D
✅ Use Tanh in G output, Sigmoid in D output

DON'T:

❌ Fully-connected layers (use convolutions)
❌ Max pooling (use strided conv)
❌ Batch norm on D input or G output

Hyperparameter Tuning

Parameter	Recommended	Effect
Latent dim	100	Standard for DCGAN
Learning rate	0.0002	Both G and D
Beta1 (Adam)	0.5	Lower than default (0.9)
Batch size	128	Larger = more stable
Epochs	25-50	More = better quality
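
Batch size also determines the iteration count printed in the training log. As a quick check, assuming the full 50,000-image CIFAR-10 training split and a dataloader that keeps the last partial batch, the number of batches per epoch is:

```python
import math

CIFAR10_TRAIN_SIZE = 50_000  # size of the standard CIFAR-10 training split

def batches_per_epoch(batch_size, dataset_size=CIFAR10_TRAIN_SIZE, drop_last=False):
    """Number of iterations the inner training loop runs per epoch."""
    if drop_last:
        return dataset_size // batch_size
    return math.ceil(dataset_size / batch_size)

for bs in (64, 128):
    print(bs, batches_per_epoch(bs))
```

A batch size of 64 gives the 782 iterations per epoch shown in the expected training log. The recommended 128 gives 391, so your log will differ if you change it.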

Extension Challenges

Challenge 1: Progressive GAN (Hard)

Implement progressive training (start 4x4, grow to 32x32):

python

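# Start with low resolution
# (dataloader_4x4 and dataloader_8x8 are placeholder names for loaders
#  that yield images downsampled to the current resolution)
train_gan(g_4x4, d_4x4, dataloader_4x4, num_epochs=10)

# Upsample and add layers
g_8x8 = grow_generator(g_4x4)
d_8x8 = grow_discriminator(d_4x4)
train_gan(g_8x8, d_8x8, dataloader_8x8, num_epochs=10)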

# Continue to 32×32

Benefit: More stable training, higher quality

Challenge 2: Wasserstein GAN (WGAN-GP) (Hard)

Replace BCE loss with Wasserstein distance:

python

import torch

# d_real / d_fake are raw critic scores (no Sigmoid on the output layer)
def wgan_d_loss(d_real, d_fake):
    return -torch.mean(d_real) + torch.mean(d_fake)

def wgan_g_loss(d_fake):
    return -torch.mean(d_fake)

def gradient_penalty(discriminator, real, fake):
    """Compute gradient penalty for WGAN-GP"""
    pass

Benefit: More stable, better convergence
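
One possible shape for the `gradient_penalty` stub is sketched below, following the usual WGAN-GP recipe. It samples points on the lines between real and fake images and penalizes critic gradient norms that differ from 1. It assumes image batches of shape (batch, C, H, W) and a critic without a Sigmoid output. Treat it as a starting point rather than the reference solution.

```python
import torch

def gradient_penalty(discriminator, real, fake):
    """Mean squared deviation of the critic's gradient norm from 1."""
    batch_size = real.size(0)
    # One random interpolation weight per sample, broadcast over C, H, W
    eps = torch.rand(batch_size, 1, 1, 1, device=real.device)
    interp = eps * real.detach() + (1 - eps) * fake.detach()
    interp.requires_grad_(True)

    scores = discriminator(interp)
    grads = torch.autograd.grad(
        outputs=scores,
        inputs=interp,
        grad_outputs=torch.ones_like(scores),
        create_graph=True,  # keep the graph so the penalty can be backpropagated
    )[0]

    grad_norm = grads.flatten(start_dim=1).norm(2, dim=1)
    return ((grad_norm - 1) ** 2).mean()
```

The penalty is added to the critic loss with a weight, for example `wgan_d_loss(d_real, d_fake) + 10 * gradient_penalty(discriminator, real, fake)`. The weight of 10 is the value used in the WGAN-GP paper.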

Challenge 3: Self-Attention GAN (SAGAN) (Hard)

Add self-attention layers to G and D:

python

class SelfAttention(nn.Module):
    def __init__(self, in_channels):
        super().__init__()
        # Compute attention maps
        # Apply to feature maps
        pass

Benefit: Better global coherence
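
A simplified self-attention layer in the spirit of SAGAN might look like the sketch below. It is not a faithful reproduction of the paper's layer. The reduction to `in_channels // 8` for queries and keys, and the learnable `gamma` initialized to zero, are design choices assumed here. `in_channels` must be at least 8.

```python
import torch
import torch.nn as nn

class SelfAttention(nn.Module):
    def __init__(self, in_channels):
        super().__init__()
        self.query = nn.Conv2d(in_channels, in_channels // 8, kernel_size=1)
        self.key = nn.Conv2d(in_channels, in_channels // 8, kernel_size=1)
        self.value = nn.Conv2d(in_channels, in_channels, kernel_size=1)
        # gamma starts at 0, so the layer is initially an identity mapping
        self.gamma = nn.Parameter(torch.zeros(1))

    def forward(self, x):
        b, c, h, w = x.shape
        q = self.query(x).flatten(2).transpose(1, 2)   # (b, hw, c//8)
        k = self.key(x).flatten(2)                     # (b, c//8, hw)
        v = self.value(x).flatten(2)                   # (b, c, hw)
        # attn[i, j]: how much position i attends to position j
        attn = torch.softmax(torch.bmm(q, k), dim=-1)  # (b, hw, hw)
        out = torch.bmm(v, attn.transpose(1, 2))       # (b, c, hw)
        return self.gamma * out.view(b, c, h, w) + x

layer = SelfAttention(16)
x = torch.randn(2, 16, 8, 8)
print(layer(x).shape)  # torch.Size([2, 16, 8, 8])
```

The attention matrix has (H·W)² entries per sample, so a layer like this is usually placed on a mid-resolution feature map such as 8×8 or 16×16 rather than on the full image.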

Challenge 4: FID Score Evaluation (Medium)

Compute Fréchet Inception Distance:

python

from scipy.linalg import sqrtm

def compute_fid(real_features, fake_features):
    """
    FID = ||mu_real - mu_fake||^2 + Tr(Sigma_real + Sigma_fake - 2*sqrt(Sigma_real * Sigma_fake))
    """
    pass

Lower FID = better quality
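
The formula in the docstring translates fairly directly to NumPy. The sketch below assumes `real_features` and `fake_features` are arrays of shape (num_samples, feature_dim) that have already been extracted from a pre-trained network. The feature extraction step is not shown.

```python
import numpy as np
from scipy.linalg import sqrtm

def compute_fid(real_features, fake_features):
    """Fréchet distance between Gaussians fitted to two feature sets."""
    mu_real = real_features.mean(axis=0)
    mu_fake = fake_features.mean(axis=0)
    sigma_real = np.cov(real_features, rowvar=False)
    sigma_fake = np.cov(fake_features, rowvar=False)

    covmean = sqrtm(sigma_real @ sigma_fake)
    if np.iscomplexobj(covmean):
        covmean = covmean.real  # drop tiny imaginary parts from numerical error

    mean_term = np.sum((mu_real - mu_fake) ** 2)
    cov_term = np.trace(sigma_real + sigma_fake - 2.0 * covmean)
    return float(mean_term + cov_term)

rng = np.random.default_rng(0)
features = rng.normal(size=(500, 4))
print(compute_fid(features, features))        # close to 0 for identical sets
print(compute_fid(features, features + 1.0))  # close to 4: mean shift of 1 in 4 dims
```

The covariance estimate needs many more samples than feature dimensions to be stable, so FID values computed from small sample sets should be read with caution.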

Submission Requirements

What to Submit

Completed Notebook: activity-11-generative-adversarial-networks.ipynb
- All code cells executed
- Training completed (25 epochs minimum)
- All TODOs completed
Generated Samples:
- Grid of 100 generated images (10x10)
- Training progression (epochs 1, 5, 10, 25)
- Conditional generation examples (all 10 classes)
- Mode collapse analysis results
Training Curves:
- Discriminator loss over time
- Generator loss over time
- Inception Score progression
Comparison:
- GAN vs VAE samples (side-by-side)
- Qualitative analysis of sharpness difference
Analysis (5-7 sentences):
- What stabilization techniques helped most?
- Did you encounter mode collapse? How did you fix it?
- When would you use GANs vs VAEs?

Submission Steps

Train GAN for 25 epochs
Generate samples and run analysis
Complete comparison with VAE (Activity 10)
Download notebook
Submit via [course portal link]

Resources

Documentation

DCGAN Paper (Radford et al., 2015)
PyTorch DCGAN Tutorial
GAN Hacks (soumith/ganhacks on GitHub): training tips

Papers

Original GAN Paper (Goodfellow et al., 2014)
Improved Training of GANs (Salimans et al., 2016)
Progressive GAN (Karras et al., 2017)

Adversarial training
Nash equilibrium
Mode collapse
Inception Score / FID

Next Steps

Next Activity: Activity 12 - Advanced GAN Architectures

StyleGAN for high-resolution faces
Conditional generation techniques
Image-to-image translation (pix2pix, CycleGAN)

Assessment

This activity is graded on:

Code Completion (35%): All TODOs implemented correctly
Training Success (30%): GAN converges, generates realistic images
Stability (20%): Training completes without divergence
Analysis (15%): Demonstrates understanding of GAN challenges

Passing Grade: 70% or higher

Congratulations on building your first GAN! 🎉🎨
