Project 3 of 7

Project 3: GAN Art Studio

Duration: 2 weeks
Points: 100
Prerequisites: Complete Lessons 11-12 (GANs, Advanced GAN Architectures)
Difficulty: Intermediate-Advanced

Project Overview

In this project, you'll build a generative art system using advanced GAN architectures. You'll train a StyleGAN-inspired model to generate high-quality, diverse artworks and create an interactive web application where users can explore the latent space, perform style mixing, and generate custom art pieces. This project demonstrates the creative potential of generative AI.

Why This Matters: GANs reshaped creative AI, powering applications such as This Person Does Not Exist, which is built on StyleGAN. This project gives you hands-on experience with the style-based techniques behind GAN-based generative art platforms.

What You'll Build:

StyleGAN-based generator with style injection
Progressive training pipeline for high-resolution generation
Interactive web app (Gradio/Streamlit) for art generation
Style mixing and latent space exploration features
Portfolio-ready art gallery with generated images

Learning Objectives

By completing this project, you will:

Implement an advanced GAN architecture (StyleGAN-inspired)
Train a GAN on an artistic dataset (faces, landscapes, or abstract art)
Apply progressive training and style-based generation
Create an interactive interface for latent space exploration
Evaluate GAN quality using FID, Inception Score, and human assessment
Deploy the art generation system as a web application

Requirements

Functional Requirements

Your GAN Art Studio must:

Generate high-quality images: Resolution >= 256x256, visually coherent
Support style mixing: Combine styles from multiple latent codes
Enable latent space exploration: Smooth interpolation between images
Achieve quality metrics:
- FID (Fréchet Inception Distance) < 50
- Inception Score > 5.0
- Human assessment score > 70% ("looks realistic/artistic")
Train on custom dataset: Choose artistic domain (faces, landscapes, abstract)
Interactive web interface: User-friendly UI for generating and exploring art
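
For reference, the two automated metrics in the list above have standard definitions. In the notation used here (ours, not the course's), μ and Σ are the mean and covariance of Inception features for real (r) and generated (g) images, and p(y|x) is the Inception classifier's label distribution for a generated image x:

```latex
\mathrm{FID} = \lVert \mu_r - \mu_g \rVert_2^2
  + \operatorname{Tr}\!\left( \Sigma_r + \Sigma_g - 2\left( \Sigma_r \Sigma_g \right)^{1/2} \right)

\mathrm{IS} = \exp\!\left( \mathbb{E}_{x \sim p_g}
  \left[ D_{\mathrm{KL}}\!\left( p(y \mid x) \,\Vert\, p(y) \right) \right] \right),
\qquad p(y) = \mathbb{E}_{x \sim p_g}\!\left[ p(y \mid x) \right]
```

Lower FID is better; higher Inception Score is better.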

Technical Requirements

Your implementation must include:

Generator: StyleGAN-inspired architecture with style injection layers
Discriminator: Progressive discriminator with minibatch standard deviation
Training: Wasserstein loss with gradient penalty (WGAN-GP), or non-saturating logistic loss with R1 regularization
Progressive Training (Optional): Start at low resolution, gradually increase
Metrics: FID and Inception Score calculation
Web App: Gradio or Streamlit interface with real-time generation
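
The minibatch standard deviation requirement for the discriminator can be sketched as a small layer. This is a minimal version that computes one statistic over the whole batch; the class name `MinibatchStdDev` and the `eps` default are assumptions for illustration, not part of the course materials.

```python
import torch
import torch.nn as nn


class MinibatchStdDev(nn.Module):
    """Appends one feature map holding the average per-feature std across the batch."""

    def __init__(self, eps: float = 1e-8):
        super().__init__()
        self.eps = eps

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: [batch, channels, height, width]
        batch, _, height, width = x.shape
        # Standard deviation of every feature over the batch dimension
        std = torch.sqrt(x.var(dim=0, unbiased=False) + self.eps)  # [C, H, W]
        # Collapse to a single scalar and broadcast it as one extra channel
        stat = std.mean().view(1, 1, 1, 1).expand(batch, 1, height, width)
        return torch.cat([x, stat], dim=1)
```

Placed near the end of the discriminator, the extra channel lets it notice when a batch of generated images has too little variation, which is one signal of mode collapse. The convolution that follows must accept one more input channel.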

Code Structure

project-03-gan-art-studio/
├── README.md                     # Project documentation
├── requirements.txt              # Python dependencies
├── stylegan_generator.py        # StyleGAN generator implementation
├── discriminator.py             # Discriminator architecture
├── train.py                     # Training script
├── metrics.py                   # FID and IS calculation
├── app.py                       # Gradio/Streamlit web interface
├── data/                        # Training dataset
├── models/                      # Saved checkpoints
│   └── generator_best.pth
├── generated/                   # Generated art gallery
│   ├── sample_001.png
│   └── ...
└── logs/                        # Training logs

Grading Rubric

Criterion	Points	Description
GAN Implementation	30	Correct StyleGAN-inspired architecture
Image Quality	25	Generated images are high-quality and diverse
Style Features	15	Style mixing and latent interpolation work
Web Application	15	Interactive, user-friendly interface
Metrics	10	FID and IS calculated correctly
Documentation	5	Clear README and art gallery
Total	100

Bonus Points (+10 each):

Implement full progressive training (64->128->256->512 resolution)
Add latent space disentanglement analysis (factor discovery)
Implement conditional generation (class-conditional or text-conditional)
Deploy to public website (Hugging Face Spaces, Vercel, Heroku)
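
For the progressive training bonus, the usual way to add a resolution is to fade the new layer in rather than switch to it abruptly. The sketch below shows only the blending step; the function name `fade_in` and the choice of nearest-neighbour upsampling are assumptions for illustration.

```python
import torch
import torch.nn.functional as F


def fade_in(rgb_low: torch.Tensor, rgb_high: torch.Tensor, alpha: float) -> torch.Tensor:
    """Blend the upsampled lower-resolution RGB output with the new higher-resolution one."""
    # Bring the old output up to the new resolution
    rgb_low_up = F.interpolate(rgb_low, size=rgb_high.shape[-2:], mode="nearest")
    # alpha = 0 uses only the old path, alpha = 1 uses only the new layer
    return (1.0 - alpha) * rgb_low_up + alpha * rgb_high
```

During training, `alpha` would typically ramp from 0 to 1 over a fixed number of images each time a resolution is added, with the discriminator mirroring the same blend on its input side.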

Milestones

Week 1: GAN Implementation and Training

Day 1-3: Dataset Preparation

Choose artistic domain (faces, landscapes, abstract art)
Collect and preprocess 5,000+ images
Verify data augmentation pipeline
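
The preprocessing and augmentation steps above might look like the following sketch, which assumes Pillow and NumPy are installed. The helper names `preprocess` and `random_hflip` are hypothetical, and horizontal flipping is just one augmentation that is usually safe for faces and landscapes.

```python
import numpy as np
from PIL import Image


def preprocess(img: Image.Image, size: int = 256) -> np.ndarray:
    """Center-crop to a square, resize, and scale pixels to [-1, 1] in CHW layout."""
    img = img.convert("RGB")
    w, h = img.size
    side = min(w, h)
    left, top = (w - side) // 2, (h - side) // 2
    img = img.crop((left, top, left + side, top + side))
    img = img.resize((size, size), Image.BICUBIC)
    arr = np.asarray(img, dtype=np.float32) / 127.5 - 1.0  # [0, 255] -> [-1, 1]
    return arr.transpose(2, 0, 1)                           # HWC -> CHW


def random_hflip(arr: np.ndarray, rng: np.random.Generator, p: float = 0.5) -> np.ndarray:
    """Flip a CHW image left-to-right with probability p."""
    return arr[:, :, ::-1].copy() if rng.random() < p else arr
```

The [-1, 1] range matches the generator's tanh output, so real and generated images reach the discriminator on the same scale. Viewing a grid of augmented samples before training is a quick way to verify the pipeline.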

Day 4-6: StyleGAN Architecture

Implement style-based generator
Implement discriminator with minibatch std
Test forward pass and loss calculation

Day 7: Initial Training

Train GAN on 64x64 or 128x128 resolution
Monitor training stability (watch for mode collapse)
Save checkpoints

Deliverable: Working GAN generating 64x64 or 128x128 images

Week 2: Optimization and Deployment

Day 8-10: High-Resolution Training

Train on 256x256 resolution (or higher if GPU allows)
Tune hyperparameters (learning rate, regularization)
Generate large sample gallery
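
If you choose R1 regularization when tuning, the penalty is the squared gradient norm of the discriminator's output with respect to real images. A minimal sketch follows; the function name `r1_penalty` and the default `gamma=10.0` are assumptions to tune, not values prescribed by this project.

```python
import torch


def r1_penalty(discriminator, real_images: torch.Tensor, gamma: float = 10.0) -> torch.Tensor:
    """R1 regularizer: (gamma / 2) * E[ ||grad_x D(x)||^2 ], evaluated on real images only."""
    real_images = real_images.detach().requires_grad_(True)
    scores = discriminator(real_images)
    # create_graph=True keeps the penalty differentiable w.r.t. discriminator weights
    (grads,) = torch.autograd.grad(outputs=scores.sum(), inputs=real_images, create_graph=True)
    return 0.5 * gamma * grads.pow(2).flatten(1).sum(dim=1).mean()
```

The returned value is added to the discriminator loss. Some implementations apply it only every few steps to save compute, scaling `gamma` to compensate.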

Day 11-12: Interactive Features

Implement style mixing algorithm
Implement latent space interpolation
Calculate FID and Inception Score
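
Both metrics reduce to short computations once Inception outputs are available. The sketch below covers only that final step and assumes NumPy and SciPy are installed: `features` would be pooled activations and `probs` softmax outputs from a pretrained Inception-v3 network, which is not shown. The function names are hypothetical.

```python
import numpy as np
from scipy import linalg


def feature_statistics(features: np.ndarray):
    """Mean and covariance of an [N, D] feature matrix."""
    return features.mean(axis=0), np.cov(features, rowvar=False)


def frechet_distance(mu1, sigma1, mu2, sigma2) -> float:
    """Frechet distance between the Gaussians N(mu1, sigma1) and N(mu2, sigma2)."""
    diff = mu1 - mu2
    covmean = linalg.sqrtm(sigma1.dot(sigma2))
    if np.iscomplexobj(covmean):
        covmean = covmean.real  # drop tiny imaginary parts caused by numerical error
    return float(diff.dot(diff) + np.trace(sigma1) + np.trace(sigma2) - 2.0 * np.trace(covmean))


def inception_score(probs: np.ndarray, eps: float = 1e-12) -> float:
    """exp(mean KL(p(y|x) || p(y))) for an [N, classes] matrix of softmax outputs."""
    marginal = probs.mean(axis=0, keepdims=True)
    kl = (probs * (np.log(probs + eps) - np.log(marginal + eps))).sum(axis=1)
    return float(np.exp(kl.mean()))
```

FID estimates are biased when computed from few samples, so it is worth reporting how many real and generated images were used alongside the score.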

Day 13-14: Web Application and Portfolio

Build Gradio/Streamlit interface
Deploy locally or to cloud
Create art gallery and documentation

Deliverable: Complete GAN Art Studio with web interface and metrics

Implementation Guide

StyleGAN-Inspired Generator

python

import torch
import torch.nn as nn

class StyleBasedGenerator(nn.Module):
    def __init__(self, latent_dim=512, image_size=256):
        super().__init__()

        # Mapping network: z → w (style code)
        self.mapping = nn.Sequential(
            nn.Linear(latent_dim, 512),
            nn.LeakyReLU(0.2),
            nn.Linear(512, 512),
            nn.LeakyReLU(0.2),
            nn.Linear(512, 512),
        )

        # Synthesis network with style injection
        self.const_input = nn.Parameter(torch.randn(1, 512, 4, 4))

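        # Progressive layers: 4×4 → 8×8 → 16×16 → ... → 256×256
        # StyleBlock(in_channels, out_channels, resolution) is not defined in this
        # guide; you must implement it (upsample, convolve, inject the style w)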
        self.synthesis = nn.ModuleList([
            StyleBlock(512, 512, 8),    # 8×8
            StyleBlock(512, 512, 16),   # 16×16
            StyleBlock(512, 256, 32),   # 32×32
            StyleBlock(256, 128, 64),   # 64×64
            StyleBlock(128, 64, 128),   # 128×128
            StyleBlock(64, 32, 256),    # 256×256
        ])

        # To RGB conversion at each resolution
        self.to_rgb = nn.ModuleList([
            nn.Conv2d(512, 3, kernel_size=1),  # 8×8
            nn.Conv2d(512, 3, kernel_size=1),  # 16×16
            nn.Conv2d(256, 3, kernel_size=1),  # 32×32
            nn.Conv2d(128, 3, kernel_size=1),  # 64×64
            nn.Conv2d(64, 3, kernel_size=1),   # 128×128
            nn.Conv2d(32, 3, kernel_size=1),   # 256×256
        ])

    def forward(self, z):
        # Map latent code to style code
        w = self.mapping(z)  # [batch, 512]

        # Start from constant input
        x = self.const_input.repeat(z.size(0), 1, 1, 1)

        # Progressive synthesis with style injection
        for block in self.synthesis:
            x = block(x, w)

        # Only the final-resolution RGB head is used here; the earlier heads
        # exist so that outputs can be blended during progressive training
        img = self.to_rgb[-1](x)

        return torch.tanh(img)  # Output in [-1, 1]
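
The generator above relies on a `StyleBlock` that this guide does not define. One possible version, assuming the constructor signature `StyleBlock(in_channels, out_channels, resolution)` implied by the calls above and using adaptive instance normalization (AdaIN) for style injection, is sketched here. The `w_dim` parameter and the single shared noise strength are simplifications of ours.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class StyleBlock(nn.Module):
    """Hypothetical block: upsample -> conv -> noise -> LeakyReLU -> AdaIN."""

    def __init__(self, in_channels: int, out_channels: int, resolution: int, w_dim: int = 512):
        super().__init__()
        self.resolution = resolution
        self.conv = nn.Conv2d(in_channels, out_channels, kernel_size=3, padding=1)
        # One learned noise strength shared by all channels (StyleGAN learns one per channel)
        self.noise_strength = nn.Parameter(torch.zeros(1))
        self.norm = nn.InstanceNorm2d(out_channels)         # no affine parameters by default
        self.to_style = nn.Linear(w_dim, out_channels * 2)  # per-channel scale and bias

    def forward(self, x: torch.Tensor, w: torch.Tensor) -> torch.Tensor:
        # Upsample to this block's target resolution, then convolve
        x = F.interpolate(x, size=(self.resolution, self.resolution), mode="nearest")
        x = self.conv(x)
        # Per-pixel noise, broadcast across channels
        x = x + self.noise_strength * torch.randn_like(x[:, :1])
        x = F.leaky_relu(x, 0.2)
        # AdaIN: normalize each feature map, then rescale and shift using the style
        x = self.norm(x)
        scale, bias = self.to_style(w).chunk(2, dim=1)
        return x * (1.0 + scale[:, :, None, None]) + bias[:, :, None, None]
```

Because the style enters only through the per-channel scale and bias, swapping `w` between blocks changes the look of that resolution level without touching the others, which is what makes style mixing possible.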

Style Mixing

python

import torch


def style_mixing(generator, z1, z2, mix_layers=3):
    """
    Mix styles from two latent codes
    z1: First latent code (coarse features: overall structure)
    z2: Second latent code (fine features: colors, textures)
    mix_layers: Number of layers to use z1 before switching to z2
    """
    # Map to style codes
    w1 = generator.mapping(z1)
    w2 = generator.mapping(z2)

    # Start with constant input
    x = generator.const_input.repeat(z1.size(0), 1, 1, 1)

    # Apply synthesis blocks with mixed styles
    for i, block in enumerate(generator.synthesis):
        # Use w1 for early layers (coarse), w2 for later layers (fine)
        w = w1 if i < mix_layers else w2
        x = block(x, w)

    # Final RGB conversion
    img = generator.to_rgb[-1](x)
    return torch.tanh(img)
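
The requirements also call for smooth interpolation between images. One common choice for Gaussian latent codes is spherical interpolation (slerp), sketched below. The function names `slerp` and `interpolation_path` are hypothetical, and linear interpolation between mapped style codes w is a reasonable alternative.

```python
import torch


def slerp(z1: torch.Tensor, z2: torch.Tensor, t: float) -> torch.Tensor:
    """Spherical interpolation between two latent codes of shape [batch, dim]."""
    z1_n = z1 / z1.norm(dim=-1, keepdim=True)
    z2_n = z2 / z2.norm(dim=-1, keepdim=True)
    # Angle between the two codes, clamped for numerical safety
    omega = torch.acos((z1_n * z2_n).sum(dim=-1, keepdim=True).clamp(-1.0, 1.0))
    sin_omega = torch.sin(omega)
    # Fall back to linear interpolation when the codes are (nearly) parallel
    if torch.any(sin_omega.abs() < 1e-6):
        return (1.0 - t) * z1 + t * z2
    return (torch.sin((1.0 - t) * omega) / sin_omega) * z1 + (torch.sin(t * omega) / sin_omega) * z2


def interpolation_path(z1: torch.Tensor, z2: torch.Tensor, steps: int = 8):
    """Latent codes evenly spaced along the slerp path, endpoints included."""
    return [slerp(z1, z2, i / (steps - 1)) for i in range(steps)]
```

Passing each code on the path through the generator yields the frames of a morph between the two images.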

Dataset Options

Recommended Datasets

Faces (Easy):

CelebA: 200K celebrity faces (aligned and cropped)
FFHQ: 70K high-quality faces (1024x1024)

Landscapes (Medium):

Landscape Pictures: 4K+ images of diverse landscapes
Places365: Subset of natural scenes

Abstract Art (Medium):

WikiArt: 80K artworks across styles and periods
Kaggle Painter by Numbers: 100K paintings

Custom (Hard):

Scrape images from Unsplash, DeviantArt, ArtStation
Minimum 5,000 images recommended

Hyperparameter Recommendations

Hyperparameter	Value	Notes
Learning rate (G)	1e-4	Adam, β1=0, β2=0.99
Learning rate (D)	4e-4	Higher than G so D learns faster (two time-scale update rule)
Batch size	16-32	Depends on GPU memory
Latent dim	512	Standard for StyleGAN
Gradient penalty λ	10	WGAN-GP coefficient
D steps per G step	1-5	Balance G and D
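
The gradient penalty coefficient in the table refers to the WGAN-GP term, which pushes the critic's gradient norm toward 1 at random blends of real and fake images. A minimal sketch follows; the function name `gradient_penalty` and its argument names are assumptions for illustration.

```python
import torch


def gradient_penalty(critic, real: torch.Tensor, fake: torch.Tensor,
                     lambda_gp: float = 10.0) -> torch.Tensor:
    """WGAN-GP term: lambda * E[(||grad D(x_hat)||_2 - 1)^2] on random real/fake blends."""
    batch = real.size(0)
    # One blend factor per sample, broadcast over channels and pixels
    eps = torch.rand(batch, 1, 1, 1, device=real.device)
    x_hat = (eps * real.detach() + (1.0 - eps) * fake.detach()).requires_grad_(True)
    scores = critic(x_hat)
    # create_graph=True keeps the penalty differentiable w.r.t. critic weights
    (grads,) = torch.autograd.grad(outputs=scores.sum(), inputs=x_hat, create_graph=True)
    grad_norm = grads.flatten(1).norm(2, dim=1)
    return lambda_gp * ((grad_norm - 1.0) ** 2).mean()
```

A critic step would then minimize `critic(fake).mean() - critic(real).mean() + gradient_penalty(critic, real, fake)`, repeated for the chosen number of D steps before each G step.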

Web Application (Gradio)

python

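import gradio as gr
import torch

# Assumes both names live in stylegan_generator.py; adjust to your layout
from stylegan_generator import StyleBasedGenerator, style_mixing

# Fall back to CPU when no GPU is available
device = 'cuda' if torch.cuda.is_available() else 'cpu'

# Load trained generator
generator = StyleBasedGenerator()
generator.load_state_dict(torch.load("models/generator_best.pth", map_location=device))
generator.to(device)
generator.eval()

def generate_art(seed, use_style_mixing=False, style_mixing_seed=0):
    """Generate art from seed"""
    # Sliders may return floats; torch.manual_seed needs an int
    torch.manual_seed(int(seed))
    z = torch.randn(1, 512).to(device)

    # no_grad covers both branches so the output can be converted to NumPy
    with torch.no_grad():
        if use_style_mixing:
            torch.manual_seed(int(style_mixing_seed))
            z2 = torch.randn(1, 512).to(device)
            img = style_mixing(generator, z, z2, mix_layers=3)
        else:
            img = generator(z)

    # Convert to a uint8 HWC array, which gr.Image accepts
    img = (img[0].permute(1, 2, 0).cpu().numpy() + 1) / 2  # [-1,1] → [0,1]
    return (img * 255).astype('uint8')

# Gradio interface
interface = gr.Interface(
    fn=generate_art,
    inputs=[
        gr.Slider(0, 10000, step=1, label="Random Seed"),
        # A slider always returns a number, so a checkbox toggles style mixing
        gr.Checkbox(label="Enable style mixing"),
        gr.Slider(0, 10000, step=1, label="Style Mixing Seed"),
    ],
    outputs=gr.Image(label="Generated Art"),
    title="GAN Art Studio",
    description="Generate unique artworks using StyleGAN!",
)

interface.launch()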

Resources

Documentation

Gradio Docs - Web interface framework
PyTorch FID (mseitzer/pytorch-fid on GitHub) - FID calculation library

Research Papers

StyleGAN (Karras et al., 2019): A Style-Based Generator Architecture for Generative Adversarial Networks
StyleGAN2 (Karras et al., 2020): Analyzing and Improving the Image Quality of StyleGAN
WGAN-GP (Gulrajani et al., 2017): Improved Training of Wasserstein GANs

Submission Guidelines

Required Deliverables:

Code repository with generator, discriminator, training script
Trained model checkpoint (generator_best.pth)
Art gallery: 100 generated images showcasing diversity
Web application (deployed or local)
Technical report (FID, IS, analysis)

Deadline: 2 weeks from project start

Portfolio Presentation

Demo Website:

Title: "GAN Art Studio - AI-Generated Artistic Images"
Interactive widget for generating art
Gallery of best generated images
Technical highlights, using your own measured score: e.g., "StyleGAN with FID score of 42"

LinkedIn/Resume:

"Built generative art system using StyleGAN, achieving FID score of 42 on custom artistic dataset. Deployed interactive web application for latent space exploration and style mixing."

Good luck! You're building the technology behind modern generative art.

Related Projects:

Project 4 - VAE Latent Space Explorer (alternative generative model)