CS 5043: HW7: Conditional Generative Adversarial Networks

Assignment notes:

The Problem

For this assignment, we will create a network that generates synthetic satellite images given a semantic labeling of an image. Specifically, we will come back to the Chesapeake Bay data set. The inputs to our generator will be a 1-hot encoded representation of pixel-level class labels (7 classes in total) and a set of tensors that contain random values. In response, the generator will produce an RGB image that plausibly matches the semantic labels. Note that this is not a one-to-one mapping: for a single semantic input, the generator should produce different images over a set of queries. This is possible because the generator also receives random noise tensors as input: different noise samples for the same semantic labeling should yield different, but equally plausible, images.

Deep Learning Experiment

Implement a Generative Adversarial Network to solve this problem. This GAN requires the implementation of three different Keras Models; the two base models are very similar to what you implemented in HW 3 and 4:
  1. A discriminator model (a classifier!) that takes two input tensors: an RGB image and the semantic labeling of the image, and produces as output the probability that the input image is real and corresponds to the semantic input. We will train this model on its own.

  2. A generator model (a U-net!) that takes as input the semantic label image and several noise tensors, and produces an RGB image as output. We will not directly train this model - instead, it is just a means for producing synthetic images.

  3. A "meta-model" that combines the two. This meta-model takes as input the semantic label image and several noise tensors, and produces a scalar output probability that the generated image is real (and corresponds to the semantic input). We will use this model to train the generator.
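The three models above can be wired together in the usual conditional-GAN way. A minimal sketch is below; the helper name build_meta_model, the 64x64x7 semantic input shape, and the three noise-input shapes (two channels each) are assumptions for illustration, not the assignment's required API:

```python
from tensorflow import keras

def build_meta_model(generator: keras.Model,
                     discriminator: keras.Model) -> keras.Model:
    """Stack generator + (frozen) discriminator so that training the
    meta-model updates only the generator's weights."""
    # Freeze the discriminator *within this meta-model only*; the
    # stand-alone discriminator model is still trained separately.
    discriminator.trainable = False

    # Assumed shapes: 64x64 images, 7 semantic classes, 2 noise channels
    label_in = keras.Input(shape=(64, 64, 7), name='semantic_labels')
    noise_ins = [keras.Input(shape=(s, s, 2), name='noise_%d' % s)
                 for s in (64, 32, 16)]

    # Generated image, then the discriminator's opinion of it
    fake_image = generator([label_in] + noise_ins)
    p_real = discriminator([fake_image, label_in])

    meta = keras.Model(inputs=[label_in] + noise_ins, outputs=p_real)
    meta.compile(optimizer=keras.optimizers.Adam(1e-4),
                 loss='binary_crossentropy')
    return meta
```

Because the discriminator is frozen inside the meta-model, gradient descent on the meta-model's binary cross-entropy loss can only adjust the generator's weights.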

Data Set

We will use the same Chesapeake data loader as we did in HW 4. The provided notebook file gives an example of using this loader such that your data sets will be composed of batches of Image / Semantic Label pairs.

For the sake of computational feasibility, we will be using image_size=64 (i.e., 64x64 images).

Training Process

With GANs, we have two models that are attempting to satisfy competing objectives: the discriminator is trying to differentiate real from generated images, while the generator is attempting to produce images that fool the discriminator. Training the combined system requires us to alternate between one or more epochs of gradient descent on the discriminator and one or more epochs on the meta-model. This process is repeated over a large number of meta epochs.

As discussed in class (and as implemented in gan_train_loop.py), for each meta epoch we sample three batches from the training data set. These are:
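The authoritative batch composition is defined in the provided gan_train_loop.py. As one hedged sketch of the alternating scheme, the function below assumes three common conditional-GAN batch types (real images with matching labels, generated images, and real images with mismatched labels); the helper names sample_batch and sample_noise are hypothetical:

```python
import numpy as np

def meta_epoch(discriminator, generator, meta_model,
               sample_batch, sample_noise, batch_size=32):
    """One meta epoch: discriminator steps, then a generator step.

    Sketch only -- the actual batch types are defined in the
    provided gan_train_loop.py.
    """
    # Batch 1: real images with their true semantic labels (target = 1)
    real_imgs, labels = sample_batch(batch_size)
    d_real = discriminator.train_on_batch([real_imgs, labels],
                                          np.ones((batch_size, 1)))

    # Batch 2: generated images for the same labels (target = 0)
    fake_imgs = generator.predict([labels] + sample_noise(batch_size),
                                  verbose=0)
    d_fake = discriminator.train_on_batch([fake_imgs, labels],
                                          np.zeros((batch_size, 1)))

    # Batch 3: real images paired with shuffled labels (target = 0)
    mismatched = labels[np.random.permutation(batch_size)]
    d_mis = discriminator.train_on_batch([real_imgs, mismatched],
                                         np.zeros((batch_size, 1)))

    # Generator step: train through the meta-model, asking the (frozen)
    # discriminator to label the fakes as real (target = 1)
    g_loss = meta_model.train_on_batch([labels] + sample_noise(batch_size),
                                       np.ones((batch_size, 1)))
    return d_real, d_fake, d_mis, g_loss
```

Note the asymmetry in targets: the discriminator is trained toward the true real/fake answer, while the meta-model is trained toward the answer the generator wants the discriminator to give.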

U-Net Details

The U-net that you are implementing is similar to the one from HW 4. The key difference is that we are also bringing in random tensors. These random tensors should be connected to the decoder side of the U-net; I recommend doing this with concatenation (not addition). The connections should be:

For a 64x64 base image and two down and up stages in the U, you will need three random tensors whose rows and columns match the image size at each stage. Their shapes will be:

The training loop includes code that will generate the random numpy arrays of these shapes.
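As a hedged sketch of that sampling step (the channel count and the Gaussian distribution are assumptions here; the shapes produced by the provided training loop are authoritative):

```python
import numpy as np

def sample_noise(batch_size: int, base: int = 64, n_channels: int = 2):
    """Sample the three random tensors for a base x base U-net with two
    down/up stages: one tensor at full resolution plus one at each
    coarser stage (64x64, 32x32, 16x16 for base=64)."""
    sizes = (base, base // 2, base // 4)
    return [np.random.normal(size=(batch_size, s, s, n_channels))
            for s in sizes]

# On the decoder side, each noise tensor is concatenated (not added)
# with the feature tensor at the matching resolution, e.g.:
#   x = keras.layers.Concatenate()([x, noise_input_32])
```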

Experiments

Settle on a complete architecture implementation. You are welcome to use your HW 3 and HW 4 model-building implementations as starting points, or you may use the provided network_support.py implementation tools. Get this model working well.

Next, make one change to this model and train it to its best performance. Possible experiments to try include:

Once you have completed your experiment with the two models, produce the following:
  1. Figure 0a,b,c,d: Model architectures from plot_model() (including the two different generators).

  2. Figure 1a,b: for each generator, show an interesting set of 25 examples (i.e., there should be good variety in the semantic label inputs, but there should also be a couple of examples where the semantic labels are identical).

  3. Figure 2a,b: for each discriminator, show the distribution of the output probabilities. For each discriminator, there should be three histograms (one for each batch type).

  4. Reflection: answer the following questions:
    1. Describe your experimental modification to the model.

    2. Describe how this modification changed the model's performance. Answer in detail.
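For Figure 2, one way to overlay the per-batch-type probability distributions is sketched below (the function name and the batch-type labels passed in are hypothetical; the exact batch types come from your training loop):

```python
import matplotlib
matplotlib.use('Agg')  # headless rendering for saved figures
import matplotlib.pyplot as plt
import numpy as np

def plot_discriminator_histograms(probs_by_batch: dict, fname: str):
    """Overlaid histograms of discriminator output probabilities,
    one histogram per batch type.

    probs_by_batch maps a batch-type name (e.g., 'real') to an array
    of discriminator outputs for that batch type.
    """
    fig, ax = plt.subplots()
    for name, probs in probs_by_batch.items():
        ax.hist(np.ravel(probs), bins=20, range=(0, 1),
                alpha=0.5, label=name)
    ax.set_xlabel('Discriminator output P(real)')
    ax.set_ylabel('Count')
    ax.legend()
    fig.savefig(fname)
    plt.close(fig)
```

A well-trained discriminator/generator pair should show the histograms pushed toward each other rather than cleanly separated at 0 and 1.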


What to Hand In

Turn in a single zip file that contains:

Grading

Hints

Frequently Asked Questions


andrewhfagg -- gmail.com

Last modified: Sat Apr 20 14:08:55 2024