CS 5043: HW6: Advanced RNNs and Attention

Assignment notes:

The Problem

We are solving the same problem as in the previous homework assignment, but this time with advanced RNN-style and attention-based architectures.

Data Set

The data are the same as in HW 5.

Deep Learning Experiment

Objective: Create an advanced RNN and an Attention-based model to perform the amino acid family classification. Implement the following two architectures (a sketch of both appears after this list):

  1. Stack of one or more GRU/LSTM layers
  2. Stack of Attention layers. I recommend investigating tf.keras.layers.MultiHeadAttention
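
Neither architecture is fully specified above, so here is a minimal sketch of
both in tf.keras, under stated assumptions: the vocabulary size, sequence
length, class count, and layer widths below are placeholders, not values from
the assignment, so adapt them to the HW 5 data.

    import tensorflow as tf
    from tensorflow.keras import layers

    n_tokens = 25    # assumption: amino-acid vocabulary size
    len_max = 100    # assumption: maximum (padded) sequence length
    n_classes = 46   # assumption: number of protein families
    d_model = 64     # embedding / model width

    # Architecture 1: embedding followed by a stack of GRU layers.
    inp = layers.Input(shape=(len_max,), dtype='int32')
    x = layers.Embedding(n_tokens, d_model, mask_zero=True)(inp)
    x = layers.GRU(64, return_sequences=True)(x)  # pass full sequence onward
    x = layers.GRU(64)(x)                         # last layer keeps final state only
    out = layers.Dense(n_classes, activation='softmax')(x)
    gru_model = tf.keras.Model(inp, out, name='gru_stack')

    # Attention is permutation-invariant, so give the tokens a learned
    # position embedding (same pattern as the Keras Transformer examples).
    class PositionEmbedding(layers.Layer):
        def __init__(self, len_max, d_model, **kwargs):
            super().__init__(**kwargs)
            self.pos_emb = layers.Embedding(len_max, d_model)

        def call(self, x):
            positions = tf.range(start=0, limit=tf.shape(x)[1], delta=1)
            return x + self.pos_emb(positions)

    # Architecture 2: embedding + a stack of self-attention blocks.
    inp = layers.Input(shape=(len_max,), dtype='int32')
    x = layers.Embedding(n_tokens, d_model)(inp)
    x = PositionEmbedding(len_max, d_model)(x)
    for _ in range(2):
        attn = layers.MultiHeadAttention(num_heads=4, key_dim=d_model)(x, x)
        x = layers.LayerNormalization()(x + attn)   # residual connection + norm
    x = layers.GlobalAveragePooling1D()(x)          # collapse the time axis
    out = layers.Dense(n_classes, activation='softmax')(x)
    attn_model = tf.keras.Model(inp, out, name='attention_stack')

For Figure 0a,b, tf.keras.utils.plot_model(model, to_file='arch.png',
show_shapes=True) will render each architecture diagram.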

Performance Reporting

Once you have selected a reasonable architecture and set of hyper-parameters, produce the following figures:
  1. Figure 0a,b: Network architectures from plot_model()

  2. Figure 1: Training set Accuracy as a function of epoch for each of the five rotations.

  3. Figure 2: Validation set Accuracy as a function of epoch for each of the rotations.

  4. Figure 3: Scatter plot of Test Accuracy for the GRU model vs. the Attention model (one point per rotation).

  5. Figure 4: Scatter plot of the number of training epochs for the GRU model vs. the Attention model (one point per rotation).

  6. Reflection: answer the following questions:
    1. For your Multi-Headed Attention implementation, explain how you translated the output of your final layer into a probability distribution over the classes

    2. Is there a difference in performance between the two model types?

    3. How much computation did you need to train each model type?
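
As a sketch of how Figures 1 and 3 might be produced, the code below assumes
each rotation's results were pickled as a dict holding the Keras History dict
under 'history' and a scalar test accuracy under 'test_accuracy'; the file
name pattern and dictionary keys are assumptions, so adjust them to match
your own experiment driver.

    import pickle
    import matplotlib.pyplot as plt

    def load_results(pattern, n_rotations=5):
        """Load one pickled results dict per rotation (assumed file layout)."""
        results = []
        for r in range(n_rotations):
            with open(pattern % r, 'rb') as fp:
                results.append(pickle.load(fp))
        return results

    gru = load_results('results/gru_rot_%d.pkl')    # assumed file names
    attn = load_results('results/attn_rot_%d.pkl')

    # Figure 1: training-set accuracy vs. epoch, one curve per rotation.
    plt.figure()
    for r, res in enumerate(gru):
        plt.plot(res['history']['sparse_categorical_accuracy'],
                 label='rotation %d' % r)
    plt.xlabel('Epoch')
    plt.ylabel('Training accuracy')
    plt.legend()
    plt.savefig('figure1_gru.png')

    # Figure 3: test accuracy, GRU vs. Attention, one point per rotation.
    plt.figure()
    plt.scatter([res['test_accuracy'] for res in gru],
                [res['test_accuracy'] for res in attn])
    plt.plot([0, 1], [0, 1], 'k--')   # diagonal: equal performance
    plt.xlabel('GRU test accuracy')
    plt.ylabel('Attention test accuracy')
    plt.savefig('figure3.png')

Figure 2 follows the same pattern using the validation curves (e.g., the
'val_sparse_categorical_accuracy' history entry), and Figure 4 uses the
number of training epochs, i.e., the length of each rotation's history list.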


What to Hand In

Turn in a single zip file that contains:

Grading

References

Hints


andrewhfagg -- gmail.com

Last modified: Sun Apr 2 16:26:51 2023