CS 5043: HW0
Executing DL Experiments on the Supercomputer
Objectives:
- Implement a shallow network that is capable of learning to reproduce a function.
- Implement experiment control code that executes a single
instance of a learning run and stores the results in a pickle
file.
- Use the supercomputer to execute a set of experiments.
- Implement a tool that brings together the results from the
different experiments so they can be presented in a common
set of figures. (This can be executed locally or on the supercomputer.)
Assignment notes:
- Deadline: Thursday, February 9th @11:59pm. Solutions may be
submitted up to 48 hours late, with a 10% penalty for each 24
hours. Note that HW1 will be assigned and due on the original
schedule.
- This work is to be done on your own. While general discussion
about Python, TensorFlow and Keras is okay, sharing
solution-specific code is inappropriate. Likewise, you may not
download code solutions to this problem from the network.
Data Set
All data sets for the class are available on the supercomputer in the
directory /home/fagg/datasets/aml/. You do not need
to keep copies of these datasets in your own home directory on the
supercomputer.
For this homework assignment, we will be using hw0_dataset.pkl.
This file contains one Python object, a
dictionary, with two keys: ins and outs. The
following code snippet will load the data into your Python environment:
import pickle

# Load the dictionary holding the "ins" and "outs" arrays
with open("hw0_dataset.pkl", "rb") as fp:
    data = pickle.load(fp)
Notes:
- This code snippet assumes that the data file is in the same directory as
your notebook/python code.
- There is only a training set (no separate validation or test sets),
so you will need to "fake" a validation set in order to use
EarlyStopping (one way to do this is sketched after these notes).
- Take some time to visually examine the data.
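For example, one way to fake a validation set (a sketch, not a required
approach; the split fraction, patience, and epoch count are placeholder
values) is to let Keras carve one out of the training data with the
validation_split argument to fit(). Here, model is assumed to be the
network you will build in Part 1, and data is the dictionary loaded above:

import tensorflow as tf

# Stop training when the (faked) validation loss plateaus;
# patience is a placeholder value to tune
early = tf.keras.callbacks.EarlyStopping(monitor="val_loss",
                                         patience=100,
                                         restore_best_weights=True)

# validation_split holds out the last 20% of the examples
# (taken before shuffling) as a pseudo-validation set
history = model.fit(data["ins"], data["outs"],
                    epochs=5000,
                    validation_split=0.2,
                    callbacks=[early],
                    verbose=0)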
Part 1: Network
- Write a function that constructs a relatively shallow neural network
that can regenerate the output from the corresponding inputs.
- The first thing that you try will likely not work. You will
need to think about the appropriate non-linearities to use and
to play with the number of layers and neurons. (A starting-point
sketch follows this list.)
- Once you have settled on an architecture that works well,
proceed to the next part.
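As a hedged illustration only (the layer sizes, activation, and optimizer
settings here are assumptions to experiment with, not a prescribed
solution), such a function might look like:

import tensorflow as tf

def build_model(n_inputs, n_hidden, n_outputs, activation="tanh"):
    # One hidden layer of non-linear units, followed by a linear
    # output layer (appropriate for regression with MSE loss)
    model = tf.keras.Sequential([
        tf.keras.layers.Dense(n_hidden, activation=activation,
                              input_shape=(n_inputs,)),
        tf.keras.layers.Dense(n_outputs, activation=None),
    ])
    model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=0.001),
                  loss="mse")
    return model

If a single hidden layer does not drive the MSE low enough, widening it
or stacking a second hidden layer are natural next steps.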
Part 2: Multiple Runs
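Per the objectives, each experiment executes a single learning run and
stores its results in a pickle file. A minimal driver sketch follows;
the file names, results-dictionary layout, and command-line flag are all
hypothetical, build_model() is the Part 1 sketch above, and the ins/outs
arrays are assumed to be 2D (examples x features):

import argparse
import pickle
import tensorflow as tf

def execute_exp(exp_index):
    # Load the data set (see the Data Set section)
    with open("hw0_dataset.pkl", "rb") as fp:
        data = pickle.load(fp)

    # Hidden layer size (20) is a placeholder to tune
    model = build_model(data["ins"].shape[1], 20, data["outs"].shape[1])

    early = tf.keras.callbacks.EarlyStopping(monitor="val_loss",
                                             patience=100,
                                             restore_best_weights=True)
    history = model.fit(data["ins"], data["outs"], epochs=5000,
                        validation_split=0.2, callbacks=[early],
                        verbose=0)

    # Save what the aggregation tool will need later
    results = {"history": history.history,
               "predictions": model.predict(data["ins"]),
               "outs": data["outs"]}
    with open("hw0_results_%d.pkl" % exp_index, "wb") as fp:
        pickle.dump(results, fp)

if __name__ == "__main__":
    parser = argparse.ArgumentParser()
    parser.add_argument("--exp", type=int, default=0,
                        help="experiment index (0-9)")
    args = parser.parse_args()
    execute_exp(args.exp)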
Hints
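One way to launch all 10 experiments is a SLURM job array. This sketch
assumes the SLURM scheduler; the partition name, resource requests, and
script name (hw0.py) are placeholders for your own setup:

#!/bin/bash
# Hypothetical batch file: one array task per experiment
#SBATCH --partition=normal
#SBATCH --ntasks=1
#SBATCH --mem=1G
#SBATCH --time=00:30:00
#SBATCH --job-name=hw0
#SBATCH --output=hw0_exp%a_stdout.txt
#SBATCH --error=hw0_exp%a_stderr.txt
#SBATCH --array=0-9

# SLURM_ARRAY_TASK_ID (%a above) is the experiment index (0-9);
# the array also yields the 10 stdout files to hand in
python hw0.py --exp $SLURM_ARRAY_TASK_ID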
Expectations
- Terminal MSE for the individual runs should be very low. If
this is not the case, then go back to your network design.
- It is very hard to learn a network that generates the correct
output for every example. Getting a couple of examples wrong in
individual runs is okay.
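The figures to hand in (below) combine results across all of the runs.
A sketch of an aggregation tool, assuming the results files and
dictionary layout from the hypothetical Part 2 driver sketch above:

import pickle
import numpy as np
import matplotlib.pyplot as plt

losses = []
abs_errors = []
for i in range(10):
    with open("hw0_results_%d.pkl" % i, "rb") as fp:
        results = pickle.load(fp)
    losses.append(results["history"]["loss"])
    abs_errors.append(np.abs(results["predictions"]
                             - results["outs"]).ravel())

# Learning curves: one line per run, with labeled axes
plt.figure()
for i, loss in enumerate(losses):
    plt.plot(loss, label="run %d" % i)
plt.xlabel("Epoch")
plt.ylabel("Training MSE")
plt.legend()
plt.savefig("hw0_learning_curves.png")

# Histogram of absolute prediction errors, pooled across runs
plt.figure()
plt.hist(np.concatenate(abs_errors), bins=50)
plt.xlabel("Absolute prediction error")
plt.ylabel("Count")
plt.savefig("hw0_error_histogram.png")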
What to Hand-In
Submit a zip file to the HW0 section of Gradescope (enter via Canvas)
that contains:
- your python code,
- if you use a Jupyter notebook, then submit a PDF copy of the
notebook,
- In JupyterLab: use File / Save and Export Notebook As / PDF
- your batch file,
- the learning curve figure,
- the histogram, and
- the stdout files from each of the 10 experiments.
Grading
- 30 pts: low MSE for every run
- 20 pts: few absolute prediction errors greater than 0.1
- 20 pts: well structured code
- 10 pts: appropriate documentation
- 10 pts: figures are well formatted and have appropriate axis
labels
- 10 pts: 10 stdout files
- 10 pts (bonus): all absolute prediction errors less than 0.1
andrewhfagg -- gmail.com
Last modified: Thu Feb 2 13:28:25 2023