CS 5043: HW5: Probabilistic Neural Networks
DRAFT
Assignment notes:
- Deadline: Thursday, April 10th @11:59pm.
- Hand-in procedure: submit a zip file to Gradescope
- This work is to be done on your own. However, you may share
solution-specific code snippets in the open on
Slack (only!), but not full solutions. Downloading
solution-specific materials (text and code) is not allowed,
whether from web pages or from LLMs.
The Problem
The Oklahoma
Mesonet is a network of weather stations scattered across the
state of Oklahoma, with at least one station in each county. Each
station measures many different meteorological variables every 5
minutes. Our data set contains a single summary sample for each
of the 136 stations and each day from 1994 to 2000.
The measured variables are described in the
Mesonet
Daily Summary Data document. For this assignment we will be
predicting the rainfall for the day (RAIN) given the other measured variables
at the station. These variables are:
- TMAX
- TMIN
- TAVG
- DMAX
- DMIN
- DAVG
- VDEF
- SMAX
- SMIN
- SAVG
- BMAX
- BMIN
- BAVG
- HMAX
- HMIN
- HAVG
- PMAX
- PMIN
- PAVG
- MSLP
- AMAX
- ATOT
- WSMX
- WSMN
- WSPD
- WDEV
- WMAX
- 9AVG
- 2MAX
- 2MIN
- 2AVG
- 2DEV
- HDEG
- CDEG
- HTMX
- WCMN
The data set is available on SCHOONER:
/home/fagg/datasets/mesonet/allData1994_2000.csv
Supporting Code
The supporting code provides key
functionality.
- Loading Datasets
get_mesonet_folds(dataset_fname:str,
ntrain_folds: int = 6,
nvalid_folds: int = 1,
ntest_folds: int = 1,
rotation: int = 0)
- Returns numpy arrays: ins_training, outs_training, ins_validation,
outs_validation, ins_testing, outs_testing
- Each fold contains different Mesonet stations
- Sinh-Arcsinh Distribution Implementation
The SinhArcsinh class provides three key class methods:
- num_params() returns the number of parameters required
for this distribution. Each parameter will require one Tensor
(one each for mean, standard deviation, skewness, and
tailweight).
- create_layer() returns a proper Keras 3 Layer. This layer
is callable with a sequence of 4 Keras Tensors. The
implementation assumes that standard deviation and tailweight
are strictly positive. When passed TF Tensor data, the layer returns
TensorFlow Probability Distributions (not TF Tensors).
- mdn_loss(y, dist) can be used as the loss function when
compiling your outer model. It returns the negative log
likelihood of each true value (y) given the parameterized
distribution (dist).
- Probabilistic Neural Networks Demo: pnn-solution.ipynb
- Synthetic data
- Normal distribution
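The rotation scheme behind get_mesonet_folds can be illustrated with a small NumPy sketch. The helper below (make_rotation_folds) is hypothetical and shows only the fold logic: stations are dealt into groups, and the groups are assigned to train/validation/test roles at an offset set by the rotation, so that each fold contains different stations and no station appears in two splits. The real loader also parses the CSV and separates RAIN from the input variables.

```python
import numpy as np

def make_rotation_folds(station_ids, nfolds=8, ntrain=6, nvalid=1, ntest=1,
                        rotation=0):
    """Hypothetical sketch: split station ids into train/valid/test
    folds for one rotation. Stations are dealt into nfolds groups;
    the rotation shifts which groups play which role."""
    assert ntrain + nvalid + ntest == nfolds
    folds = [station_ids[i::nfolds] for i in range(nfolds)]
    order = [(rotation + i) % nfolds for i in range(nfolds)]
    train = np.concatenate([folds[i] for i in order[:ntrain]])
    valid = np.concatenate([folds[i] for i in order[ntrain:ntrain + nvalid]])
    test = np.concatenate([folds[i] for i in order[ntrain + nvalid:]])
    return train, valid, test

stations = np.arange(136)                    # the 136 Mesonet stations
tr, va, te = make_rotation_folds(stations, rotation=0)
# No station appears in more than one split
assert len(set(tr) & set(va)) == 0 and len(set(tr) & set(te)) == 0
```

With six rotations, each call shifts the fold roles by one, so different stations end up in the test split each time.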
Deep Learning Experiment
Construct a model that takes as input the data from a single Mesonet
station (a row in ins_*) and predicts a distribution of likely
rainfall measurements (the corresponding row in outs_*), conditioned
on the station data. Use the inner/outer model design, with the inner
model transforming the station data into a set of parameters for a
Sinh-Arcsinh distribution, and the outer model producing the
corresponding distribution as its output.
Model specifics:
- The Sinh-Arcsinh distribution has four input parameters:
mean, standard deviation, skewness and tailweight. Each of
these parameters is actually a vector -- one element for each
input example (specifically, these parameter values are
conditioned on the mesonet station data).
- The standard deviation and tailweight must be strictly positive.
Your inner model must enforce this (the standard is to use the
softplus non-linearity). The other two parameters are unbounded.
- Use negative log likelihood as the loss function.
- Make sure to allocate an appropriate set of hidden layers (and
hidden layer sizes) for your inner model.
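The positivity constraint can be seen in a NumPy-only toy (not the course's Keras code): a single linear layer stands in for the inner model's four parameter heads, and softplus maps the unconstrained standard-deviation and tailweight heads onto strictly positive values while leaving mean and skewness unbounded.

```python
import numpy as np

rng = np.random.default_rng(0)

def softplus(x):
    # Numerically stable log(1 + exp(x)); strictly positive for any real input
    return np.logaddexp(0.0, x)

# Toy "inner model": one linear layer with four one-unit heads,
# standing in for the Dense heads of the real Keras inner model
X = rng.normal(size=(5, 36))       # 5 examples, 36 Mesonet input variables
W = rng.normal(size=(36, 4)) * 0.1
raw = X @ W                        # unconstrained head outputs

mean       = raw[:, 0]             # unbounded
stddev     = softplus(raw[:, 1])   # must be strictly positive
skewness   = raw[:, 2]             # unbounded
tailweight = softplus(raw[:, 3])   # must be strictly positive

assert np.all(stddev > 0) and np.all(tailweight > 0)
```

Softplus is preferred over exp here because it grows linearly for large inputs, which keeps the gradients of the positive heads well behaved.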
Performance Reporting
Once you have selected a reasonable architecture and set of
hyper-parameters, perform six rotations of experiments. Produce the
following figures/results:
- Figure 0: Inner network architecture from plot_model().
- Figures 1a,b: Training and validation set negative log likelihood
as a function of epoch for each rotation (each figure has six
curves).
- Figure 2: Several time-series examples from a test data
set. Show observed precipitation, and curves for
distribution mean, and the 10, 25, 75, and 90th distribution percentiles.
Make sure to pick interesting time periods.
- Figures 3a,b,c,d: Combining the test data for all six rotations, show a
scatter plot of predicted mean, standard deviation, skewness,
and tailweight as a function of observed precipitation.
- Figure 4: For each rotation, compute the root mean squared difference between
the observed precipitation and both the median and the mean of the predicted
distribution. Show these RMSDs using a bar plot with twelve bars, organized
logically.
- Reflection:
- Discuss in detail how consistent your model performance
is across the different rotations.
- Given the time-series plots, describe and explain the
shape of the pdf and how it changes with time.
- Discuss how skewness is used by the model. Is there a
consistent variation in this distribution parameter?
- Discuss how tailweight is used by the model. Is there a
consistent variation in this distribution parameter?
- Is Sinh-Arcsinh an appropriate distribution for modeling
this particular phenomenon? Why or why not? (answer in detail)
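The Figure 4 computation can be sketched in NumPy. The arrays below are hypothetical stand-ins for one rotation: observed test-set rainfall, and per-example samples drawn from each predicted distribution (with TFP you could instead use the distribution's mean and quantile directly).

```python
import numpy as np

def rmsd(observed, predicted):
    """Root mean squared difference between two 1-D arrays."""
    return float(np.sqrt(np.mean((observed - predicted) ** 2)))

# Hypothetical stand-ins: observed rain and per-example draws from
# each test example's predicted distribution
rng = np.random.default_rng(1)
observed = rng.exponential(scale=2.0, size=100)
samples = rng.exponential(scale=2.0, size=(100, 1000))  # 1000 draws/example

pred_mean = samples.mean(axis=1)
pred_median = np.percentile(samples, 50, axis=1)

rmsd_mean = rmsd(observed, pred_mean)
rmsd_median = rmsd(observed, pred_median)
# One mean bar and one median bar per rotation -> twelve bars in Figure 4
```

The same percentile call with q = 10, 25, 75, 90 gives the curves needed for Figure 2.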
Hints
- You should be working from the structure of the PNN demo code
that we released this week (inner/outer model design, with the
outer model returning a probability distribution).
- model_outer.predict(...) will return a sequence of samples from the
learned distribution (one sample per example in the input,
conditioned on the corresponding input).
- model_outer(...) will return a sequence of distributions (one
distribution per example in the input, also conditioned on the
inputs).
- The range of the input variables varies dramatically depending
on the variable itself. Make sure to add a batch normalization step
between your inputs and your first hidden layer. You might
also see some benefit from batch normalization at other stages of
your network, or from kernel initializations that limit the
magnitude of your initial parameters (but don't set your weights
to zero).
- Be patient with your training: you can gain a lot from longer
training runs.
- See the SinhArcsinh distribution documentation for things you can do with parameterized distributions (useful for creating some of the figures)
- Remember that TensorFlow Probability is very sensitive to the
specific combination of package versions that are available in
the dnn environment.
- You won't need to use a GPU for this assignment
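The scaling issue behind the batch-normalization hint can be seen with a quick standardization sketch. Plain NumPy z-scoring is shown purely for illustration (the two columns are hypothetical stand-ins for Mesonet variables on very different scales); in the model itself, a BatchNormalization layer after the Input performs this per-feature rescaling on each batch.

```python
import numpy as np

rng = np.random.default_rng(2)
# Two hypothetical columns on very different scales, e.g. station
# pressure (~1000 hPa) next to a 0-1 fractional quantity
X = np.stack([rng.normal(1000.0, 5.0, size=200),
              rng.uniform(0.0, 1.0, size=200)], axis=1)

# Per-feature z-scoring: subtract the mean, divide by the std deviation
mu = X.mean(axis=0)
sigma = X.std(axis=0)
Z = (X - mu) / sigma

assert np.allclose(Z.mean(axis=0), 0.0, atol=1e-9)
assert np.allclose(Z.std(axis=0), 1.0, atol=1e-9)
```

Without this step, the large-magnitude features dominate the initial gradients and the negative log likelihood can be slow to come down or numerically unstable.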
What to Hand In
Turn in a single zip file that contains:
- All of your python code (.py) and any notebook files (.ipynb)
- Figures 0-4
- Reflection
Do not turn in pickle files.
Grading
- 10 pts: Clean, general code for model building (including
in-code documentation)
- 10 pts: Figure 0
- 10 pts: Figure 1
- 10 pts: Figure 2
- 10 pts: Figure 3
- 10 pts: Figure 4
- 15 pts: Reasonable test set performance for all rotations.
- 25 pts: Reflection
andrewhfagg -- gmail.com
Last modified: Wed Apr 2 00:10:26 2025