CS 5043: HW5: Probabilistic Neural Networks

DRAFT

Assignment notes:

The Problem

The Oklahoma Mesonet is a network of weather stations scattered across the state of Oklahoma, with at least one station in each county. Each station measures many different meteorological variables every 5 minutes. Our data set contains a single summary sample for each of the 136 stations and each day from 1994 to 2000.

The measured variables are described in the Mesonet Daily Summary Data document. For this assignment we will be predicting the rainfall for the day (RAIN) given the other measured variables at the station. These variables are:

The data set is available on SCHOONER: /home/fagg/datasets/mesonet/allData1994_2000.csv

Supporting Code

The supporting code provides key functionality.

Deep Learning Experiment

Construct a model that takes as input the data from a single Mesonet station (a row in ins_*) and predicts a distribution of likely rainfall measurements (the observed value is the corresponding row in outs_*), conditioned on that station data. Use the inner/outer model design: the inner model transforms the station data into a set of parameters for a Sinh-Arcsinh distribution, and the outer model produces the corresponding distribution as its output.
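As a rough illustration of what the distribution parameters mean, the sketch below evaluates the log density and the mean negative log-likelihood of a Sinh-Arcsinh distribution using only the Python standard library. It assumes a simplified parameterization, Y = loc + scale * sinh(tailweight * (asinh(Z) + skewness)) with Z standard normal; library implementations (for example, TensorFlow Probability's SinhArcsinh) may rescale the transform, so values need not match them exactly. The function names are hypothetical and are not part of the supporting code.

```python
import math


def sinh_arcsinh_log_prob(y, loc, scale, skewness, tailweight):
    """Log density under the ASSUMED simplified parameterization:
    Y = loc + scale * sinh(tailweight * (asinh(Z) + skewness)), Z ~ N(0, 1).
    """
    if scale <= 0 or tailweight <= 0:
        raise ValueError("scale and tailweight must be positive")
    x = (y - loc) / scale
    # Invert the transform to recover the standard-normal value z.
    u = math.asinh(x) / tailweight - skewness
    z = math.sinh(u)
    # Standard normal log density evaluated at z.
    log_phi = -0.5 * z * z - 0.5 * math.log(2.0 * math.pi)
    # Log of |dz/dy| from the change of variables.
    log_jacobian = (math.log(math.cosh(u))
                    - math.log(tailweight)
                    - 0.5 * math.log1p(x * x)
                    - math.log(scale))
    return log_phi + log_jacobian


def mean_negative_log_likelihood(ys, params):
    """Average negative log-likelihood over paired observations and
    (loc, scale, skewness, tailweight) tuples, one tuple per observation."""
    total = 0.0
    for y, (loc, scale, skewness, tailweight) in zip(ys, params):
        total -= sinh_arcsinh_log_prob(y, loc, scale, skewness, tailweight)
    return total / len(ys)
```

With skewness = 0 and tailweight = 1 this reduces to a normal distribution with mean loc and standard deviation scale, which is a convenient sanity check. In the assignment itself, the inner network would produce one parameter tuple per input row.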

Model specifics:

Performance Reporting

Once you have selected a reasonable architecture and set of hyper-parameters, perform six rotations of experiments. Produce the following figures/results:
  1. Figure 0: Inner network architecture from plot_model().

  2. Figures 1a,b: Training and validation set negative log-likelihood as a function of epoch for each rotation (each figure has six curves).

  3. Figure 2: Several time-series examples from a test data set. Show the observed precipitation, along with curves for the distribution mean and the 10th, 25th, 75th, and 90th distribution percentiles. Make sure to pick interesting time periods.

  4. Figures 3a,b,c,d: Combining the test data for all six rotations, show scatter plots of the predicted mean, standard deviation, skewness, and tailweight, each as a function of observed precipitation.

  5. Figure 4: For each rotation, compute the root mean squared difference (RMSD) between the observed precipitation and the predicted median, and between the observed precipitation and the predicted mean. Show these RMSDs using a bar plot with twelve bars (two per rotation). Organize logically.

  6. Reflection:
    1. Discuss in detail how consistent your model performance is across the different rotations.

    2. Given the time-series plots, describe and explain the shape of the pdf and how it changes with time.

    3. Discuss how skewness is used by the model. Is there a consistent variation in this distribution parameter?

    4. Discuss how tailweight is used by the model. Is there a consistent variation in this distribution parameter?

    5. Is Sinh-Arcsinh an appropriate distribution for modeling this particular phenomenon? Why or why not? (answer in detail)
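The percentile curves for Figure 2 and the RMSD values for Figure 4 can be sketched as follows. This is a minimal standard-library sketch, not the supporting code: the function names are hypothetical, and the quantile formula assumes the simplified parameterization Y = loc + scale * sinh(tailweight * (asinh(Z) + skewness)) with Z standard normal, which may differ from a library's scaling. Because that transform is monotonically increasing for positive scale and tailweight, the p-th quantile of Y is obtained by pushing the p-th standard normal quantile through it.

```python
import math
from statistics import NormalDist

# Percentiles requested for the time-series figure, plus the median.
PERCENTILES = (0.10, 0.25, 0.50, 0.75, 0.90)


def sinh_arcsinh_quantile(p, loc, scale, skewness, tailweight):
    """Quantile under the ASSUMED simplified parameterization:
    push the standard normal quantile through the monotone transform."""
    z = NormalDist().inv_cdf(p)
    return loc + scale * math.sinh(tailweight * (math.asinh(z) + skewness))


def percentile_curves(params):
    """Map each percentile to a list of values, one per time step.
    params holds one (loc, scale, skewness, tailweight) tuple per step."""
    return {p: [sinh_arcsinh_quantile(p, *row) for row in params]
            for p in PERCENTILES}


def rmsd(observed, predicted):
    """Root mean squared difference between two equal-length sequences."""
    if len(observed) != len(predicted):
        raise ValueError("sequences must have the same length")
    squared = [(o - q) ** 2 for o, q in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))
```

For Figure 4, rmsd would be called twice per rotation: once with the predicted medians (the 0.50 curve) and once with the predicted means, giving the twelve bars.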


Hints


What to Hand In

Turn in a single zip file that contains:

Do not turn in pickle files.

Grading


andrewhfagg -- gmail.com

Last modified: Wed Apr 2 00:10:26 2025