CS 5043: HW5: Probabilistic Neural Networks

Assignment notes:

The Problem

The Oklahoma Mesonet is a network of weather stations scattered across the state of Oklahoma, with at least one station in each county. Each station measures many different meteorological variables every 5 minutes. Our data set contains a single summary sample for each of the 136 stations and each day from 1994 to 2000.

The measured variables are described in the Mesonet Daily Summary Data document. For this assignment we will be predicting the rainfall for the day (RAIN) given the other measured variables at the station. These variables are:

The data set is available on SCHOONER: /home/fagg/datasets/mesonet/allData1994_2000.csv

Supporting Code

The supporting code provides key functionality.

Deep Learning Experiment

Construct a model that takes as input the daily summary data from a single Mesonet station (a row in ins_*) and predicts a distribution over likely rainfall measurements (a row in outs_*), conditioned on the station data. Use the inner/outer model design: the inner model transforms the daily summary station data into the parameters of a Sinh-Arcsinh distribution, and the outer model outputs the corresponding distribution.
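To make the role of the four distribution parameters concrete, here is a minimal pure-Python sketch of a Sinh-Arcsinh density. It assumes one common parameterization, X = loc + scale * sinh((asinh(Z) + skewness) * tailweight) with Z standard normal. Library implementations may scale or normalize differently, so this is an illustration and not the supporting code. All function names are hypothetical, and using softplus to keep scale and tailweight positive is an assumption, not a requirement stated in this assignment.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # base distribution Z ~ N(0, 1)


def softplus(x):
    """Map an unconstrained real value to a strictly positive one.

    Assumed here as one way to constrain scale and tailweight.
    """
    # Numerically stable form of log(1 + exp(x)).
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def sinh_arcsinh_log_pdf(x, loc, scale, skewness, tailweight):
    """Log density of X = loc + scale * sinh((asinh(Z) + skewness) * tailweight)."""
    y = (x - loc) / scale
    # Invert the transform to recover the base-normal value z.
    inner = math.asinh(y) / tailweight - skewness
    z = math.sinh(inner)
    # Change of variables: log |dz/dx|.
    log_jacobian = (math.log(math.cosh(inner)) - math.log(tailweight)
                    - 0.5 * math.log1p(y * y) - math.log(scale))
    log_phi = -0.5 * z * z - 0.5 * math.log(2.0 * math.pi)
    return log_phi + log_jacobian


def sinh_arcsinh_quantile(p, loc, scale, skewness, tailweight):
    """Quantile of X; valid because the transform is increasing
    when scale > 0 and tailweight > 0."""
    z = _STD_NORMAL.inv_cdf(p)
    return loc + scale * math.sinh((math.asinh(z) + skewness) * tailweight)


def mean_negative_log_likelihood(observations, params):
    """Average negative log likelihood over paired observations and
    (loc, scale, skewness, tailweight) tuples."""
    total = 0.0
    for x, (loc, scale, skewness, tailweight) in zip(observations, params):
        total -= sinh_arcsinh_log_pdf(x, loc, scale, skewness, tailweight)
    return total / len(observations)
```

With skewness = 0 and tailweight = 1 this density reduces to a normal distribution with mean loc and standard deviation scale, which is a convenient sanity check. The quantile function is the kind of calculation that yields percentile curves for a time-series plot.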

Model specifics:

Performance Reporting

Once you have selected a reasonable architecture and set of hyper-parameters, perform eight rotations of experiments. Produce the following figures/results:
  1. Figure 0: Inner network architecture from plot_model().

  2. Figures 1a,b: Training and validation set negative log likelihood as a function of epoch for each rotation (each figure has eight curves).

  3. Figure 2: Several time-series examples from a test data set. Show the observed precipitation along with curves for the distribution mean and the 10th, 25th, 75th, and 90th distribution percentiles. Make sure to pick interesting time periods.

  4. Figures 3a,b,c,d: Combining the test data for all eight rotations, show scatter plots of the predicted mean, standard deviation, skewness, and tailweight, each as a function of observed precipitation.

  5. Figure 4: For each rotation, compute the mean absolute difference (MAD) between the observed precipitation and both the median and the mean predicted precipitation. Show these MADs using a bar plot with sixteen bars (two per rotation). Organize logically.

  6. Reflection:
    1. Discuss in detail how consistent your model performance is across the different rotations.

    2. Given the time-series plots, describe and explain the shape of the pdf and how it changes with time.

    3. Discuss how skewness is used by the model. Is there a consistent variation in this distribution parameter?

    4. Discuss how tailweight is used by the model. Is there a consistent variation in this distribution parameter?

    5. Is Sinh-Arcsinh an appropriate distribution for modeling this particular phenomenon? Why or why not? (answer in detail)

    6. Are your models doing a good job at predicting precipitation? Justify your answer.
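
The mean absolute differences for Figure 4 can be computed along the following lines. This is a minimal sketch under stated assumptions: the dictionary keys 'observed', 'pred_median', and 'pred_mean' and all function names are hypothetical, and the per-rotation predictions are assumed to have already been extracted from the model as plain lists of numbers.

```python
def mean_absolute_difference(observed, predicted):
    """Mean of |observed - predicted| over paired samples."""
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)


def mad_by_rotation(rotations):
    """Return one (rotation, MAD vs. median, MAD vs. mean) tuple per rotation.

    rotations: list of dicts with hypothetical keys
    'observed', 'pred_median', and 'pred_mean'.
    """
    rows = []
    for r, data in enumerate(rotations):
        mad_median = mean_absolute_difference(data["observed"], data["pred_median"])
        mad_mean = mean_absolute_difference(data["observed"], data["pred_mean"])
        rows.append((r, mad_median, mad_mean))
    return rows


def bar_layout(rows):
    """Flatten the rows into parallel label/height lists,
    with the median and mean bars adjacent for each rotation."""
    labels, heights = [], []
    for r, mad_median, mad_mean in rows:
        labels.extend([f"R{r} median", f"R{r} mean"])
        heights.extend([mad_median, mad_mean])
    return labels, heights
```

Grouping the two bars of each rotation side by side is one logical organization; grouping all median bars and then all mean bars is another. The labels and heights can be passed directly to a bar-plot routine.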


Hints


What to Hand In

Turn in a single zip file that contains:

Do not turn in pickle files.

Grading


andrewhfagg -- gmail.com

Last modified: Wed Apr 2 23:03:39 2025