Appendices
Appendix 1: Source Code and Data Repository
The Python scripts and data associated with this tutorial are available in the GitHub repository (https://github.com/grezac/mississippi_forecasting_tutorial).
The project is structured around two main directories:
data
Contains the datasets used in the tutorial.
scripts
Contains the various Python scripts presented throughout the tutorial sections.
Repository structure
[root project directory]
├── .gitignore
├── LICENSE
├── README.md
├── requirements.txt
├── scripts/
│   ├── forecast_discharge_mono.py
│   ├── forecast_discharge_geozones.py
│   ├── forecast_discharge_multi.py
│   ├── generate_series.py
│   ├── get_discharge_stations.py
│   ├── stations_map.py
│   └── station_download_daily.py
├── data/
│   ├── ERA5/
│   │   ├── era5_z1.csv
│   │   └── era5_z3.csv
│   └── rivers/
│       ├── 05331000_daily_data.csv
│       ├── ...
│       └── stations_reminder.txt
└── assets/
    └── progressive_journey.jpg
Installation
Clone the repository:
git clone https://github.com/grezac/mississippi_forecasting_tutorial
In the folder created by Git, create a virtual environment (e.g., using: python -m venv .venv) and activate it.
Install the required Python dependencies:
pip install -r requirements.txt
Warning
Installing the dependencies with pip install -r requirements.txt is important for obtaining results similar or identical to those shown in this tutorial.
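One way to confirm that the active environment matches the dependency file is sketched below. It assumes that requirements.txt contains lines of the form name==version, which may not hold for every entry; lines without an exact pin are skipped, and extras such as name[extra] are not handled. The helper names parse_pins and find_mismatches are hypothetical and are not part of the repository.

```python
from importlib import metadata
from pathlib import Path


def parse_pins(text):
    """Return {package: version} for lines of the form 'name==version'."""
    pins = {}
    for raw in text.splitlines():
        # Drop trailing comments and environment markers, then whitespace.
        line = raw.split("#", 1)[0].split(";", 1)[0].strip()
        if "==" not in line:
            continue  # blank lines and unpinned or ranged requirements are skipped
        name, version = line.split("==", 1)
        pins[name.strip()] = version.strip()
    return pins


def find_mismatches(pins):
    """Map each package whose installed version differs to (wanted, installed)."""
    mismatches = {}
    for name, wanted in pins.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None  # package is absent from this environment
        if installed != wanted:
            mismatches[name] = (wanted, installed)
    return mismatches


if __name__ == "__main__":
    req_file = Path("requirements.txt")  # run from the project root directory
    if req_file.exists():
        found = find_mismatches(parse_pins(req_file.read_text()))
        for name, (wanted, installed) in found.items():
            print(f"{name}: expected {wanted}, found {installed or 'not installed'}")
```

If the script prints nothing, every exactly pinned package matches the installed version.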
Running the Scripts
From the project root directory, you can run the scripts using the following command:
python scripts/[script_name].py
Example:
python scripts/forecast_discharge_mono.py
Appendix 2: From Tutorial to an Operational Forecasting System
The objective of this tutorial is to illustrate the complete construction of a forecasting pipeline while keeping the methodology simple and reproducible. An operational system would warrant further work in several directions.
From a hydrological perspective, the information currently used could be enriched by incorporating additional upstream gauging stations or by considering variables other than discharge, such as gage height when available. The selection of stations itself could also be revisited in order to better capture the dynamics of the Mississippi basin and its major tributaries.
The meteorological component also offers many possibilities for improvement. Although ERA5 total precipitation provides a convenient and globally available dataset, an operational implementation in the United States could instead rely on observed precipitation products (for example NOAA or radar-based datasets). Additional variables, such as snow depth or indicators related to snowmelt, may also prove useful in regions where winter processes significantly influence river discharge. Likewise, the spatial extent of the meteorological aggregation could be optimized by testing smaller or differently defined bounding boxes.
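Testing alternative bounding boxes amounts to re-computing a spatial average over a different subset of grid cells. The sketch below is a minimal illustration with NumPy, assuming a gridded field with dimensions (time, latitude, longitude) and one-dimensional coordinate arrays. The function name bbox_mean and its parameters are hypothetical, and the optional cosine-of-latitude weighting is an added choice that may differ from the aggregation used in the tutorial scripts.

```python
import numpy as np


def bbox_mean(field, lats, lons, south, north, west, east, area_weighted=False):
    """Average a (time, lat, lon) field over the grid cells inside a bounding box.

    Returns one value per time step.
    """
    lat_mask = (lats >= south) & (lats <= north)
    lon_mask = (lons >= west) & (lons <= east)
    if not lat_mask.any() or not lon_mask.any():
        raise ValueError("bounding box contains no grid cells")

    # Select the latitude rows first, then the longitude columns.
    subset = field[:, lat_mask, :][:, :, lon_mask]

    if area_weighted:
        # On a regular lat/lon grid, cell area shrinks with cos(latitude).
        row_weights = np.cos(np.deg2rad(lats[lat_mask]))
    else:
        row_weights = np.ones(lat_mask.sum())
    weights = row_weights[:, None] * np.ones(lon_mask.sum())[None, :]

    return (subset * weights).sum(axis=(1, 2)) / weights.sum()
```

Calling the function with several candidate boxes on the same field yields one precipitation series per box, and each series can then be fed to the forecasting model for comparison.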
Finally, model evaluation could be refined beyond the global performance indicators presented in this tutorial. Assessing predictive skill separately for different seasons, high-flow events, low-flow periods, or flood episodes would provide a more complete picture of the model's operational value and help identify situations in which further improvements are needed.
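A stratified evaluation of this kind can be sketched with pandas. The example assumes a DataFrame indexed by date with two columns named observed and predicted (hypothetical names), defines high and low flows by quantiles of the observed discharge (the 0.9 and 0.1 thresholds are arbitrary), and reports RMSE together with the Nash-Sutcliffe efficiency. These metrics are chosen here for illustration and are not necessarily the indicators used in the tutorial.

```python
import numpy as np
import pandas as pd

# Meteorological seasons keyed by calendar month.
SEASONS = {12: "DJF", 1: "DJF", 2: "DJF", 3: "MAM", 4: "MAM", 5: "MAM",
           6: "JJA", 7: "JJA", 8: "JJA", 9: "SON", 10: "SON", 11: "SON"}


def rmse(obs, sim):
    """Root mean squared error, in the units of the discharge series."""
    obs, sim = np.asarray(obs, float), np.asarray(sim, float)
    return float(np.sqrt(np.mean((obs - sim) ** 2)))


def nse(obs, sim):
    """Nash-Sutcliffe efficiency: 1 is perfect, 0 is no better than the observed mean."""
    obs, sim = np.asarray(obs, float), np.asarray(sim, float)
    return float(1.0 - np.sum((obs - sim) ** 2) / np.sum((obs - obs.mean()) ** 2))


def stratified_scores(df, high_q=0.9, low_q=0.1):
    """Score predictions overall, per season, and for high- and low-flow days."""
    season = df.index.month.map(SEASONS)
    groups = {"all": df}
    for name in ("DJF", "MAM", "JJA", "SON"):
        groups[name] = df[season == name]
    groups["high_flow"] = df[df["observed"] >= df["observed"].quantile(high_q)]
    groups["low_flow"] = df[df["observed"] <= df["observed"].quantile(low_q)]

    records = []
    for name, g in groups.items():
        if len(g) < 2:
            continue  # too few points for a meaningful score
        records.append({"subset": name, "n": len(g),
                        "rmse": rmse(g["observed"], g["predicted"]),
                        "nse": nse(g["observed"], g["predicted"])})
    return pd.DataFrame(records).set_index("subset")
```

Note that the efficiency computed on a subset is relative to the mean of that subset, so values for narrow strata such as high flows tend to be lower than the global score and are best compared with each other rather than with the overall figure.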