Introduction and objective

In this tutorial, we build, step by step, a model that forecasts France's daily electricity consumption one day ahead (D+1). The goal is not only to achieve high performance but also to understand the underlying mechanisms that make it possible.

The data used comes from RTE (Réseau de Transport d’Électricité), the operator of the French electricity transmission network. Through its Eco2mix platform, RTE provides detailed public data on electricity production, consumption, and flows. This data is freely accessible, notably in the form of downloadable files containing high-frequency observations (every 15 minutes).

For our project, we will use the annual files from Eco2mix – Annuel Définitif, covering the 2012–2024 period. These files include a power consumption column sampled every 15 minutes and expressed in MW. From this data, we reconstruct a daily series: each power value is multiplied by 0.25 hours to convert it to energy in MWh, and the 96 quarter-hourly values of each day are then summed.
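The aggregation described above can be sketched in a few lines. This is a minimal illustration using only the standard library, not the project's actual script: the function name `daily_energy_mwh`, the `(timestamp, power_mw)` input layout, and the constant 50,000 MW test signal are all assumptions chosen to make the arithmetic easy to check.

```python
from collections import defaultdict
from datetime import datetime, timedelta

INTERVAL_HOURS = 0.25  # each observation covers 15 minutes


def daily_energy_mwh(samples):
    """Sum 15-minute power readings (MW) into daily energy (MWh).

    `samples` is an iterable of (timestamp, power_mw) pairs
    (a hypothetical layout, not the Eco2mix file format).
    """
    totals = defaultdict(float)
    for timestamp, power_mw in samples:
        # MW x 0.25 h = MWh delivered during this quarter hour
        totals[timestamp.date()] += power_mw * INTERVAL_HOURS
    return dict(sorted(totals.items()))


# Synthetic input: two days at a constant 50,000 MW (illustrative, not RTE data).
start = datetime(2024, 1, 1)
samples = [(start + timedelta(minutes=15 * i), 50_000.0) for i in range(2 * 96)]
daily = daily_energy_mwh(samples)
```

With a constant 50,000 MW load, each day sums to 50,000 × 0.25 × 96 = 1,200,000 MWh, which gives a quick sanity check on the unit conversion.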

You can access the RTE data download page here.

We have developed a Python script to consolidate the data over the 2012–2024 period: scripts/rte_with_preds.py. Running this script produces the file data/rte_daily_consumption_2012_2024.csv.

Our main focus is on the consumption_MWh column, which contains the recorded daily consumption for the 13 years under study (2012 through 2024). The analysis covers this entire period using a strict time-series split:

  • Training set: 2012 → 2021
  • Final test set: 2022 → 2024

This chronological split ensures that the model is evaluated on data it has never seen during training, thereby simulating real-world production conditions. It also helps detect data leakage or overfitting that might not surface with randomly shuffled cross-validation, where observations from the future can end up in the training folds. Finally, testing on a recent period (2022–2024) ensures that the measured performance reflects how the model handles the most current data dynamics.
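The split described above can be expressed as a simple date filter. The sketch below is a standard-library illustration under stated assumptions: `chronological_split`, the `(day, value)` row layout, and the placeholder values of 0.0 are hypothetical and stand in for the rows of the daily CSV file.

```python
from datetime import date, timedelta

TRAIN_END = date(2021, 12, 31)  # last day of the training period
TEST_START = date(2022, 1, 1)   # first day of the final test period


def chronological_split(rows):
    """Split (day, value) rows into a 2012-2021 train set and a 2022-2024 test set."""
    train = [row for row in rows if row[0] <= TRAIN_END]
    test = [row for row in rows if row[0] >= TEST_START]
    return train, test


# Placeholder daily series covering 2012-01-01 to 2024-12-31 (values are dummies).
first_day = date(2012, 1, 1)
n_days = (date(2024, 12, 31) - first_day).days + 1
rows = [(first_day + timedelta(days=i), 0.0) for i in range(n_days)]

train, test = chronological_split(rows)

# Leakage guard: every training day must precede every test day.
assert train[-1][0] < test[0][0]
```

Because the boundary is a fixed date rather than a random sample, no observation from 2022 onward can influence training, which is the property the evaluation relies on.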