Supervisor

Dr Matt Lemon

Programme

MSc in Data Analytics

Subject

Computer Science

Abstract

This study investigates the application of statistical, machine learning, and deep learning methods to detect monthly weather anomalies in Ireland between 1960 and 2024. Climate variability is intensifying globally, increasing the urgency of accurately detecting unusual weather events. Using publicly available Met Éireann data provided through the CSO PxStat Open Data Portal, the research applied comprehensive preprocessing, including Bayesian Ridge iterative imputation, temporal and seasonal feature engineering, and ensemble statistical anomaly labelling based on z-score, interquartile range, and rolling residual analysis. Four models were developed and evaluated: Isolation Forest, XGBoost, Long Short-Term Memory (LSTM) networks, and Gated Recurrent Units (GRU), each tested in baseline, hyperparameter-tuned, and other variants.
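
The ensemble labelling step can be illustrated with a minimal sketch. The abstract does not state the thresholds, the rolling window, or how the three detectors are combined, so the values below (a z-score cut-off of 3, Tukey fences at 1.5 times the IQR, a centred window of about 12 months, and a two-out-of-three majority vote) are assumptions made for illustration only, as are the function name `ensemble_labels` and the synthetic series.

```python
import math
import random
import statistics


def ensemble_labels(values, z_thresh=3.0, iqr_k=1.5, window=12,
                    resid_thresh=3.0, min_votes=2):
    """Return one boolean per value: True where >= min_votes detectors agree.

    All thresholds and the voting rule are illustrative assumptions.
    """
    n = len(values)

    # Detector 1: global z-score.
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    z_flags = [abs(v - mean) > z_thresh * std for v in values]

    # Detector 2: Tukey fences built from the interquartile range.
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - iqr_k * iqr, q3 + iqr_k * iqr
    iqr_flags = [v < low or v > high for v in values]

    # Detector 3: residual from a centred rolling mean.
    half = window // 2
    residuals = []
    for i, v in enumerate(values):
        neighbours = values[max(0, i - half):min(n, i + half + 1)]
        residuals.append(v - statistics.fmean(neighbours))
    resid_std = statistics.pstdev(residuals)
    resid_flags = [abs(r) > resid_thresh * resid_std for r in residuals]

    # Majority vote across the three detectors.
    votes = zip(z_flags, iqr_flags, resid_flags)
    return [sum(v) >= min_votes for v in votes]


# Synthetic monthly series (not thesis data): seasonal cycle plus noise,
# with one injected spike at index 60.
random.seed(0)
monthly = [50 + 10 * math.sin(2 * math.pi * t / 12) + random.gauss(0, 1)
           for t in range(120)]
monthly[60] = 200.0
labels = ensemble_labels(monthly)
```

Requiring agreement between detectors is one way to keep a single noisy rule from dominating the labels; the study's actual combination rule may differ.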

Results show that the hyperparameter-tuned XGBoost model achieved the best overall performance, with an F1-score of 0.9160, precision of 0.9524, recall of 0.8824, and accuracy of 0.9922, while class-weighted XGBoost improved recall to 0.9265 at the expense of precision, which dropped to 0.5833. Isolation Forest models prioritised recall (up to 0.7941) but suffered from low precision. LSTM and GRU models achieved perfect recall when tuned and class-weighted, but with extremely low precision: they predicted nearly all cases as anomalous, which limits their operational usefulness. SHAP analysis of XGBoost highlighted precipitation, the sunshine-to-precipitation ratio, and wind-related metrics as key anomaly drivers.
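
As a quick consistency check on the reported figures, F1 is the harmonic mean of precision and recall. The sketch below recomputes it from the rounded precision and recall quoted above; the helper name `f1_from` is hypothetical, and small differences from the reported F1 are expected because the inputs are already rounded to four decimals.

```python
def f1_from(precision, recall):
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Rounded values quoted in the abstract.
tuned_f1 = f1_from(0.9524, 0.8824)      # hyperparameter-tuned XGBoost
weighted_f1 = f1_from(0.5833, 0.9265)   # class-weighted XGBoost

print(f"tuned XGBoost F1 (recomputed): {tuned_f1:.4f}")
print(f"class-weighted XGBoost F1 (implied): {weighted_f1:.4f}")
```

The class-weighted F1 is not reported in the abstract; the value printed here is only what the quoted precision and recall imply, and it shows how the recall gain is outweighed by the loss in precision.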

The findings demonstrate that tree-based supervised models offer the best trade-off between accuracy, precision, and interpretability for Irish monthly anomaly detection, while recurrent deep learning models may require further optimisation or hybrid strategies to balance recall and precision. The study contributes a reproducible pipeline for environmental anomaly detection, relevant to climate adaptation planning and operational monitoring systems.

Date of Award

2025

Full Publication Date

2025

Access Rights

open access

Document Type

Capstone Project

Resource Type

thesis

Included in

Data Science Commons
