Supervisor
Kislay Raj
Programme
MSc in Data Analytics
Abstract
This study predicts electricity consumption using data analytics and ensemble learning methods, addressing fluctuations influenced by external economic factors. Gradient Boosting Regressor (GBR) and Random Forest Regressor (RFR) proved effective because they generalised well to new data. CRISP-DM served as the guiding methodology, supported by preprocessing steps: winsorisation to limit the influence of outliers, feature selection to refine the variable set, and scaling to standardise the data for improved model performance.
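As a rough illustration of the preprocessing named above, the sketch below winsorises a series at chosen percentiles and then standardises it. The 5th/95th percentile limits, the function names, and the sample values are assumptions made for illustration; the abstract does not state the limits or data actually used.

```python
from statistics import mean, pstdev


def percentile(sorted_vals, q):
    """Linear-interpolation percentile (q in [0, 100]) of a sorted list."""
    pos = (len(sorted_vals) - 1) * q / 100
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    return sorted_vals[lo] + (sorted_vals[hi] - sorted_vals[lo]) * (pos - lo)


def winsorise(values, lower_q=5, upper_q=95):
    """Clip values to the [lower_q, upper_q] percentile range (limits are assumed)."""
    ordered = sorted(values)
    low, high = percentile(ordered, lower_q), percentile(ordered, upper_q)
    return [min(max(v, low), high) for v in values]


def standardise(values):
    """Rescale to zero mean and unit (population) standard deviation."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


# Hypothetical readings: 98 typical values plus one extreme high and one extreme low.
consumption = [100.0 + (i % 10) for i in range(98)] + [5000.0, 1.0]

clipped = winsorise(consumption)   # extremes pulled back to the percentile limits
scaled = standardise(clipped)      # comparable scale for downstream models
```

Winsorising before scaling keeps a handful of extreme readings from dominating the mean and standard deviation used in the scaling step.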
The research used datasets covering non-residential clients and data centres, and explored consumption patterns through visualisations in Tableau. The analysis showed that Counties Dublin and Kildare were among the highest electricity consumers in Leinster from 2015 to 2022. Feature selection improved model accuracy by removing variables with low correlation to the target, while one-hot encoding and data scaling prepared the inputs for the regression models.
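A minimal sketch of correlation-based feature selection and one-hot encoding follows. The 0.3 threshold, the column names, and the toy values are hypothetical; the abstract does not report the cut-off or the variables used in the study.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient; returns 0.0 if either series is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)


def select_features(columns, target, threshold=0.3):
    """Keep columns whose absolute correlation with the target meets the threshold."""
    return [name for name, vals in columns.items()
            if abs(pearson(vals, target)) >= threshold]


def one_hot(values):
    """Expand one categorical column into 0/1 indicator columns."""
    return {f"is_{c}": [1 if v == c else 0 for v in values]
            for c in sorted(set(values))}


# Hypothetical toy data: one informative column and one uncorrelated column.
target = [10.0, 20.0, 30.0, 40.0, 50.0, 60.0]
columns = {
    "floor_area": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    "noise": [1.0, -1.0, 0.0, 0.0, -1.0, 1.0],
}

selected = select_features(columns, target)         # drops the uncorrelated column
encoded = one_hot(["Dublin", "Kildare", "Dublin"])  # county as indicator columns
```

Pearson correlation only captures linear association, so a filter like this can discard variables that tree-based models would still find useful; it is shown here only because the abstract describes a correlation-based filter.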
Results highlighted the predictive strength of ensemble methods, with GBR and RFR achieving high R² scores, low RMSE, and consistent cross-validation performance. GBR in particular performed reliably across data subsets, with closely matched training and testing accuracy. While the study reinforced existing insights into the capabilities of ensemble learning, it also demonstrated the practicality of these models for handling tabular data and extracting actionable findings from limited datasets.
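The evaluation described above could be reproduced along the following lines with scikit-learn. This is a sketch on synthetic data from `make_regression`, not the study's datasets; the default hyperparameters, the 80/20 split, and the 5-fold cross-validation are assumptions.

```python
from math import sqrt

from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import cross_val_score, train_test_split

# Synthetic stand-in for the tabular consumption data (assumption).
X, y = make_regression(n_samples=300, n_features=8, n_informative=5,
                       noise=10.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0)

models = {
    "GBR": GradientBoostingRegressor(random_state=0),
    "RFR": RandomForestRegressor(n_estimators=100, random_state=0),
}

results = {}
for name, model in models.items():
    # 5-fold cross-validated R² on the training split only.
    cv_r2 = cross_val_score(model, X_train, y_train, cv=5, scoring="r2")
    model.fit(X_train, y_train)
    test_pred = model.predict(X_test)
    results[name] = {
        "cv_r2_mean": float(cv_r2.mean()),
        "train_r2": r2_score(y_train, model.predict(X_train)),
        "test_r2": r2_score(y_test, test_pred),
        "test_rmse": sqrt(mean_squared_error(y_test, test_pred)),
    }

for name, scores in results.items():
    print(name, {k: round(v, 3) for k, v in scores.items()})
```

Comparing `train_r2` with `test_r2` is one simple way to check the balance between training and testing accuracy that the abstract mentions: a large gap would suggest overfitting.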
Date of Award
2024
Full Publication Date
2024
Access Rights
open access
Document Type
Capstone Project
Resource Type
thesis
Recommended Citation
Dominguez Alvarenga, Maria, "Application of Machine Learning algorithms to evaluate the changes in energy consumption in the Leinster area and subsequently the impact on consumer behaviour in the commercial sector." (2024). ICT. 53.
https://arc.cct.ie/ict/53