Supervisor
Taufique Ahmed
Programme
MSc in Data Analytics
Subject
Computer Science
Abstract
This study investigates the use of machine learning (ML) regression models to impute missing micronutrient values in Food Composition Databases (FCDBs), focusing on the FAO/INFOODS dataset. A cascading prediction methodology leverages nutrient interdependencies to systematically estimate missing values. Four models, Random Forest (RF), Support Vector Machine (SVM), Gradient Boosting Machines (GBM), and Deep Neural Networks (DNN), were evaluated using mean absolute error (MAE), mean squared error (MSE), root mean squared error (RMSE), and the coefficient of determination (R²). RF and GBM achieved the highest predictive accuracy for protein, phosphorus, calcium, and magnesium, demonstrating that ML-based predictive analytics can provide a more reliable alternative to traditional imputation methods. These findings support improved dietary assessments, nutritional research, and data-driven decision-making.
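For readers unfamiliar with the four evaluation metrics named in the abstract, the following is a minimal sketch of how they are conventionally computed. It is not taken from the thesis: the function name `regression_metrics` and the sample values are hypothetical and chosen only for illustration.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return MAE, MSE, RMSE and R² for two equal-length sequences of numbers.

    Hypothetical helper for illustration; not the thesis implementation.
    """
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and of equal length")

    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]

    mae = sum(abs(e) for e in errors) / n   # mean absolute error
    mse = sum(e * e for e in errors) / n    # mean squared error
    rmse = math.sqrt(mse)                   # root mean squared error

    # R² compares the residual sum of squares with the variance around the mean.
    mean_true = sum(y_true) / n
    ss_res = sum(e * e for e in errors)
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    r2 = 1.0 - ss_res / ss_tot if ss_tot > 0 else float("nan")

    return {"MAE": mae, "MSE": mse, "RMSE": rmse, "R2": r2}


# Made-up values (for example, mg of a nutrient per 100 g); not data from the study.
observed = [10.0, 20.0, 30.0, 40.0]
predicted = [12.0, 18.0, 33.0, 37.0]
print(regression_metrics(observed, predicted))
```

Lower MAE, MSE and RMSE indicate smaller prediction errors, while an R² closer to 1 indicates that the model explains more of the variance in the observed values.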
Date of Award
2025
Full Publication Date
2025
Access Rights
open access
Document Type
Capstone Project
Resource Type
thesis
Recommended Citation
Arenhart, C. (2025) Improving the Completeness of Food Composition Databases Using Predictive Analysis. CCT College Dublin.
DOI: https://doi.org/10.63227/652.299.109