Supervisor

Taufique Ahmed

Programme

MSc in Data Analytics

Subject

Computer Science

Abstract

This study investigates the use of machine learning regression models to impute missing micronutrient values in Food Composition Databases (FCDBs), focusing on the FAO/INFOODS dataset. A cascading prediction methodology exploits interdependencies between nutrients to estimate missing values systematically. Four models, Random Forest (RF), Support Vector Machine (SVM), Gradient Boosting Machines (GBM), and Deep Neural Networks (DNN), were evaluated using mean absolute error (MAE), mean squared error (MSE), root mean squared error (RMSE), and the coefficient of determination (R²). RF and GBM achieved the highest predictive accuracy for protein, phosphorus, calcium, and magnesium, indicating that ML-based predictive analytics can provide a more reliable alternative to traditional imputation methods. These findings support improved dietary assessments, nutritional research, and data-driven decision-making.
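
The four evaluation metrics named in the abstract can be illustrated with a minimal sketch. This is not the study's code: the function name `regression_metrics` and the sample values are hypothetical, chosen only to show how MAE, MSE, RMSE, and R² are computed from observed and predicted nutrient values.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return MAE, MSE, RMSE and R^2 for paired observed/predicted values."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]

    mae = sum(abs(e) for e in errors) / n   # mean absolute error
    mse = sum(e * e for e in errors) / n    # mean squared error
    rmse = math.sqrt(mse)                   # same units as the nutrient value

    # R^2 compares the residual error with the variance around the mean.
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    ss_res = sum(e * e for e in errors)
    r2 = 1.0 - ss_res / ss_tot if ss_tot > 0 else float("nan")

    return {"MAE": mae, "MSE": mse, "RMSE": rmse, "R2": r2}


# Hypothetical values (e.g. mg per 100 g), NOT taken from the study.
observed = [10.0, 20.0, 30.0, 40.0]
predicted = [12.0, 18.0, 33.0, 39.0]
metrics = regression_metrics(observed, predicted)
```

Lower MAE, MSE, and RMSE indicate smaller imputation errors, while R² closer to 1 indicates that the model explains more of the variation in the observed values.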

Date of Award

2025

Full Publication Date

2025

Access Rights

open access

Document Type

Capstone Project

Resource Type

thesis

Included in

Data Science Commons
