Feature importance analysis for enhanced interpretability of spectrophotometric Machine Learning (ML) models in water quality monitoring

 

Bibliographic details
Authors: Hernández-Alpízar, Laura; Gómez-Mejía, José Andrés
Format: original article
Status: Published version
Publication date: 2026
Description: Ultraviolet-visible (UV-Vis) spectrophotometry for real-time nitrate (NO3-) quantification in water is commonly affected by spectral interference from dissolved organic matter (DOM). This study evaluates machine learning (ML) models for this task, using feature importance analysis to enhance chemical interpretability and detect spectral interferences. Four models were compared on a dataset of 29 surface water samples: PCA-Random Forest (PCA-RF), PCA-XGBoost, full-spectrum RF (All-RF), and full-spectrum XGBoost (All-XGB). Leave-one-out cross-validation (LOOCV) showed no significant performance differences among the models (p = 0.182), with mean RMSE values between 0.6 and 0.8 mg/L. Nonetheless, feature importance analysis revealed that PCA-based models depend on variance rather than chemical relevance, which limits their reliability. The full-spectrum XGBoost model demonstrated superior spectral interpretability, identifying both the NO3- absorption peak (≈ 220 nm) and the DOM interference correction peak (≈ 260 nm). This suggests that XGBoost could be advantageous for continuous water monitoring systems because it can identify spectral interferences.
Country: Portal de Revistas TEC
Resources: Instituto Tecnológico de Costa Rica
Repository: Portal de Revistas TEC
Language: Spanish
OAI Identifier: oai:ojs.pkp.sfu.ca:article/8521
Online access: https://revistas.tec.ac.cr/index.php/tec_marcha/article/view/8521
Keywords: UV-Vis spectroscopy
nitrate
water
spectral interference
Random Forest
XGBoost
Espectroscopía UV-Vis
nitrato
agua
interferencia espectral
XGBoost
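
The workflow summarized in the description (a full-spectrum tree ensemble, LOOCV error, then feature importance per wavelength) can be sketched roughly as follows. This is a minimal illustration and not the authors' code: the spectra are synthetic, the band shapes, concentration ranges, noise level, and hyperparameters are all assumptions, and scikit-learn's RandomForestRegressor stands in for the All-RF model. Only the sample count (29) and the nitrate band position near 220 nm are taken from the record.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import LeaveOneOut, cross_val_predict

rng = np.random.default_rng(0)

# Synthetic stand-in for the study's spectra (every value here is invented).
wavelengths = np.arange(200, 305, 5)           # nm, 21 hypothetical channels
n_samples = 29                                 # sample count from the record
nitrate = rng.uniform(0.5, 10.0, n_samples)    # mg/L, assumed range
dom = rng.uniform(0.0, 5.0, n_samples)         # arbitrary units, assumed

# Assumed band shapes: a Gaussian nitrate band near 220 nm and a
# broad, exponentially decaying DOM background.
nitrate_band = np.exp(-0.5 * ((wavelengths - 220) / 8.0) ** 2)
dom_band = np.exp(-0.02 * (wavelengths - 200))

# Absorbance matrix: one row per sample, one column per wavelength.
X = (0.10 * np.outer(nitrate, nitrate_band)
     + 0.05 * np.outer(dom, dom_band)
     + rng.normal(0.0, 0.005, (n_samples, wavelengths.size)))
y = nitrate

model = RandomForestRegressor(n_estimators=100, random_state=0)

# Leave-one-out cross-validation: each sample is predicted by a model
# trained on the other 28 samples.
y_pred = cross_val_predict(model, X, y, cv=LeaveOneOut())
rmse = float(np.sqrt(np.mean((y - y_pred) ** 2)))

# Impurity-based feature importance from a model fitted on all samples.
model.fit(X, y)
importances = model.feature_importances_
top_wavelengths = wavelengths[np.argsort(importances)[::-1][:5]]

print(f"LOOCV RMSE: {rmse:.2f} mg/L")
print("Most important wavelengths (nm):", top_wavelengths.tolist())
```

Because each column of X is a wavelength, the importances can be read directly against the spectrum. In a PCA-based pipeline the importances would instead refer to principal components, which is the interpretability limitation the description points out.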