Feature importance analysis for enhanced interpretability of spectrophotometric Machine Learning (ML) models in water quality monitoring
| Authors: | , |
|---|---|
| Format: | Original article |
| Status: | Published version |
| Publication date: | 2026 |
| Description: | Ultraviolet-visible (UV-Vis) spectrophotometry for real-time quantification of nitrate (NO3-) in water is commonly affected by spectral interference from dissolved organic matter (DOM). This study evaluates machine learning (ML) models for this task, using feature importance analysis to enhance chemical interpretability and detect spectral interferences. Four models were compared on a dataset of 29 surface water samples: PCA-Random Forest (PCA-RF), PCA-XGBoost, full-spectrum RF (All-RF), and full-spectrum XGBoost (All-XGB). Leave-one-out cross-validation (LOOCV) showed no significant performance differences among the models (p = 0.182), with mean root-mean-square error (RMSE) values between 0.6 and 0.8 mg/L. Nonetheless, feature importance analysis revealed that the PCA-based models depend on variance rather than chemical relevance, which limits their reliability. The full-spectrum XGBoost model demonstrated superior spectral interpretability, identifying both the NO3- absorption peak (≈ 220 nm) and the DOM interference correction peak (≈ 260 nm). This suggests that XGBoost could be advantageous for continuous water monitoring systems because it can identify spectral interferences. |
| Country: | Portal de Revistas TEC |
| Institution: | Instituto Tecnológico de Costa Rica |
| Repository: | Portal de Revistas TEC |
| Language: | Spanish |
| OAI Identifier: | oai:ojs.pkp.sfu.ca:article/8521 |
| Online access: | https://revistas.tec.ac.cr/index.php/tec_marcha/article/view/8521 |
| Keywords: | UV-Vis spectroscopy; nitrate; water; spectral interference; Random Forest; XGBoost; Espectroscopía UV-Vis; nitrato; agua; interferencia espectral; XGBoost |
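
The description reports model performance as an RMSE obtained by leave-one-out cross-validation (LOOCV). The following is a minimal sketch of that evaluation scheme only, not the study's pipeline: it assumes a hypothetical single-wavelength linear calibration in place of the RF and XGBoost models, uses synthetic data rather than the 29 measured samples, and pools the squared errors across folds (the record does not say how the study aggregated them). The function names `fit_line` and `loocv_rmse` are illustrative.

```python
import math
import random


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def loocv_rmse(xs, ys):
    """Hold out each sample once, fit on the rest, pool the squared errors."""
    squared_errors = []
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        intercept, slope = fit_line(train_x, train_y)
        prediction = intercept + slope * xs[i]
        squared_errors.append((prediction - ys[i]) ** 2)
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Synthetic stand-in data (assumption: NOT the study's measurements).
random.seed(0)
nitrate = [random.uniform(0.5, 10.0) for _ in range(29)]  # mg/L
absorbance_220 = [0.05 * c + random.gauss(0, 0.01) for c in nitrate]

rmse = loocv_rmse(absorbance_220, nitrate)
print(f"LOOCV RMSE: {rmse:.2f} mg/L")
```

With only 29 samples, LOOCV uses every sample for testing exactly once while training on the other 28, which is why it suits small datasets like the one described here.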