Improving Balanced Accuracy for Minority Plant Species under Data Imbalance
| Authors: | , |
|---|---|
| Format: | Original article |
| Status: | Published version |
| Publication date: | 2024 |
| Description: | Despite the well-known success of deep learning in classification, such models are commonly evaluated with metrics that do not account for data imbalance, particularly at the level of per-class predictions, and therefore overlook minority classes. This is a problem because minority classes are often the hardest to collect data for and to predict. In the plant domain, for example, species with fewer samples are often the hardest to collect and predict in the field. As more and more plant species are identified, more of them become minority species, making accurate classification with traditional machine learning methods increasingly difficult. To address this issue, we explore combining traditional data and machine learning approaches with deep learning techniques such as self-supervision in a preprocessing stage. By using self-supervised training together with different sampling algorithms and class weights, we improved the balanced accuracy metric for minority plant species by between 7.9% and 13% without affecting general accuracy. This shows that combining deep learning techniques with traditional machine learning methods can help improve prediction accuracy for minority classes, even in domains where data is limited. |
| Country: | Portal de Revistas TEC |
| Institution: | Instituto Tecnológico de Costa Rica |
| Repository: | Portal de Revistas TEC |
| Language: | English |
| OAI Identifier: | oai:ojs.pkp.sfu.ca:article/7293 |
| Online access: | https://revistas.tec.ac.cr/index.php/tec_marcha/article/view/7293 |
| Keywords: | Imbalanced datasets; long-tail distribution; automatic plant identification; balanced metrics; deep learning; minority classes; classification |
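
The description contrasts general accuracy with balanced accuracy and mentions class weights, so a small illustration may help. The sketch below is not the authors' code: the function names, the toy labels, and the inverse-frequency weighting formula are assumptions chosen for illustration. It uses the common definition of balanced accuracy as the mean of per-class recall.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of all samples predicted correctly."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall, so each class counts equally regardless of size."""
    total = Counter(y_true)
    correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return sum(correct[c] / total[c] for c in total) / len(total)


def inverse_frequency_weights(y_true):
    """Assumed weighting scheme: n_samples / (n_classes * class_count)."""
    counts = Counter(y_true)
    n_samples, n_classes = len(y_true), len(counts)
    return {c: n_samples / (n_classes * counts[c]) for c in counts}


# Hypothetical toy data: 8 samples of a common species, 2 of a rare one.
y_true = ["common"] * 8 + ["rare"] * 2
# A degenerate classifier that always predicts the majority species.
y_pred = ["common"] * 10

acc = accuracy(y_true, y_pred)            # high, despite missing every rare sample
bal = balanced_accuracy(y_true, y_pred)   # exposes the failure on the rare class
weights = inverse_frequency_weights(y_true)

print(acc, bal, weights)
```

On this toy data the majority-only classifier reaches 0.8 accuracy but only 0.5 balanced accuracy, which is the kind of gap the description refers to. The weights (0.625 for the common class, 2.5 for the rare one) show how a loss function could be made to penalize minority-class errors more heavily.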