Uncertainty estimation for a speech recognition system
| Authors: | , |
|---|---|
| Format: | Original article |
| Status: | Published version |
| Publication date: | 2024 |
| Description: | Whisper is a speech recognition system developed by OpenAI, trained on 680,000 hours of multilingual, multitask supervised data collected from the web. This research adapts and applies Monte Carlo Dropout, together with the Levenshtein distance, to estimate the uncertainty of this system's transcriptions, using labeled Spanish audio contaminated with a certain amount of noise. Preliminary results show a linear relationship between the uncertainty estimate and the Word Error Rate (WER) of the transcriptions. Furthermore, the number of insertions or omissions in the transcriptions tends to be low. |
| Country: | Portal de Revistas TEC |
| Institution: | Instituto Tecnológico de Costa Rica |
| Repository: | Portal de Revistas TEC |
| Language: | Spanish |
| OAI Identifier: | oai:ojs.pkp.sfu.ca:article/7305 |
| Online access: | https://revistas.tec.ac.cr/index.php/tec_marcha/article/view/7305 |
| Keywords: | Uncertainty; Speech Recognition; ASR; Whisper; Monte Carlo Dropout; Incertidumbre; Reconocimiento de voz |
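
The description names Monte Carlo Dropout, the Levenshtein distance, and WER but does not say how they are combined. The sketch below is one plausible reading, not the authors' method: it assumes the same audio has already been transcribed several times with dropout left active, and it scores uncertainty as the mean pairwise word-level edit distance between those transcriptions. The function names (`levenshtein`, `wer`, `mc_dropout_uncertainty`), the pairwise normalization, and the sample sentences are all illustrative assumptions.

```python
from itertools import combinations


def levenshtein(a, b):
    """Edit distance between two sequences (insertions, deletions, substitutions)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # delete x
                            curr[j - 1] + 1,      # insert y
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word Error Rate: word-level edit distance divided by reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    return levenshtein(ref, hyp) / len(ref)


def mc_dropout_uncertainty(samples):
    """Mean pairwise normalized edit distance among stochastic transcriptions.

    Assumption: `samples` holds transcriptions of the same audio produced by
    repeated forward passes with dropout enabled. This normalization is a
    hypothetical choice, not taken from the article.
    """
    tokenized = [s.split() for s in samples]
    pairs = list(combinations(tokenized, 2))
    if not pairs:
        return 0.0
    total = 0.0
    for a, b in pairs:
        longest = max(len(a), len(b))
        total += levenshtein(a, b) / longest if longest else 0.0
    return total / len(pairs)


# Hypothetical transcriptions from four stochastic passes over one utterance.
samples = [
    "el gato come pescado",
    "el gato come pescado",
    "el gato come pescada",
    "el pato come pescado",
]
uncertainty = mc_dropout_uncertainty(samples)
error_rate = wer("el gato come pescado", samples[3])
```

Under this reading, a score near 0 means the stochastic passes agree with each other, and larger values mean they diverge. The reported linear relationship with WER would then be checked by comparing `uncertainty` against `error_rate` over many labeled utterances.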