Measuring the Quality of Low-Resourced Statistical Parametric Speech Synthesis Trained with Noise-Degraded Data (Supported by the University of Costa Rica)

 

Bibliographic Details
Author: Coto Jiménez, Marvin
Format: original article
Publication Date: 2022
Description: After the successful implementation of speech synthesis in several languages, the study of robustness has become an important topic, since it increases the possibility of building voices from non-standard sources, e.g. historical recordings, children's speech, and data freely available on the Internet. This work measures the influence of noise in the source speech on an HMM-based statistical parametric speech synthesis system, for the case of a low-resourced database. For this purpose, three types of additive noise were applied to the source speech data at five signal-to-noise ratio levels. Objective measures were used to assess the perceptual quality of the results and the propagation of the noise through every stage of building the synthetic voice. The results show a severe drop in the quality of the artificial speech, even at the lower noise levels. This degradation seems to be independent of the noise type, and grows less than proportionally with the noise level. These results are important for any practical implementation of speech synthesis from degraded data under similar conditions, and show that applying denoising processes becomes mandatory in order to keep the possibility of building intelligible voices.
Country: Kérwá
Institution: Universidad de Costa Rica
Repository: Kérwá
Language: English
OAI Identifier: oai:kerwa.ucr.ac.cr:10669/87037
Online Access: https://www.cys.cic.ipn.mx/ojs/index.php/CyS/article/view/4254
https://hdl.handle.net/10669/87037
Keywords: Noise
Robustness
Speech synthesis
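
The description mentions corrupting the source speech with additive noise at several signal-to-noise ratio (SNR) levels. The following is a minimal sketch of how such a mixture can be produced, not the procedure used in the article: the function names `mix_at_snr` and `measured_snr_db`, the sine-wave stand-in for speech, the white-noise source, and the five SNR values are all assumptions chosen for illustration, since the record does not state the noise types or levels.

```python
import numpy as np


def mix_at_snr(clean, noise, snr_db):
    """Return clean + scaled noise so that the mixture has the requested SNR (dB).

    Hypothetical helper for illustration; not taken from the article.
    """
    clean = np.asarray(clean, dtype=np.float64)
    noise = np.asarray(noise, dtype=np.float64)

    # Repeat or trim the noise so that it covers the whole utterance.
    reps = int(np.ceil(len(clean) / len(noise)))
    noise = np.tile(noise, reps)[: len(clean)]

    p_clean = np.mean(clean ** 2)  # average power of the speech signal
    p_noise = np.mean(noise ** 2)  # average power of the unscaled noise

    # Solve SNR_dB = 10 * log10(p_clean / (gain**2 * p_noise)) for gain.
    gain = np.sqrt(p_clean / (p_noise * 10.0 ** (snr_db / 10.0)))
    return clean + gain * noise


def measured_snr_db(clean, noisy):
    """Compute the SNR (dB) of a mixture, given the clean reference."""
    residual = np.asarray(noisy) - np.asarray(clean)
    return 10.0 * np.log10(np.mean(np.square(clean)) / np.mean(residual ** 2))


# Stand-in signals: a 220 Hz tone instead of speech, and white Gaussian noise.
rng = np.random.default_rng(0)
t = np.arange(16000) / 16000.0
clean = np.sin(2.0 * np.pi * 220.0 * t)
white = rng.standard_normal(16000)

# Assumed SNR levels; the record only says that five levels were used.
for snr in (-10, -5, 0, 5, 10):
    noisy = mix_at_snr(clean, white, snr)
    print(snr, round(measured_snr_db(clean, noisy), 2))
```

Scaling the noise rather than the speech keeps the speech level identical across conditions, so differences between the resulting voices can be attributed to the noise alone.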