HMM-Based Speech Synthesis Enhancement with Hybrid Postfilters
Authors: | , |
---|---|
Format: | book chapter |
Publication date: | 2018 |
Description: | In this chapter, we introduce hybrid postfilters into speech synthesis, with the objective of enhancing the quality of the synthesized speech. Our approach combines a Wiener filter with deep neural networks. Several previous attempts to enhance synthetic speech have relied on single-stage deep-learning-based postfilters, which learn a mapping from the parameters of synthetic speech to those of natural speech. In synthetic speech produced by statistical methods, we have measured low-level noise components, so a single-stage postfilter must both reduce that noise and model the complex relationship between the parameters of synthetic and natural speech. For this reason, we consider a two-stage approach. In the first stage, the Wiener filter deals with the noise components of the synthetic speech. In the second stage, a set of multi-stream postfilters, comprising a collection of autoencoders and auto-associative networks, models the relationship between the output of the Wiener filter and the natural speech. Results show that the hybrid approach enhances the synthetic speech in most cases when compared with a single-stage approach. |
Country: | Kérwá |
Institution: | Universidad de Costa Rica |
Repository: | Kérwá |
Language: | English |
OAI Identifier: | oai:kerwa.ucr.ac.cr:10669/86347 |
Online access: | https://hdl.handle.net/10669/86347 |
Keywords: | Deep learning; HMM; LSTM; Speech synthesis |