HMM-Based Speech Synthesis Enhancement with Hybrid Postfilters

 

Bibliographic Details
Authors: Coto Jiménez, Marvin; Goddard Close, John
Format: book chapter
Publication Date: 2018
Description: In this chapter, we introduce hybrid postfilters into speech synthesis to enhance the quality of the synthesized speech. Our approach combines a Wiener filter with deep neural networks. Several previous attempts to enhance synthetic speech have used single-stage deep-learning-based postfilters, which learn a mapping from the synthetic speech parameters to the natural ones. We have measured low-level noise components in the synthetic speech produced by statistical methods, so a single-stage postfilter must both reduce that noise and model the complex relationship between the parameters of synthetic and natural speech. We therefore consider a two-stage approach. In the first stage, the Wiener filter deals with the noise components of the synthetic speech. In the second stage, a set of multi-stream postfilters, comprising a collection of autoencoders and auto-associative networks, models the relationship between the output of the Wiener filter and the natural speech. Results show that the hybrid approach succeeds in enhancing the synthetic speech in most cases compared to a single-stage approach.
Country: Kérwá
Institution: Universidad de Costa Rica
Repository: Kérwá
Language: English
OAI Identifier: oai:kerwa.ucr.ac.cr:10669/86347
Online Access: https://hdl.handle.net/10669/86347
Keywords: Deep learning
HMM
LSTM
Speech synthesis
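
Illustrative sketch (not part of the catalog record): the two-stage idea in the description can be outlined in a few lines of Python. This is only a toy under stated assumptions, not the authors' implementation. The first stage is a simple local-statistics Wiener filter applied along each parameter trajectory. The second stage is an affine least-squares map that merely stands in for the autoencoder and auto-associative network postfilters named in the description. All function names (`wiener_1d`, `fit_linear_postfilter`, `apply_postfilter`, `hybrid_enhance`) and the window size are hypothetical.

```python
import numpy as np


def wiener_1d(x, window=5, noise=None):
    """Stage 1 (assumed form): local-statistics Wiener filter on a 1-D trajectory.

    `window` must be odd so the output has the same length as the input.
    """
    x = np.asarray(x, dtype=float)
    kernel = np.ones(window) / window
    pad = window // 2
    xp = np.pad(x, pad, mode="edge")
    # Local mean and variance over a sliding window.
    local_mean = np.convolve(xp, kernel, mode="valid")
    local_sq = np.convolve(xp ** 2, kernel, mode="valid")
    local_var = np.maximum(local_sq - local_mean ** 2, 0.0)
    if noise is None:
        # Assumption: estimate noise power as the average local variance.
        noise = local_var.mean()
    denom = np.maximum(local_var, max(noise, 1e-12))
    gain = np.maximum(local_var - noise, 0.0) / denom
    return local_mean + gain * (x - local_mean)


def fit_linear_postfilter(synthetic, natural):
    """Stage 2 stand-in: affine least-squares map from synthetic to natural frames.

    The chapter uses neural postfilters; a linear map is used here only to
    keep the sketch short and dependency-free.
    """
    X = np.hstack([synthetic, np.ones((len(synthetic), 1))])
    W, *_ = np.linalg.lstsq(X, natural, rcond=None)
    return W


def apply_postfilter(W, frames):
    """Apply the affine map learned by fit_linear_postfilter."""
    X = np.hstack([frames, np.ones((len(frames), 1))])
    return X @ W


def hybrid_enhance(synthetic, W, window=5):
    """Run both stages: denoise each parameter column, then map toward natural."""
    denoised = np.apply_along_axis(wiener_1d, 0, synthetic, window)
    return apply_postfilter(W, denoised)
```

In such a setup, the second-stage map would be fitted on Wiener-filtered synthetic frames paired with time-aligned natural frames, so that it only has to model the synthetic-to-natural relationship and not the noise.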