In this paper, we present a prototype for real-time speech synthesis. Statistical parametric speech synthesis is a relatively new approach to speech synthesis. Hidden Markov model (HMM) based speech synthesis, one of the techniques in this approach, has been demonstrated to be very effective in synthesizing high-quality, natural and expressive speech. In contrast to unit selection techniques, HMM-based speech synthesis provides high flexibility as a speech production model, with a small database footprint. In this work, we modified the publicly available HTS engine to establish a streaming architecture, called streaming-HTS or sHTS, which provides a basis for further research toward a fully real-time speech synthesis system. Quantitative evaluations of the system showed that the degradation of speech quality in sHTS relative to HTS is small. These results were supported by a subjective evaluation, which confirmed that HTS and sHTS can hardly be distinguished.
Disciplines :
Mathematics
Author, co-author :
Astrinaki, Maria ; Université de Mons > Faculté Polytechnique > Information, Signal et Intelligence artificielle
Babacan, Onur ; Université de Mons > Faculté Polytechnique > Information, Signal et Intelligence artificielle