Paper published in a journal (Scientific congresses and symposiums)
sHTS: A Streaming Architecture for Statistical Parametric Speech Synthesis
Astrinaki, Maria; Babacan, Onur; D'alessandro, Nicolas et al.
2011
 

Details



Keywords :
[en] HMM, speech synthesis, statistical parametric speech synthesis, real-time, performative, streaming
Abstract :
[en] In this paper, we present a prototype for real-time speech synthesis. Statistical parametric speech synthesis is a relatively new approach to speech synthesis. Hidden Markov model based speech synthesis, one of the techniques in this approach, has been demonstrated to be very effective in synthesizing high quality, natural and expressive speech. In contrast to unit selection techniques, HMM-based speech synthesis provides high flexibility as a speech production model, with a small database footprint. In this work we modified the publicly available HTS engine to establish a streaming architecture, called streaming-HTS or sHTS, which provides us with a basis for further research for a future fully real-time speech synthesis system. Quantitative evaluations of the system showed that the degradation of speech quality in sHTS is small with reference to HTS. These results were supported by subjective evaluation, which confirmed that HTS and sHTS can hardly be distinguished.
Disciplines :
Mathematics
Author, co-author :
Astrinaki, Maria ;  Université de Mons > Faculté Polytechnique > Information, Signal et Intelligence artificielle
Babacan, Onur ;  Université de Mons > Faculté Polytechnique > Information, Signal et Intelligence artificielle
D'alessandro, Nicolas 
Dutoit, Thierry ;  Université de Mons > Faculté Polytechnique > Information, Signal et Intelligence artificielle
Language :
English
Title :
sHTS: A Streaming Architecture for Statistical Parametric Speech Synthesis
Publication date :
14 March 2011
Event name :
p3s - International Workshop on Performative Speech and Singing Synthesis
Event place :
Vancouver, Canada
Event date :
2011
Research unit :
F105 - Information, Signal et Intelligence artificielle
Research institute :
R450 - Institut NUMEDIART pour les Technologies des Arts Numériques
Available on ORBi UMONS :
since 12 June 2013
