[en] In this report, we present a Max/MSP external for real-time speech synthesis. Statistical parametric speech synthesis based on Hidden Markov Models has been demonstrated to be very effective in synthesizing high-quality, natural, and expressive speech. The technique also offers high flexibility as a speech production model and a small database footprint. In this work, we modify the existing HTS engine to establish a streaming architecture, called performative-HTS or pHTS. pHTS is implemented as a Max/MSP external, which provides a basis for further research in gesturally controlled speech synthesis. Quantitative evaluations of the system show that the degradation of speech quality in pHTS is small relative to HTS. These results are supported by a subjective evaluation, which confirms that the speech waveforms produced by HTS and pHTS can hardly be distinguished.
Disciplines :
Mathematics
Author, co-author :
Astrinaki, Maria ; Université de Mons > Faculté Polytechnique > Information, Signal et Intelligence artificielle
Babacan, Onur ; Université de Mons > Faculté Polytechnique > Information, Signal et Intelligence artificielle