sHTS: A Streaming Architecture for Statistical Parametric Speech Synthesis

Astrinaki, Maria; Babacan, Onur; D'alessandro, Nicolas; Dutoit, Thierry

Paper published in a journal (Scientific congresses and symposiums)

Permalink
https://hdl.handle.net/20.500.12907/41508

Files

Full Text

p3s_rt_hts.pdf

Author preprint (1.2 MB)

All documents in ORBi UMONS are protected by a user license.

Details

Keywords :

[en] HMM, speech synthesis, statistical parametric speech synthesis, real-time, performative, streaming

Abstract :

[en] In this paper, we present a prototype for real-time speech synthesis. Statistical parametric speech synthesis is a relatively new approach to speech synthesis. Hidden Markov model (HMM) based speech synthesis, one of the techniques in this approach, has been shown to be very effective at synthesizing high-quality, natural and expressive speech. In contrast to unit selection techniques, HMM-based speech synthesis offers high flexibility as a speech production model, with a small database footprint. In this work, we modified the publicly available HTS engine to establish a streaming architecture, called streaming-HTS or sHTS, which gives us a basis for further research toward a fully real-time speech synthesis system. Quantitative evaluations of the system showed that the degradation of speech quality in sHTS relative to HTS is small. These results were supported by a subjective evaluation, which confirmed that HTS and sHTS can hardly be distinguished.

Disciplines :

Mathematics

Author, co-author :

Astrinaki, Maria ; Université de Mons > Faculté Polytechnique > Information, Signal et Intelligence artificielle

Babacan, Onur ; Université de Mons > Faculté Polytechnique > Information, Signal et Intelligence artificielle

D'alessandro, Nicolas

Dutoit, Thierry ; Université de Mons > Faculté Polytechnique > Information, Signal et Intelligence artificielle

Language :

English

Title :

sHTS: A Streaming Architecture for Statistical Parametric Speech Synthesis

Publication date :

14 March 2011

Event name :

p3s - International Workshop on Performative Speech and Singing Synthesis

Event place :

Vancouver, Canada

Event date :

2011

Research unit :

F105 - Information, Signal et Intelligence artificielle

Research institute :

R450 - Institut NUMEDIART pour les Technologies des Arts Numériques

Available on ORBi UMONS :

since 12 June 2013
