Paper published in a journal (Scientific congresses and symposiums)
HMM-based Speech Synthesis of Live Sports Commentaries: Integration of a Two-Layer Prosody Annotation
Picart, Benjamin; Brognaux, Sandrine; Drugman, Thomas
2013
 

Files


Full Text
ssw8_bpsbtd.pdf
Author postprint (480.23 kB)





Details



Keywords :
[en] Speaking Style Adaptation; Expressive Speech; Prosody; HMM-based Speech Synthesis; Sports Commentaries
Abstract :
[en] This paper proposes the integration of a two-layer prosody annotation specific to live sports commentaries into HMM-based speech synthesis. Local labels are assigned to all syllables and refer to accentual phenomena. Global labels categorize sequences of words into five distinct speaking styles, defined in terms of valence and arousal. Two stages of the synthesis process are analyzed. First, the integration of global labels (i.e. speaking styles) is carried out using either speaker-dependent training or adaptation methods. Second, a comprehensive study evaluates the effect of each prosody annotation layer on the generated speech. The evaluation is based on three subjective criteria: intelligibility, expressivity and segmental quality. Our experiments indicate that: (i) for the integration of global labels, adaptation techniques outperform speaking style-dependent models in terms of both intelligibility and segmental quality; (ii) the integration of local labels enhances expressivity while also providing slightly higher intelligibility and segmental quality; (iii) combining the two levels of annotation (local and global) leads to the best results, achieving higher levels of expressivity and intelligibility.
Disciplines :
Electrical & electronics engineering
Author, co-author :
Picart, Benjamin ;  Université de Mons > Faculté Polytechnique > Information, Signal et Intelligence artificielle
Brognaux, Sandrine 
Drugman, Thomas ;  Université de Mons > Faculté Polytechnique > Information, Signal et Intelligence artificielle
Language :
English
Publication date :
02 July 2013
Event name :
8th Speech Synthesis Workshop (SSW8)
Event place :
Barcelona, Spain
Event date :
2013
Research unit :
F105 - Information, Signal et Intelligence artificielle
Research institute :
R450 - Institut NUMEDIART pour les Technologies des Arts Numériques
Available on ORBi UMONS :
since 23 January 2014

Statistics


Number of views
16 (0 by UMONS)
Number of downloads
0 (0 by UMONS)

Scopus citations®
6
Scopus citations® without self-citations
1
