Project management of NTIS P1 Cybernetic Systems and Department of Cybernetics | WiKKY

Bibliographic data

Title Generalized Non-Uniform Time Scaling Distribution Method for Natural-Sounding Speech Rate Change
Authors Tihelka, D., Méner, M.
Publication type Article in a journal or professional periodical
Periodical Lecture Notes in Computer Science: Text, Speech and Dialogue
Publisher Springer / Berlin, Heidelberg
Volume LNAI 6836
Pages 147-154
Year 2011
ISBN 978-3-642-23537-5
ISSN 0302-9743

Abstract

The paper proposes a general, flexible and efficient method for the distribution of time-scale modification factors. The method is inspired by an analogy with a sequence of springs with different rates (spring constants), which allows a simple and straightforward non-linear distribution of modification factors across the speech to be modified. The flexibility and generality of the proposed scheme enable its use for any number of speech/sound segment categories of any type and length, while the modification factors can be set heuristically, ad hoc, or trained from data. At the end of the paper, an attempt to use statistics of phone durations to set the modification factors is described and discussed.
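
The spring analogy in the abstract can be illustrated with a small sketch. This is not the formulation from the paper, which is not reproduced in this record. It only assumes a plain Hooke's-law model: the segments behave like springs connected in series, the same force acts on all of them, and each segment stretches in proportion to its own length and to a per-segment compliance (the inverse of stiffness). The function name `distribute_time_scaling`, the `compliances` parameter and the example values are hypothetical.

```python
def distribute_time_scaling(durations, compliances, global_factor):
    """Return one time-scale factor per segment (illustrative sketch only).

    Assumed model, not taken from the paper: segments are springs in series.
    A common "force" stretches segment i from d_i to d_i * (1 + force * c_i),
    where c_i is a hypothetical compliance for that segment. The force is
    chosen so that the total duration equals global_factor * sum(durations).
    """
    if len(durations) != len(compliances):
        raise ValueError("durations and compliances must have equal length")

    total = sum(durations)
    target = global_factor * total

    # Total stretch produced by a unit force: sum of c_i * d_i.
    weighted = sum(c * d for c, d in zip(compliances, durations))
    if weighted <= 0:
        raise ValueError("at least one segment must be stretchable")

    # Solve total + force * weighted == target for the common force.
    force = (target - total) / weighted

    factors = [1.0 + force * c for c in compliances]
    if any(f <= 0 for f in factors):
        raise ValueError("requested compression is too strong for this model")
    return factors


if __name__ == "__main__":
    # Hypothetical segments: a consonant, a vowel and a pause (seconds).
    durations = [0.08, 0.12, 0.30]
    # Hypothetical compliances: the pause stretches most, the consonant least.
    compliances = [0.2, 1.0, 3.0]
    factors = distribute_time_scaling(durations, compliances, global_factor=1.5)
    for d, f in zip(durations, factors):
        print(f"{d:.2f} s -> {d * f:.3f} s (factor {f:.3f})")
```

With equal compliances every segment receives the global factor, so uniform scaling is a special case. Unequal compliances spread the same overall rate change non-uniformly, which is the behaviour the abstract attributes to the spring analogy.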