Bibliographic details
Title | Generalized Non-Uniform Time Scaling Distribution Method for Natural-Sounding Speech Rate Change |
Authors | Tihelka, D., Méner, M. |
Publication type | Article in a journal or professional periodical |
Periodical | Lecture Notes in Computer Science: Text, Speech and Dialogue |
Publisher | Springer / Berlin, Heidelberg |
Volume | LNAI 6836 |
Pages | 147-154 |
Year | 2011 |
ISBN | 978-3-642-23537-5 |
ISSN | 0302-9743 |
Abstract
The paper proposes a general, flexible and efficient method for distributing time-scale modification factors. The method is inspired by an analogy with a sequence of springs with different rates (spring constants), allowing a simple and straightforward non-linear distribution of modification factors across the speech to be modified. The flexibility and generality of the proposed scheme enable its use for any number of speech/sound segment categories of any type and length, while the modification factors can be set heuristically, ad hoc, or trained from data. At the end of the paper, an attempt to use statistics of phone durations to set the modification factors is described and discussed.
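The spring analogy can be illustrated with a minimal sketch. This is not the formulation from the paper, which the abstract does not give. It assumes each segment behaves like a spring in series whose compliance is its duration divided by a stiffness value, so that the total change in duration is shared in proportion to compliance. The function name `distribute_time_scale` and the stiffness values are hypothetical.

```python
def distribute_time_scale(durations, stiffness, global_factor):
    """Return one local time-scale factor per segment.

    Assumption (not from the paper): segments act as springs in series.
    A segment's compliance is duration / stiffness, and the total change
    in duration is split among segments in proportion to compliance.
    """
    if len(durations) != len(stiffness):
        raise ValueError("durations and stiffness must have the same length")
    if any(d <= 0 for d in durations) or any(k <= 0 for k in stiffness):
        raise ValueError("durations and stiffness must be positive")

    total = sum(durations)
    # Total lengthening (positive) or shortening (negative) to distribute.
    delta = (global_factor - 1.0) * total

    compliance = [d / k for d, k in zip(durations, stiffness)]
    total_compliance = sum(compliance)

    # Each segment absorbs a share of delta proportional to its compliance.
    new_durations = [
        d + delta * c / total_compliance
        for d, c in zip(durations, compliance)
    ]
    return [n / d for n, d in zip(new_durations, durations)]


# Hypothetical example: a stiff consonant, a soft vowel, a stiff consonant.
durations = [0.08, 0.12, 0.05]   # seconds
stiffness = [4.0, 1.0, 4.0]      # illustrative values only
factors = distribute_time_scale(durations, stiffness, 1.2)
```

Under these assumptions the soft (low-stiffness) segment is stretched more than the stiff ones, while the scaled durations still sum to the requested overall duration. With equal stiffness everywhere the sketch reduces to uniform scaling. Strong compression combined with very soft segments could drive a duration toward zero or below, so a practical version would need limits on the local factors.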