Project management of NTIS P1 Cybernetic Systems and Department of Cybernetics | WiKKY

Task #4214

Updated by Tihelka Dan almost 7 years ago

Although the experiments described in [[Positional_features|positional features experiments]] show a significant improvement in last-syllable placement, they still relate _only_ to the position within the last syllable, a scheme that was designed with Czech in mind. 

 The aim of this task is to design a new, language-independent scheme that can be used for all ARTIC voices and languages. 

 The key idea is to define a set of _significant positions_ in a prosodic word (or any other rhythm unit). The position cost is then computed relative to those significant positions. In Czech they may be the stress or the last syllable nucleus, but other languages may use any other feature. The positions may also vary between individual prosodic words (e.g. where stress moves). 

 The proposed scheme is as follows: 

 * each candidate unit defines its relative position within the prosodic word _p(u)_ 
 * each target unit defines its relative position within the prosodic word _p(t)_ 
 * there is a set of _n_ significant point positions for the given prosodic word in the target _s(t,1), s(t,2), ..., s(t,n)_, each assigned a weight _w(1), w(2), ..., w(n)_ 
 * each candidate unit likewise has its relation to the significant points in its own prosodic word _s(u,1), s(u,2), ..., s(u,n)_ 

 The cost for the _i_-th significant point is then given by the difference in distances from that point: 
 * _vt(i)_ = abs(_p(t)_ - _s(t,i)_) 
 * _vc(i)_ = abs(_p(u)_ - _s(u,i)_) 
 * cost(i) = abs(_vt(i)_ - _vc(i)_) / (min(_vt(i)_, _vc(i)_) + 1) 
 * and the total position cost is the sum over all _n_ significant points 
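
 The computation above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the ARTIC implementation: the function name @position_cost@ and its parameter names are hypothetical, positions are assumed to be plain numbers in a common unit (e.g. syllable indices), the denominator is read as @min(vt, vc) + 1@, and the optional use of the weights _w(i)_ as multipliers in the sum is an assumption, since the scheme defines the weights but does not say how they enter the total.

```python
from typing import Optional, Sequence


def position_cost(
    p_t: float,
    p_u: float,
    s_t: Sequence[float],
    s_u: Sequence[float],
    w: Optional[Sequence[float]] = None,
) -> float:
    """Total position cost of a candidate unit u against a target unit t.

    p_t, p_u -- positions of the target / candidate unit in their prosodic words
    s_t, s_u -- positions of the n significant points in the target / candidate word
    w        -- optional per-point weights (ASSUMPTION: applied as multipliers)
    """
    if len(s_t) != len(s_u):
        raise ValueError("target and candidate need the same number of significant points")
    if w is None:
        w = [1.0] * len(s_t)  # plain, unweighted sum
    if len(w) != len(s_t):
        raise ValueError("one weight per significant point is required")

    total = 0.0
    for s_ti, s_ui, w_i in zip(s_t, s_u, w):
        vt = abs(p_t - s_ti)  # distance of the target unit from point i
        vc = abs(p_u - s_ui)  # distance of the candidate unit from point i
        # The +1 keeps the denominator non-zero when a unit sits on the point.
        total += w_i * abs(vt - vc) / (min(vt, vc) + 1)
    return total
```

 For instance, with a target at position 2 in a word whose significant points are at 0 and 4, and a candidate at position 1 in a word whose significant points are at 0 and 3, @position_cost(2, 1, [0, 4], [0, 3])@ gives 0.5: the first point contributes abs(2 - 1) / (1 + 1) and the second contributes nothing because both units are equally far from it.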

 _Note:_ on reflection, the position features do not have to be shared across all languages, i.e. each language/voice can have its own computation scheme. This only requires more complex code to handle the individual cases. 
