Projects of Department of Cybernetics & NTIS P1 - Cybernetic Systems, University of West Bohemia https://wikky.zcu.cz/redmine/ 2017-09-21T06:32:15Z
Redmine HQSYN16 - Task #4250 (New): F0 join costhttps://wikky.zcu.cz/redmine/issues/42502017-09-21T06:32:15ZMatoušek Jindřichjmatouse@kky.zcu.cz
<p><em>Parent task for experiments with F0 join cost computation</em></p> HQSYN16 - Task #4248 (New): Data-based context penalization matrixhttps://wikky.zcu.cz/redmine/issues/42482017-09-21T06:13:04ZMatoušek Jindřichjmatouse@kky.zcu.cz
<p>Propose data-based context penalization matrix.</p>
<p><strong>The idea:</strong><br />Use a similarity matrix with values in the range [0, 1] instead of the binary (0/1) phonetic context match.</p>
<p>The key is to propose a function which will automatically return a similarity number (0 = identical, 1 = dissimilar) of two diphones/phones. The function should be trained from data.</p> HQSYN16 - Task #4237 (New): Continuity of F0 patternhttps://wikky.zcu.cz/redmine/issues/42372017-09-13T15:40:43ZMatoušek Jindřichjmatouse@kky.zcu.cz
<p><em>Parent task for the continuity of the F0 pattern within the synthesized utterance</em></p> HQSYN16 - Task #4214 (New): Redefine cost computation scheme in a language independent wayhttps://wikky.zcu.cz/redmine/issues/42142017-06-02T10:02:31ZTihelka Dan
<p>Although the experiments described on the <a class="wiki-page" href="https://wikky.zcu.cz/redmine/projects/hqsyn16/wiki/Positional_features">positional features experiments</a> page show a significant improvement in the placement of the last syllable, they are still related <em>only</em> to the position in the last syllable, which was designed with Czech in mind.</p>
<p>The aim of this task is to design a new, language independent scheme which could be used for all ARTIC voices and languages.</p>
<p>The key idea is to define a set of <em>significant positions</em> in a prosodic word (or any other rhythm unit). The position cost is then related to those significant positions. In Czech they may be the stress or the nucleus of the last syllable, while other languages may use other features. The positions may also vary between individual prosodic words (e.g. where the stress moves).</p>
<p>The proposed scheme is as follows:</p>
<ul>
<li>each candidate unit defines its relative position within the prosodic word, <em>p(u)</em></li>
<li>each target unit defines its relative position within the prosodic word, <em>p(t)</em></li>
<li>there is a set of <em>n</em> significant point positions for the given prosodic word in the target, <em>s(t,1), s(t,2), ..., s(t,n)</em>, each assigned a weight <em>w(1), w(2), ..., w(n)</em></li>
<li>likewise, each candidate unit has its relation to the significant points in its own prosodic word, <em>s(u,1), s(u,2), ..., s(u,n)</em></li>
</ul>
The cost for the <em>i</em>-th significant point is then given by the difference in distances from that point:
<ul>
<li><em>vt(i)</em> = abs(<em>p(t)</em> - <em>s(t,i)</em>)</li>
<li><em>vc(i)</em> = abs(<em>p(u)</em> - <em>s(u,i)</em>)</li>
<li>cost(i) = abs(<em>vt(i)</em> - <em>vc(i)</em>) / (min(<em>vt(i)</em>, <em>vc(i)</em>) + 1)</li>
<li>and the total position cost is the sum over all <em>n</em> significant points</li>
</ul>
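<p>The scheme above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the ARTIC implementation: the function names are hypothetical, the denominator is read as min(<em>vt(i)</em>, <em>vc(i)</em>) + 1 (which keeps it non-zero), and the weights <em>w(i)</em> are assumed to scale each term of the sum, since the task does not say how they enter the total.</p>

```python
from typing import Optional, Sequence


def point_cost(p_t: float, s_t_i: float, p_u: float, s_u_i: float) -> float:
    """Cost contribution of a single significant point."""
    vt = abs(p_t - s_t_i)  # distance of the target unit from the point
    vc = abs(p_u - s_u_i)  # distance of the candidate unit from the point
    # Assumed reading of the denominator: min(vt, vc) + 1, so it is never zero.
    return abs(vt - vc) / (min(vt, vc) + 1)


def position_cost(
    p_t: float,
    s_t: Sequence[float],
    p_u: float,
    s_u: Sequence[float],
    w: Optional[Sequence[float]] = None,
) -> float:
    """Sum of the per-point costs over all n significant points.

    Applying w(i) as a multiplier of each term is an assumption;
    with w=None every point counts equally.
    """
    if len(s_t) != len(s_u):
        raise ValueError("target and candidate need the same number of points")
    if w is None:
        w = [1.0] * len(s_t)
    elif len(w) != len(s_t):
        raise ValueError("one weight per significant point is required")
    return sum(
        w_i * point_cost(p_t, st_i, p_u, su_i)
        for w_i, st_i, su_i in zip(w, s_t, s_u)
    )
```

<p>For example, with significant points at positions 0 and 3 in both prosodic words, a target at position 2 and a candidate at position 4 are equally far from the second point (zero cost there) but differ in their distance from the first, so the whole cost comes from that one point.</p>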
<p><em>Note:</em> on reflection, the position features do not have to be shared across all languages, i.e. each language/voice can have its own computation scheme. This only requires more complex code to handle the individual cases.</p> HQSYN16 - Task #4213 (Assigned): Tweak current position parameters computationhttps://wikky.zcu.cz/redmine/issues/42132017-06-02T08:39:51ZTihelka Dan
<p>Since the syllable-based experiments with positional features, described on the <a class="wiki-page" href="https://wikky.zcu.cz/redmine/projects/hqsyn16/wiki/Positional_features">wiki</a>, show quite good results, they can be used as the baseline for modifying the ARTIC features. Instead of re-implementing the positional features (which would <strong>require data image changes</strong>), we will tweak the current computation scheme using the current set of features. In this way, we can achieve a fast (though not perfect) improvement at a low coding cost.</p>
The original cost computation is:
<ul>
<li>pos_cost = <em>beg(w)</em> * abs(<em>beg(t) - beg(u)</em>) + <em>mid(w)</em> * abs(<em>mid(t) - mid(u)</em>) + <em>end(w)</em> * abs(<em>end(t) - end(u)</em>)
<ul>
<li><em>weight</em> = 7 for all positions</li>
</ul></li>
</ul>
The first tweaked versions are:
<ul>
<li>pos_cost_1 = pos_cost + 150 * abs(<em>end(t) - end(u)</em>)
<ul>
<li><em>weight</em> = 7</li>
</ul>
</li>
<li>pos_cost_2 = pos_cost + 999 * <em>end(u)</em> * abs(<em>end(t) - mid(u)</em>)
<ul>
<li><em>weight</em> = 9</li>
<li>when both the unit and the target are prosodic-word transitional, the tweaked addition is set to 0</li>
</ul>
</li>
<li>pos_cost_3 = pos_cost + 999 * <em>end(u)</em> * abs(<em>end(t) - end(u)</em>)
<ul>
<li><em>weight</em> = 9</li>
</ul></li>
</ul>
where:
<ul>
<li><em>beg(u)</em>, <em>mid(u)</em> and <em>end(u)</em> are the position weights of the candidate unit <em>u</em> related to the beginning, middle and end of its prosodic word</li>
<li><em>beg(t)</em>, <em>mid(t)</em> and <em>end(t)</em> are the position weights of the target unit <em>t</em> related to the beginning, middle and end of its prosodic word</li>
<li><em>beg(w)</em>, <em>mid(w)</em> and <em>end(w)</em> are the corresponding weights, unit and target independent</li>
</ul>
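<p>The original and tweaked formulas can be written down as a small Python sketch. It is illustrative only: the <code>Position</code> type, the function names and the <code>both_transitional</code> flag are hypothetical, and the nested <em>weight</em> items are assumed to mean that <em>beg(w)</em>, <em>mid(w)</em> and <em>end(w)</em> all take that value (7 or 9) in the base <code>pos_cost</code> term.</p>

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Position:
    """Position weights of a unit relative to its prosodic word."""
    beg: float
    mid: float
    end: float


def pos_cost(t: Position, u: Position, weight: float = 7.0) -> float:
    """Original cost; the same weight is assumed for beg, mid and end."""
    return weight * (
        abs(t.beg - u.beg) + abs(t.mid - u.mid) + abs(t.end - u.end)
    )


def pos_cost_1(t: Position, u: Position) -> float:
    """Base cost (weight 7) plus a fixed penalty on the end mismatch."""
    return pos_cost(t, u, 7.0) + 150 * abs(t.end - u.end)


def pos_cost_2(t: Position, u: Position, both_transitional: bool = False) -> float:
    """Base cost (weight 9) plus 999 * end(u) * abs(end(t) - mid(u)).

    The addition is dropped when both unit and target are prosodic-word
    transitional; how that is detected is outside this sketch.
    """
    extra = 0.0 if both_transitional else 999 * u.end * abs(t.end - u.mid)
    return pos_cost(t, u, 9.0) + extra


def pos_cost_3(t: Position, u: Position) -> float:
    """Base cost (weight 9) plus 999 * end(u) * abs(end(t) - end(u))."""
    return pos_cost(t, u, 9.0) + 999 * u.end * abs(t.end - u.end)
```

<p>Note that in <code>pos_cost_2</code> and <code>pos_cost_3</code> the extra penalty is scaled by <em>end(u)</em>, so it vanishes for candidates whose end weight is 0, whereas <code>pos_cost_1</code> penalizes every end mismatch.</p>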
<strong>Results:</strong>
<ul>
<li>pos_cost_3: lowers the number of position failures from 134k to approx. 11k (<strong>winner</strong>).</li>
<li>pos_cost_2: still displays more than 70k position failures.</li>
<li>pos_cost_1: lowers the number of position failures from 134k to approx. 17k.</li>
</ul>
<p><strong>The key task is how to further improve the</strong> pos_cost_3 <strong>scheme</strong>.</p> HQSYN16 - Task #4159 (New): Revision of continuous positional parametershttps://wikky.zcu.cz/redmine/issues/41592017-02-07T21:58:16ZMatoušek Jindřichjmatouse@kky.zcu.cz
<p>Revise the calculation of continuous positional parameters</p> HQSYN16 - Task #4147 (New): "objednala" sounds like "objednal"https://wikky.zcu.cz/redmine/issues/41472017-01-12T12:19:14ZMatoušek Jindřichjmatouse@kky.zcu.cz
<p><strong>Voice:</strong> Jan (for ScreenReaders)</p>
<p><strong>Description:</strong></p>
<ul>
<li>Word <em>objednala</em> sounds like <em>objednal</em> in the sentence "Žena si objednala psa z Berlína".</li>
</ul> HQSYN16 - Task #3880 (New): "válel" sounds like "válil"https://wikky.zcu.cz/redmine/issues/38802016-04-27T08:48:52ZMatoušek Jindřichjmatouse@kky.zcu.cz
<p><strong>Voice:</strong> Stanislav</p>
<strong>Description:</strong>
<ul>
<li>Word <em>válel</em> sounds like <em>vál<strong>i</strong>l</em> in the sentence <em>"Válel jsem se"</em>.</li>
<li>The phone <code>[e]</code> was assembled from two parts: the left half from the word <em>válečníkem</em> and the right half from the word <em>přelstít</em> (use WebProkus for more details).</li>
<li>Listening to these words in the source utterances, in both cases I hear something closer to <code>[e]</code>.</li>
</ul> HQSYN16 - Task #3811 (New): Experiment with statistical outlier detection and removalhttps://wikky.zcu.cz/redmine/issues/38112016-03-09T14:39:55ZMatoušek Jindřichjmatouse@kky.zcu.czHQSYN16 - Task #3797 (New): Artifact cataloguehttps://wikky.zcu.cz/redmine/issues/37972016-03-09T10:36:44ZMatoušek Jindřichjmatouse@kky.zcu.cz
<p><em>(Parent task for cataloguing of artifacts)</em></p> HQSYN16 - Task #3704 (New): Detection and correction of prosodic structureshttps://wikky.zcu.cz/redmine/issues/37042016-01-19T19:56:18ZMatoušek Jindřichjmatouse@kky.zcu.cz
<p>Parent task for both detection and correction of errors in prosodic structure marking</p> HQSYN16 - Task #3697 (Assigned): Electromagnetic articulography (EMA) based researchhttps://wikky.zcu.cz/redmine/issues/36972016-01-15T11:34:30ZMatoušek Jindřichjmatouse@kky.zcu.cz
<p>"Parent" issue for EMA-based research</p> HQSYN16 - Task #3683 (New): RA4d - Compromise between selected and generated speechhttps://wikky.zcu.cz/redmine/issues/36832016-01-12T15:56:02ZMatoušek Jindřichjmatouse@kky.zcu.cz
<p>Parent task for combining/compromising between selected and generated speech</p>
<p><strong>Estimated time schedule:</strong> 01/2018 - 12/2018</p> HQSYN16 - Task #3682 (New): RA4c - Hybrid approacheshttps://wikky.zcu.cz/redmine/issues/36822016-01-12T15:55:26ZMatoušek Jindřichjmatouse@kky.zcu.cz
<p>Parent task for hybrid approaches to speech synthesis</p>
<p><strong>Estimated time schedule:</strong> 01/2017 - 12/2018</p> HQSYN16 - Task #3681 (New): RA4b - Dedicated signal modificationhttps://wikky.zcu.cz/redmine/issues/36812016-01-12T15:54:44ZMatoušek Jindřichjmatouse@kky.zcu.cz
<p>Parent task for dedicated signal modification of synthetic speech with detected errors</p>
<p><strong>Estimated time schedule:</strong> 01/2017 - 12/2018</p> HQSYN16 - Task #3680 (New): RA4a - Automatic error predictionhttps://wikky.zcu.cz/redmine/issues/36802016-01-12T15:54:11ZMatoušek Jindřichjmatouse@kky.zcu.cz
<p>Parent task for automatic error prediction in synthetic speech</p>
<p><strong>Estimated time schedule:</strong> 01/2016 - 12/2017</p> HQSYN16 - Task #3679 (New): RA3d - Revision of positional parameters and weightinghttps://wikky.zcu.cz/redmine/issues/36792016-01-12T15:53:16ZMatoušek Jindřichjmatouse@kky.zcu.cz
<p>Parent task for revision of positional parameters and weighting</p>
<p><strong>Estimated time schedule:</strong> 01/2018 - 12/2018</p> HQSYN16 - Task #3678 (New): RA3c - Continuity of prosodic patternshttps://wikky.zcu.cz/redmine/issues/36782016-01-12T15:49:20ZMatoušek Jindřichjmatouse@kky.zcu.cz
<p>Parent task for research on the continuity of prosodic patterns</p>
<p><strong>Estimated time schedule:</strong> 01/2017 - 12/2018</p> HQSYN16 - Task #3677 (New): RA3b - Phonetically justified parameters (spectral tilt, ...)https://wikky.zcu.cz/redmine/issues/36772016-01-12T15:48:25ZMatoušek Jindřichjmatouse@kky.zcu.cz
<p>Parent task for research of phonetically justified parameters (spectral tilt, ...) in the context of speech synthesis</p>
<p><strong>Estimated time schedule:</strong> 01/2017 - 12/2018</p> HQSYN16 - Task #3676 (New): RA3a - Context definition and penalisation matrixhttps://wikky.zcu.cz/redmine/issues/36762016-01-12T15:47:41ZMatoušek Jindřichjmatouse@kky.zcu.cz