Project management of NTIS P1 Cybernetic Systems and Department of Cybernetics | WiKKY

Task #4131

closed

Task #3672: RA1d - Automatic cleaning of speech corpora

Task #3690: Annotation error detection

Task #3899: Submit a paper on anomaly-based annotation error detection (Jimp)

Task #4128: Final listening test based evaluation of annotation error detection

Select utterances containing units with detected annotation errors

Added by Matoušek Jindřich over 7 years ago. Updated over 7 years ago.

Status: Closed
Priority: Normal
Start date: 03.01.2017
Due date: 11.01.2017
% Done: 0%
Estimated time:
Description

  1. Analyze the logs of synthesized utterances and grep the utterances that contain unit(s) with a detected annotation error
  2. Sort the filtered utterances according to the number of units with a detected error
  3. Select the following utterances:
    1. 20 utterances with the most units that contain an error
    2. 20 utterances with just one unit containing an error
    3. 20 utterances with something in between (depending on the result of logging)
  4. Store texts and waveforms of the selected utterances
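
The selection in steps 1-3 could be sketched roughly as follows. This is only an illustration: the log format is not described in this task, so the sketch assumes the logs have already been reduced to a hypothetical mapping from utterance ID to the number of units with a detected error. The function name `select_utterances` and the choice to take the "in between" group from the centre of the remaining ranking are likewise assumptions.

```python
from typing import Dict, List


def select_utterances(error_counts: Dict[str, int], n: int = 20) -> Dict[str, List[str]]:
    """Split utterances into 'most', 'single' and 'middle' groups.

    error_counts is a hypothetical mapping: utterance ID -> number of
    units with a detected annotation error (extracted from the logs).
    """
    # Step 1: keep only utterances with at least one flagged unit.
    flagged = {u: c for u, c in error_counts.items() if c > 0}

    # Step 2: sort by count, highest first; ties are broken by ID
    # so that the result is deterministic.
    ranked = sorted(flagged, key=lambda u: (-flagged[u], u))

    # Step 3.1: the n utterances with the most flagged units.
    most = ranked[:n]

    # Step 3.2: n utterances with exactly one flagged unit.
    single = [u for u in ranked if flagged[u] == 1 and u not in most][:n]

    # Step 3.3: n utterances from the centre of what is left
    # (more than one flagged unit, not already selected).
    taken = set(most) | set(single)
    rest = [u for u in ranked if u not in taken and flagged[u] > 1]
    start = max(0, (len(rest) - n) // 2)
    middle = rest[start:start + n]

    return {"most": most, "single": single, "middle": middle}


if __name__ == "__main__":
    counts = {"u1": 5, "u2": 4, "u3": 3, "u4": 2, "u5": 1, "u6": 1, "u7": 0}
    print(select_utterances(counts, n=1))
```

With real data, `n` would stay at its default of 20, and the texts and waveforms of the returned IDs would then be stored as in step 4.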

Files

annot-errors.detected.with_stats.txt (13.9 KB), Tihelka Dan, 09.01.2017 13:37
annot-errors.detected.with_rels.txt (17.3 KB), Tihelka Dan, 10.01.2017 14:06
least_frequent.txt (1000 Bytes), Matoušek Jindřich, 10.01.2017 15:26
mean_frequent.txt (954 Bytes), Matoušek Jindřich, 10.01.2017 15:26
most_frequent.txt (1015 Bytes), Matoušek Jindřich, 10.01.2017 15:26

Related issues

Blocked by HQSYN16 - Task #4130: Prepare words with detected annotation errors (Closed, Matoušek Jindřich, 03.01.2017, 06.01.2017)
Blocked by HQSYN16 - Task #4129: Synthesize & log a large portion of text by TTS system with annotation errors (Closed, Tihelka Dan, 03.01.2017, 06.01.2017)
Blocks HQSYN16 - Task #4132: Synthesize the selected utterances by TTS system with/without the annotation errors (Closed, Tihelka Dan, 03.01.2017, 13.01.2017)

Actions #1

Updated by Matoušek Jindřich over 7 years ago

  • Blocked by Task #4130: Prepare words with detected annotation errors added
Actions #2

Updated by Matoušek Jindřich over 7 years ago

  • Blocked by Task #4129: Synthesize & log a large portion of text by TTS system with annotation errors added
Actions #3

Updated by Matoušek Jindřich over 7 years ago

  • Blocks Task #4132: Synthesize the selected utterances by TTS system with/without the annotation errors added
Actions #4

Updated by Matoušek Jindřich over 7 years ago

Words detected as containing annotation errors (and that are actually misannotated) are attached here (#4130).

Actions #5

Updated by Tihelka Dan over 7 years ago

In annot-errors.detected.with_stats.txt, the words detected as containing annotation errors (in the file ...) were extended with a number indicating how many units from each word were used during the synthesis of the large texts (see #4129). The list of words was then sorted by the number of selected units.

Actions #6

Updated by Matoušek Jindřich over 7 years ago

  • Status changed from Resolved to Feedback
  • Assignee changed from Matoušek Jindřich to Tihelka Dan

The absolute numbers of units are fine, but it might be better to also specify the average number of units (containing annotation errors) per synthetic phrase.

Actions #7

Updated by Tihelka Dan over 7 years ago

Attachment annot-errors.detected.with_rels.txt contains statistics similar to those in annot-errors.detected.with_stats.txt, except that the first number represents W/(P + 1), where:
  • W is the number of selections from the given word and
  • P is the number of phrases it was used in.

The second column in annot-errors.detected.with_rels.txt corresponds to the first column in annot-errors.detected.with_stats.txt.
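
As a small worked illustration of the relative measure, the value W/(P + 1) can be computed as below. The function name `relative_selection` and the example numbers are made up for illustration and are not taken from the attached files. The comment does not say why 1 is added to P; one plausible reading is that it keeps the ratio defined for words that were used in no phrase.

```python
def relative_selection(w: int, p: int) -> float:
    """Return W / (P + 1) for one word.

    w: number of unit selections taken from the word (W)
    p: number of phrases the word was used in (P)
    """
    return w / (p + 1)


if __name__ == "__main__":
    # Hypothetical word: 12 selections spread over 3 phrases.
    print(relative_selection(12, 3))
    # A word that was never selected still gets a defined value.
    print(relative_selection(0, 0))
```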

Actions #8

Updated by Tihelka Dan over 7 years ago

  • Status changed from Feedback to Resolved
  • Assignee changed from Tihelka Dan to Matoušek Jindřich
Actions #9

Updated by Matoušek Jindřich over 7 years ago

The following items were selected:
  • 20 "most frequent" items (a combination of the most frequently selected units in absolute and relative numbers) -- most_frequent.txt
  • 20 "least frequent" items (a combination of the least frequently selected units in absolute and relative numbers) -- least_frequent.txt
  • 20 "mean frequent" items (a combination of the moderately frequently selected units in absolute and relative numbers) -- mean_frequent.txt
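
The comment above does not say how the absolute and relative numbers were combined. One possible way, shown here purely as an assumption, is to rank the words by each number separately and order them by the sum of the two ranks; the top, bottom and centre of that ordering would then give the "most", "least" and "mean" frequent groups. The function name `rank_sum_order` and the sample values are hypothetical.

```python
from typing import Dict, List, Tuple


def rank_sum_order(stats: Dict[str, Tuple[float, float]]) -> List[str]:
    """Order words from most to least frequently selected.

    stats maps a word to (absolute count, relative value). Each word is
    ranked by both numbers (rank 0 = highest) and the ranks are summed.
    This is one assumed combination rule, not the one used in the task.
    """
    words = list(stats)
    by_abs = sorted(words, key=lambda w: (-stats[w][0], w))
    by_rel = sorted(words, key=lambda w: (-stats[w][1], w))
    abs_rank = {w: i for i, w in enumerate(by_abs)}
    rel_rank = {w: i for i, w in enumerate(by_rel)}
    # A lower rank sum means the word is high on both lists.
    return sorted(words, key=lambda w: (abs_rank[w] + rel_rank[w], w))


if __name__ == "__main__":
    sample = {"a": (10, 2.0), "b": (8, 4.0), "c": (1, 0.5), "d": (9, 1.0)}
    print(rank_sum_order(sample))
```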