Project management of NTIS P1 Cybernetic Systems and Department of Cybernetics | WiKKY

Task #3709

closed

Task #3672: RA1d - Automatic cleaning of speech corpora

Task #3704: Detection and correction of prosodic structures

Merge ASF files (segmentations) and SNT files (annotations)

Added by Hanzlíček Zdeněk over 8 years ago. Updated almost 8 years ago.

Status: Closed
Priority: Normal
Start date: 05.04.2016
Due date:
% Done: 100%
Estimated time: (Total: 0.00 h)

Description

Merge ASF files (segmentations) and SNT files (annotations):
  1. Find differences between words in ASF and SNT and update SNT in the SVN repository.
  2. Add two new columns to ASF: punctuation and pronunciation.
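
A minimal sketch of how the word comparison in step 1 could look, assuming each utterance is already available as a plain list of words from ASF and from SNT. The function name `word_differences` and the case-insensitive matching are assumptions for illustration; this is not the actual script used for the task.

```python
from difflib import SequenceMatcher


def word_differences(asf_words, snt_words):
    """Return (tag, asf_slice, snt_slice) for every region that does not match."""
    # Assumption: capitalisation alone should not be reported as a difference.
    a = [w.lower() for w in asf_words]
    b = [w.lower() for w in snt_words]
    matcher = SequenceMatcher(a=a, b=b, autojunk=False)
    return [
        (tag, asf_words[i1:i2], snt_words[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]


if __name__ == "__main__":
    asf = ["dnes", "bude", "li", "pršet"]
    snt = ["Dnes", "bude-li", "pršet"]
    for diff in word_differences(asf, snt):
        print(diff)
```

An utterance with an empty result can be treated as consistent at the word level; anything else would need a manual look.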

Subtasks 1 (0 open, 1 closed)

Task #3851: Merge ASF and SNT files for MM voice (Slovak) (Closed, Hanzlíček Zdeněk, 05.04.2016)

Related issues

Related to HQSYN16 - Task #3761: Create script for conversion ASF to SNT (Closed, Hanzlíček Zdeněk, 23.02.2016)

Actions #1

Updated by Hanzlíček Zdeněk over 8 years ago

Add non-speech events that are present in SNT but missing in ASF; insert them into ASF with zero duration.
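
A minimal sketch of inserting zero-duration events, assuming the ASF word segments and the SNT tokens of one utterance are already aligned word by word. The `Segment` tuple, the square-bracket convention for non-speech events and the function names are hypothetical; the real ASF and SNT formats are not described in this task.

```python
from typing import List, NamedTuple


class Segment(NamedTuple):
    label: str
    start: float
    end: float


def is_event(token: str) -> bool:
    # Hypothetical convention: non-speech events are written in square brackets.
    return token.startswith("[") and token.endswith("]")


def add_events(segments: List[Segment], snt_tokens: List[str]) -> List[Segment]:
    """Copy ASF segments and add each SNT non-speech event with zero duration."""
    merged = []
    remaining = iter(segments)
    # An event is placed at the end of the last segment seen so far,
    # or at the start of the first segment if it precedes all words.
    cursor = segments[0].start if segments else 0.0
    for token in snt_tokens:
        if is_event(token):
            merged.append(Segment(token, cursor, cursor))
            continue
        seg = next(remaining, None)
        if seg is None:
            raise ValueError("SNT contains more words than ASF")
        merged.append(seg)
        cursor = seg.end
    return merged
```

For example, with segments `dnes` (0.0 to 0.4) and `prší` (0.4 to 0.9) and SNT tokens `dnes [breath] prší`, the event would be added between the two words with start and end both at 0.4.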

Actions #2

Updated by Hanzlíček Zdeněk over 8 years ago

Use upper-case characters in ASF words (copied from SNT).
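
A minimal sketch of copying the capitalisation, assuming both files yield the same number of words per utterance once the differences have been resolved. The function name `copy_case` is hypothetical.

```python
def copy_case(asf_words, snt_words):
    """Take the SNT spelling wherever the two words differ only in letter case."""
    if len(asf_words) != len(snt_words):
        raise ValueError("word counts differ; resolve the mismatch first")
    return [
        snt if asf.lower() == snt.lower() else asf  # keep ASF word on a real mismatch
        for asf, snt in zip(asf_words, snt_words)
    ]
```

Words that differ in more than case are left as they are in ASF, so a real inconsistency is not silently overwritten.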

Actions #3

Updated by Hanzlíček Zdeněk over 8 years ago

  • % Done changed from 0 to 70

Scripts for this task are stored in the SVN repository under ARTIC_UTILS/trunk/hmm_synth/LabLight.

A simple description of the merging procedure:
  1. Script diff_asf_snt.py performs a simple comparison between ASF and SNT files and prints out utterances that look inconsistent (pauses are ignored). This comparison reveals only some basic types of inconsistency.
  2. The SNT file is corrected manually (when needed).
  3. Script merge_asf_snt.py merges ASF and SNT files and creates a new ASF file. During merging it does the following:
    1. Joins verbs and the enclitic "li" into one word (e.g. bude-li).
    2. Uses words from SNT (with capital letters).
    3. Adds the punctuation and pronunciation columns.
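
The joining of a verb with the enclitic "li" can be sketched as follows. This is only an illustration under stated assumptions, not the code of merge_asf_snt.py: segments are taken to be `(word, start, end)` tuples, SNT is assumed to contain the hyphenated form, and the function name `join_enclitic_li` is hypothetical.

```python
def join_enclitic_li(segments, snt_words):
    """Merge an ASF word and a following "li" where SNT has the joined form."""
    merged = []
    i = 0
    for snt in snt_words:
        if i >= len(segments):
            raise ValueError("SNT contains more words than ASF")
        word, start, end = segments[i]
        nxt = segments[i + 1] if i + 1 < len(segments) else None
        # Join only when SNT confirms the hyphenated form, e.g. "bude-li".
        if (nxt is not None and nxt[0].lower() == "li"
                and snt.lower() == word.lower() + "-li"):
            word, end = word + "-" + nxt[0], nxt[2]
            i += 1  # the "li" segment has been consumed
        merged.append((word, start, end))
        i += 1
    return merged
```

The joined word spans from the start of the verb to the end of "li", so no duration is lost in the new ASF.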

Two ASF files were processed (voices MR and TJ). The new ASF and SNT files are stored in the ARTIC directory /artic/Experiments/asf.snt.merge.

Actions #4

Updated by Hanzlíček Zdeněk about 8 years ago

  • % Done changed from 70 to 80

Two more voices were processed: KI and AJ.
NOTE: Several specific corrections had to be made manually, so the new ASFs should replace the default ASFs in the SVN repository. Otherwise, sooner or later, we could end up with two parallel, inconsistent ASF versions.

Actions #5

Updated by Hanzlíček Zdeněk about 8 years ago

  • Related to Task #3761: Create script for conversion ASF to SNT added
Actions #6

Updated by Hanzlíček Zdeněk almost 8 years ago

  • Status changed from Assigned to Resolved

Summary

Voices with merged ASFs:
  • Czech voices: AJ, JS, KI, MR, SK, TJ
  • Slovak voice: MM

The ASF file for the Czech female voice PP was not merged because its annotation file is not available.

Actions #7

Updated by Hanzlíček Zdeněk almost 8 years ago

  • Status changed from Resolved to Closed