Introduction to the RTTM File Format | 一见知得 | Mr J Blog

Introduction to the RTTM File Format

Mr J

Rich Transcription Time Marked (RTTM) files are space-delimited text files containing one turn per line, each line containing ten fields:

  • Type: segment type; should always be SPEAKER
  • File ID: file name; basename of the recording minus extension (e.g., rec1_a)
  • Channel ID: channel (1-indexed) that the turn is on; should always be 1
  • Turn Onset: onset of the turn in seconds from the beginning of the recording
  • Turn Duration: duration of the turn in seconds
  • Orthography Field: should always be <NA>
  • Speaker Type: should always be <NA>
  • Speaker Name: name of the speaker of the turn; should be unique within the scope of each file
  • Confidence Score: system confidence (probability) that the information is correct; should always be <NA>
  • Signal Lookahead Time: should always be <NA>

Example:

SPEAKER CMU_20020319-1400_d01_NONE 1 130.430000 2.350 <NA> <NA> juliet <NA> <NA>
SPEAKER CMU_20020319-1400_d01_NONE 1 157.610000 3.060 <NA> <NA> tbc <NA> <NA>
SPEAKER CMU_20020319-1400_d01_NONE 1 130.490000 0.450 <NA> <NA> chek <NA> <NA>
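
Because the fields are separated by whitespace, a line in this format can be read back with a plain split. The following is a minimal sketch, not the dscore parser; `Turn` and `parse_rttm_line` are names made up for this example, and it only handles SPEAKER lines with exactly ten fields.

```python
from collections import namedtuple

# Hypothetical container for this example; not part of any library.
Turn = namedtuple('Turn', ['file_id', 'channel', 'onset', 'dur', 'speaker_id'])


def parse_rttm_line(line):
    """Parse one SPEAKER line of an RTTM file into a Turn."""
    fields = line.split()
    if len(fields) != 10:
        raise ValueError('expected 10 fields, got %d' % len(fields))
    if fields[0] != 'SPEAKER':
        raise ValueError('unsupported segment type: %s' % fields[0])
    # Fields 5, 6, 8 and 9 are <NA> placeholders and are ignored here.
    return Turn(file_id=fields[1], channel=int(fields[2]),
                onset=float(fields[3]), dur=float(fields[4]),
                speaker_id=fields[7])


turn = parse_rttm_line(
    'SPEAKER CMU_20020319-1400_d01_NONE 1 130.430000 2.350 <NA> <NA> juliet <NA> <NA>')
print(turn.speaker_id, turn.onset, turn.dur)  # juliet 130.43 2.35
```

The turn ends at `turn.onset + turn.dur` seconds, since the format stores a duration rather than an end time.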

To write an RTTM file:

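# rttmf, turns, n_digits and format_float are assumed to be defined by the caller.
with open(rttmf, 'wb') as f:
    for turn in turns:
        fields = ['SPEAKER', turn.file_id, '1',
                  format_float(turn.onset, n_digits), format_float(turn.dur, n_digits),
                  '<NA>', '<NA>', turn.speaker_id, '<NA>', '<NA>']
        line = ' '.join(fields)
        # The file is opened in binary mode, so encode each line before writing it.
        f.write(line.encode('utf-8'))
        f.write(b'\n')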
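
The snippet relies on names defined elsewhere (`turns`, `rttmf`, `n_digits`, `format_float`). For a version that runs on its own, here is a rough sketch assuming Python 3. `Turn`, `format_turn`, and `write_rttm` are illustrative names for this post, and the fixed-point `%.*f` formatting is an assumption standing in for `format_float`, whose exact behavior is not shown here.

```python
import os
import tempfile
from collections import namedtuple

# Hypothetical turn record for this example.
Turn = namedtuple('Turn', ['file_id', 'onset', 'dur', 'speaker_id'])


def format_turn(turn, n_digits=3):
    """Return one RTTM line (without newline) for a turn."""
    fields = ['SPEAKER', turn.file_id, '1',
              '%.*f' % (n_digits, turn.onset),  # onset in seconds
              '%.*f' % (n_digits, turn.dur),    # duration in seconds
              '<NA>', '<NA>', turn.speaker_id, '<NA>', '<NA>']
    return ' '.join(fields)


def write_rttm(path, turns, n_digits=3):
    """Write turns to path, one turn per line."""
    with open(path, 'w', encoding='utf-8') as f:
        for turn in turns:
            f.write(format_turn(turn, n_digits) + '\n')


turns = [Turn('rec1_a', 130.43, 2.35, 'juliet'),
         Turn('rec1_a', 157.61, 3.06, 'tbc')]
path = os.path.join(tempfile.mkdtemp(), 'rec1_a.rttm')
write_rttm(path, turns)
with open(path, encoding='utf-8') as f:
    print(f.read(), end='')
```

Opening the file in text mode with an explicit encoding avoids the manual `encode` step that binary mode requires.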

Reference URLs:
https://github.com/nryant/dscore
https://github.com/nryant/dscore/blob/824f126ae9e78cf889e582eec07941ffe3a7d134/scorelib/rttm.py#L103
