Rich Transcription Time Marked (RTTM) files are space-delimited text files containing one turn per line, each line containing ten fields:
- Type — segment type; should always be SPEAKER
- File ID — file name; basename of the recording minus extension (e.g., rec1_a)
- Channel ID — channel (1-indexed) that turn is on; should always be 1
- Turn Onset — onset of turn in seconds from beginning of recording
- Turn Duration — duration of turn in seconds
- Orthography Field — should always be <NA>
- Speaker Type — should always be <NA>
- Speaker Name — name of speaker of turn; should be unique within scope of each file
- Confidence Score — system confidence (probability) that information is correct; should always be <NA>
- Signal Lookahead Time — should always be <NA>
For example:
SPEAKER CMU_20020319-1400_d01_NONE 1 130.430000 2.350 <NA> <NA> juliet <NA> <NA>
SPEAKER CMU_20020319-1400_d01_NONE 1 157.610000 3.060 <NA> <NA> tbc <NA> <NA>
SPEAKER CMU_20020319-1400_d01_NONE 1 130.490000 0.450 <NA> <NA> chek <NA> <NA>
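Reading a file in this layout amounts to splitting each line on whitespace and picking out the four fields that carry information. The following is a minimal sketch, not the dscore loader: the `Turn` container and the `parse_rttm_line` function are hypothetical names introduced here, and the checks on onset and duration are assumptions about what a sensible reader would reject.

```python
from collections import namedtuple

# Hypothetical container for one parsed turn; not part of any library.
Turn = namedtuple('Turn', ['file_id', 'onset', 'dur', 'speaker_id'])


def parse_rttm_line(line):
    """Parse one RTTM line into a Turn, checking the ten-field layout."""
    fields = line.split()
    if len(fields) != 10:
        raise ValueError('expected 10 fields, got %d: %r' % (len(fields), line))
    if fields[0] != 'SPEAKER':
        raise ValueError('unexpected segment type: %r' % fields[0])
    onset = float(fields[3])  # Turn Onset, in seconds
    dur = float(fields[4])    # Turn Duration, in seconds
    # Assumption: negative onsets and non-positive durations are invalid.
    if onset < 0 or dur <= 0:
        raise ValueError('invalid onset/duration: %r' % line)
    return Turn(file_id=fields[1], onset=onset, dur=dur, speaker_id=fields[7])


example = ('SPEAKER CMU_20020319-1400_d01_NONE 1 130.430000 2.350 '
           '<NA> <NA> juliet <NA> <NA>')
turn = parse_rttm_line(example)
print(turn.file_id, turn.onset, turn.dur, turn.speaker_id)
```

The remaining six fields are fixed (`SPEAKER`, channel `1`, and four `<NA>` placeholders), so the sketch only validates the first of them and ignores the rest.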
To write an RTTM file:
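# rttmf is the output path and turns is an iterable of turn objects;
# format_float and n_digits are defined in the referenced dscore source.
with open(rttmf, 'wb') as f:
    for turn in turns:
        # Ten fields per line; only file ID, onset, duration, and speaker vary.
        fields = ['SPEAKER', turn.file_id, '1',
                  format_float(turn.onset, n_digits),
                  format_float(turn.dur, n_digits),
                  '<NA>', '<NA>', turn.speaker_id, '<NA>', '<NA>']
        line = ' '.join(fields)
        # The file is opened in binary mode, so encode before writing.
        f.write(line.encode('utf-8'))
        f.write(b'\n')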
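The writing loop relies on names that are not defined in the snippet itself. A self-contained sketch is given here under stated assumptions: `Turn` is a hypothetical namedtuple standing in for the real turn class, and `format_float` is a stand-in that formats a number with a fixed count of decimal places, which may differ from the dscore helper of the same name.

```python
import os
import tempfile
from collections import namedtuple

# Hypothetical turn container with the attributes the writer reads.
Turn = namedtuple('Turn', ['file_id', 'onset', 'dur', 'speaker_id'])


def format_float(x, n_digits=3):
    """Stand-in helper (assumption): fixed-point string with n_digits decimals."""
    return '%.*f' % (n_digits, x)


def write_rttm(rttmf, turns, n_digits=3):
    """Write one ten-field RTTM line per turn to the path rttmf."""
    with open(rttmf, 'wb') as f:
        for turn in turns:
            fields = ['SPEAKER', turn.file_id, '1',
                      format_float(turn.onset, n_digits),
                      format_float(turn.dur, n_digits),
                      '<NA>', '<NA>', turn.speaker_id, '<NA>', '<NA>']
            f.write(' '.join(fields).encode('utf-8'))
            f.write(b'\n')


turns = [Turn('rec1_a', 130.43, 2.35, 'juliet'),
         Turn('rec1_a', 157.61, 3.06, 'tbc')]
path = os.path.join(tempfile.mkdtemp(), 'rec1_a.rttm')
write_rttm(path, turns)
with open(path, encoding='utf-8') as f:
    written = f.read()
print(written)
```

Opening the file in binary mode and encoding each line explicitly keeps the output UTF-8 with `\n` line endings regardless of platform defaults.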
Reference URLs:
https://github.com/nryant/dscore
https://github.com/nryant/dscore/blob/824f126ae9e78cf889e582eec07941ffe3a7d134/scorelib/rttm.py#L103