
Speaker label

Speaker label / speaker ID

A text prefix or color cue that identifies who is speaking. Used in interviews, podcasts, and SDH (subtitles for the deaf and hard of hearing) captions.

In depth

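Speaker labels tell the viewer who is speaking in multi-speaker captions. Conventions vary: bracketed text labels ([JOHN]) for SDH and broadcast, color-coding for short-form social video, or the speaker's initial at the start of each cue. Modern ASR pipelines can assign labels automatically through speaker diarization. Whisper on its own only transcribes, so it is typically paired with a separate diarization model to attribute speech to speakers. Accuracy of roughly 85-95% is typical on clean two-speaker audio, and it tends to drop with overlapping speech, background noise, or additional speakers, so review automatic labels before publishing. Skip speaker labels when there is only one speaker or when the visuals make it obvious who is talking.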
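As a rough illustration of the text-label conventions above, here is a minimal Python sketch. The `Cue` class and `label_cues` function are hypothetical names invented for this example, not part of any caption tool. It also assumes a label is only needed when the speaker changes, which is one common choice rather than a fixed rule.

```python
from dataclasses import dataclass


@dataclass
class Cue:
    start: float   # cue start time in seconds
    end: float     # cue end time in seconds
    speaker: str   # speaker name, e.g. from manual tagging or diarization
    text: str      # caption text without any label


def label_cues(cues, style="brackets"):
    """Return caption lines with speaker labels applied.

    Assumption for this sketch: a label is added only when the speaker
    changes, and labels are skipped entirely for single-speaker content.
    """
    speakers = {cue.speaker for cue in cues}
    if len(speakers) <= 1:
        # Only one speaker: labels add noise, so return the text unchanged.
        return [cue.text for cue in cues]

    lines = []
    previous = None
    for cue in cues:
        if cue.speaker != previous:
            if style == "brackets":
                prefix = f"[{cue.speaker.upper()}] "
            elif style == "initial":
                prefix = f"{cue.speaker[0].upper()}: "
            else:
                raise ValueError(f"unknown label style: {style}")
            lines.append(prefix + cue.text)
        else:
            lines.append(cue.text)
        previous = cue.speaker
    return lines


cues = [
    Cue(0.0, 1.2, "John", "Hello."),
    Cue(1.2, 2.5, "John", "How are you?"),
    Cue(2.5, 3.4, "Mary", "Fine, thanks."),
]
for line in label_cues(cues):
    print(line)
```

With diarization output, the `speaker` field would usually hold a generic ID rather than a real name, so a mapping step from IDs to names would be needed before labeling.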

When to use it

Use speaker labels when there are multiple speakers and neither the audio nor the picture makes it clear who is speaking. They are required for SDH and other accessibility-grade captions.

Frequently asked

Should speaker labels go in brackets or at the line start?

Both are accepted. Netflix, for example, puts speaker IDs in brackets at the start of the line for accessibility (SDH) content, as in [JOHN] Hello. Short-form social video often skips text labels entirely and color-codes captions per speaker instead.
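One way to carry per-speaker color in a caption file is WebVTT, which has voice spans (`<v Name>`) and allows `::cue()` CSS rules in a `STYLE` block. The Python sketch below is only an illustration under those assumptions: `to_webvtt` and `vtt_timestamp` are hypothetical helper names, the colors are arbitrary, and player support for `STYLE` blocks varies, so test in the target player.

```python
def vtt_timestamp(seconds):
    """Format seconds as a WebVTT timestamp (HH:MM:SS.mmm)."""
    total_ms = round(seconds * 1000)
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, millis = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}.{millis:03d}"


def to_webvtt(cues, colors):
    """Build a WebVTT document with one voice span per cue.

    cues:   iterable of (start_seconds, end_seconds, speaker, text)
    colors: dict mapping speaker name to a CSS color (illustrative values)
    """
    lines = ["WEBVTT", ""]
    if colors:
        # The STYLE block must appear before the first cue.
        lines.append("STYLE")
        for speaker, color in colors.items():
            lines.append(f'::cue(v[voice="{speaker}"]) {{ color: {color}; }}')
        lines.append("")
    for start, end, speaker, text in cues:
        lines.append(f"{vtt_timestamp(start)} --> {vtt_timestamp(end)}")
        lines.append(f"<v {speaker}>{text}</v>")
        lines.append("")
    return "\n".join(lines)


print(to_webvtt(
    [(0.0, 1.5, "John", "Hello."), (1.5, 3.0, "Mary", "Hi, John.")],
    {"John": "yellow", "Mary": "cyan"},
))
```

Burned-in social captions are usually colored at render time instead, but the same speaker-to-color mapping applies.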
