
How long does it take to caption a video?

Short answer

AI captioning: 10–60 seconds for a 5–20 minute video. Manual captioning: 4–6 minutes per minute of video. Hybrid (AI + edit): 30 seconds plus 5–10% of video length.

Detail

Captioning time depends on the method. AI captioning with Whisper-class models takes roughly 5% of the video's runtime, so a 20-minute video takes about a minute. Manual captioning takes 4–6 minutes per minute of video for clean speech, and longer for accented or fast speech. The practical 2026 workflow is hybrid: let AI generate the first pass, then spend 5–10% of the video runtime hand-correcting proper nouns, technical terms, and any errors caused by audio artifacts.
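The arithmetic above can be turned into a rough estimator. This is a minimal sketch, assuming the rates quoted in the paragraph (5% of runtime for AI, a further 5–10% for hand-correction, 4–6 minutes per minute for manual work) and using the 5% AI figure for the hybrid first pass. The function and constant names are hypothetical and not part of any captioning tool.

```python
# Rates taken from the paragraph above; rough estimates, not measurements.
AI_FRACTION = 0.05            # AI transcription: ~5% of video runtime
EDIT_FRACTION = (0.05, 0.10)  # hand-correction: 5-10% of video runtime
MANUAL_RATE = (4.0, 6.0)      # manual: 4-6 minutes of work per video minute


def estimate_minutes(video_minutes: float) -> dict[str, tuple[float, float]]:
    """Return (low, high) time estimates in minutes for each method."""
    ai = video_minutes * AI_FRACTION
    return {
        "ai": (ai, ai),
        # Hybrid = AI first pass plus the hand-correction pass.
        "hybrid": (ai + video_minutes * EDIT_FRACTION[0],
                   ai + video_minutes * EDIT_FRACTION[1]),
        "manual": (video_minutes * MANUAL_RATE[0],
                   video_minutes * MANUAL_RATE[1]),
    }


if __name__ == "__main__":
    for method, (low, high) in estimate_minutes(20).items():
        print(f"{method}: {low:.1f} to {high:.1f} min")
```

For a 20-minute video, this gives about 1 minute for AI alone, 2 to 3 minutes for the hybrid workflow, and 80 to 120 minutes for manual captioning.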

| Method | Time per minute of video |
| --- | --- |
| AI captioning (Whisper) | ~3–5 seconds |
| AI + hand-edit | ~30–60 seconds |
| Manual transcription | 4–6 minutes |
| Professional service (Rev, 3Play) | Same-day human delivery |