Transcribe audio to text


Choose an audio file and get back text, timestamps, or subtitles (SRT/VTT). An on-device Whisper AI model transcribes it right in your browser, so your audio is never uploaded.

Transcribe audio to text in your browser

This tool turns an audio file into text — plain transcript, timestamped lines, or ready-to-use subtitles (SRT / VTT) — using OpenAI's Whisper model running directly on your device. Drop in an MP3, WAV, M4A, OGG, FLAC, or WebM file and get the text back without uploading anything. Your audio never leaves your browser; only the AI model is downloaded (once) from a CDN, then everything runs locally.

How it works

The tool runs an open-source speech-recognition model — Whisper (OpenAI) or the lightweight Moonshine (Useful Sensors), both MIT-licensed — in your browser through Transformers.js, inside a Web Worker so the page never freezes. Your file is decoded and down-sampled to 16 kHz mono audio, split into 30-second chunks, and transcribed chunk by chunk. You pick the model that matches your language and quality needs:

| Model | Languages | First download | Subtitles | Best for |
| --- | --- | --- | --- | --- |
| Fast (whisper-tiny.en) | English only | ~120 MB | Yes | Quick English drafts, low-power devices |
| Balanced (whisper-base) | Multilingual, incl. Japanese | ~200 MB | Yes | Everyday default |
| Accurate (whisper-large-v3-turbo) | Multilingual, incl. Japanese | ~760 MB | Yes | Highest quality; WebGPU recommended |
| Ultra-light (moonshine-tiny) | English only | ~75 MB | No | Short English clips, fastest, plain text only |
| Light (moonshine-base) | English only | ~155 MB | No | Short English clips, a bit more accurate |

The two Moonshine models (Useful Sensors, MIT) are an ultra-light option built for on-device English speech. They return plain text only — no timestamps, so no SRT/VTT — and are meant for short clips rather than long recordings. For Japanese, or when you need subtitles or long-form audio, use a Whisper model.
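The model choice described above can be sketched as a small lookup. This is only an illustration: the sizes and capabilities are copied from the table, while the `ModelInfo` type and the `candidateModels` function are hypothetical names, not part of the tool.

```typescript
// Model facts copied from the table above; all names here are hypothetical.
type ModelInfo = {
  label: string;
  id: string;
  multilingual: boolean;
  downloadMB: number; // approximate first-download size
  subtitles: boolean; // true if the model returns timestamps (SRT/VTT possible)
};

const MODELS: ModelInfo[] = [
  { label: "Fast", id: "whisper-tiny.en", multilingual: false, downloadMB: 120, subtitles: true },
  { label: "Balanced", id: "whisper-base", multilingual: true, downloadMB: 200, subtitles: true },
  { label: "Accurate", id: "whisper-large-v3-turbo", multilingual: true, downloadMB: 760, subtitles: true },
  { label: "Ultra-light", id: "moonshine-tiny", multilingual: false, downloadMB: 75, subtitles: false },
  { label: "Light", id: "moonshine-base", multilingual: false, downloadMB: 155, subtitles: false },
];

// Return the models that satisfy the request, smallest download first.
// English audio can use any model; other languages need a multilingual one.
function candidateModels(opts: { english: boolean; needSubtitles: boolean }): ModelInfo[] {
  return MODELS
    .filter((m) => opts.english || m.multilingual)
    .filter((m) => !opts.needSubtitles || m.subtitles)
    .sort((a, b) => a.downloadMB - b.downloadMB);
}
```

For Japanese audio with subtitles, `candidateModels({ english: false, needSubtitles: true })` leaves only the two multilingual Whisper models, which matches the advice above.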

Because the model executes locally:

  • Your audio never leaves your computer — nothing is sent to a server.
  • After the first download, the model is cached and works offline.
  • WebGPU browsers (recent Chrome, Edge) run much faster than the CPU (WebAssembly) fallback.
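As a rough illustration of the WebGPU versus WebAssembly choice, a page can feature-detect WebGPU by checking for a `gpu` property on `navigator`. The function name and the `"webgpu"` / `"wasm"` labels below are assumptions made for this sketch; the source does not show how the tool itself decides.

```typescript
type Device = "webgpu" | "wasm";

// `nav` is normally the browser's `navigator`; it is a parameter so the
// function can also be exercised outside a browser.
// Note: the presence of `gpu` only means the API exists. A real page would
// also call `navigator.gpu.requestAdapter()` and fall back if it resolves to null.
function pickDevice(nav: object | undefined): Device {
  return nav !== undefined && "gpu" in nav ? "webgpu" : "wasm";
}
```

In a browser this would be called as `pickDevice(navigator)`; anything without WebGPU ends up on the slower CPU (WebAssembly) path.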

Steps

  1. Drop an audio file onto the upload area (or click to choose one).
  2. Pick a model — Balanced is a good multilingual default; use Accurate for the best Japanese quality, or Fast for quick English.
  3. For multilingual models, choose the language (or leave it on Auto-detect).
  4. Click Transcribe. On the first run of each model, the browser downloads it — you'll see a progress percentage.
  5. When it finishes, switch between Text, Timestamped, SRT, and VTT.
  6. Copy or Download the format you need.

Example: upload a 10-minute interview recording (interview.m4a) → download interview.srt, a subtitle file you can load straight into a video editor.
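The file handling in the example above can be sketched as follows. The extension list comes from the supported formats named earlier; the helper names are hypothetical, and the tool may well inspect the file contents rather than the name.

```typescript
// Extensions of the formats listed on this page (MP3, WAV, M4A, OGG, FLAC, WebM).
const SUPPORTED_EXTENSIONS = ["mp3", "wav", "m4a", "ogg", "flac", "webm"];

// Lower-cased extension without the dot, or "" if the name has none.
function extensionOf(fileName: string): string {
  const dot = fileName.lastIndexOf(".");
  return dot <= 0 ? "" : fileName.slice(dot + 1).toLowerCase();
}

function isSupportedAudio(fileName: string): boolean {
  return SUPPORTED_EXTENSIONS.indexOf(extensionOf(fileName)) !== -1;
}

// Swap the audio extension for the chosen output format.
function outputName(fileName: string, format: "txt" | "srt" | "vtt"): string {
  const dot = fileName.lastIndexOf(".");
  const base = dot <= 0 ? fileName : fileName.slice(0, dot);
  return `${base}.${format}`;
}
```

With these helpers, `outputName("interview.m4a", "srt")` gives `interview.srt`.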

Output formats

| Format | Contains | Best for |
| --- | --- | --- |
| Text | Plain transcript, no timings | Notes, articles, copy-paste |
| Timestamped | [start → end] text per segment | Skimming, meeting minutes, quoting |
| SRT | Numbered subtitle cues with "," millisecond separator | Video editors, most players |
| VTT | WebVTT cues with "." millisecond separator | HTML5 <track>, web video |
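To make the difference between the two subtitle formats concrete, here is a minimal sketch that renders the same segments as SRT and as VTT. The `Segment` shape (start and end in seconds) and the function names are assumptions for illustration, not the tool's actual code.

```typescript
type Segment = { start: number; end: number; text: string }; // times in seconds

// Left-pad a number with zeros to the given width.
function pad(n: number, width: number): string {
  let s = String(n);
  while (s.length < width) s = "0" + s;
  return s;
}

// Format seconds as HH:MM:SS<sep>mmm; SRT uses "," and VTT uses ".".
function formatTime(seconds: number, msSeparator: "," | "."): string {
  const totalMs = Math.round(seconds * 1000);
  const ms = totalMs % 1000;
  const totalSec = Math.floor(totalMs / 1000);
  const s = totalSec % 60;
  const m = Math.floor(totalSec / 60) % 60;
  const h = Math.floor(totalSec / 3600);
  return `${pad(h, 2)}:${pad(m, 2)}:${pad(s, 2)}${msSeparator}${pad(ms, 3)}`;
}

// SRT: numbered cues separated by a blank line.
function toSrt(segments: Segment[]): string {
  return segments
    .map((seg, i) =>
      `${i + 1}\n${formatTime(seg.start, ",")} --> ${formatTime(seg.end, ",")}\n${seg.text.trim()}\n`)
    .join("\n");
}

// VTT: a "WEBVTT" header line, then unnumbered cues separated by a blank line.
function toVtt(segments: Segment[]): string {
  const cues = segments.map((seg) =>
    `${formatTime(seg.start, ".")} --> ${formatTime(seg.end, ".")}\n${seg.text.trim()}\n`);
  return `WEBVTT\n\n${cues.join("\n")}`;
}
```

A segment from 0 s to 1.5 s becomes `00:00:00,000 --> 00:00:01,500` in SRT and `00:00:00.000 --> 00:00:01.500` in VTT; apart from the header and cue numbers, the separator is the main visible difference.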

When to use this vs. a server tool

| Situation | Best choice |
| --- | --- |
| Sensitive or private recordings | This tool: the audio never leaves your browser |
| No account / no upload wanted | This tool: fully client-side, free |
| Subtitles for a video | This tool: export SRT or VTT directly |
| Hundreds of hours, automated pipeline | A server/API tool: batch throughput beyond one browser |

Tips for the best transcript

  • Clear speech and low background noise transcribe most accurately.
  • For Japanese or mixed-language audio, prefer the Accurate model and set the language explicitly.
  • If the first run feels slow, that's the one-time model download; the next file is much faster.
  • Long files take longer because audio is processed in 30-second chunks — a WebGPU browser helps a lot here.
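To see why long files take longer, the chunking step can be sketched roughly as below. It assumes the audio has already been decoded and resampled to 16 kHz (in a browser, one common way is the Web Audio API's `decodeAudioData`); the function names are hypothetical, and the real tool may overlap neighbouring chunks, which this sketch does not.

```typescript
const SAMPLE_RATE = 16000; // 16 kHz, as described under "How it works"
const CHUNK_SECONDS = 30;

// Average all channels into a single mono signal.
function toMono(channels: Float32Array[]): Float32Array {
  const length = channels.length > 0 ? channels[0].length : 0;
  const mono = new Float32Array(length);
  for (let c = 0; c < channels.length; c++) {
    const ch = channels[c];
    for (let i = 0; i < length; i++) mono[i] += ch[i] / channels.length;
  }
  return mono;
}

// Split mono samples into consecutive 30-second windows; the last may be shorter.
function splitIntoChunks(
  samples: Float32Array,
  sampleRate: number = SAMPLE_RATE,
  chunkSeconds: number = CHUNK_SECONDS
): Float32Array[] {
  const size = sampleRate * chunkSeconds;
  const chunks: Float32Array[] = [];
  for (let start = 0; start < samples.length; start += size) {
    chunks.push(samples.subarray(start, Math.min(start + size, samples.length)));
  }
  return chunks;
}

// Number of chunks the model has to process for a recording of this length.
function chunkCount(durationSeconds: number, chunkSeconds: number = CHUNK_SECONDS): number {
  return Math.ceil(durationSeconds / chunkSeconds);
}
```

A 10-minute recording yields `chunkCount(600)`, that is 20 chunks, each transcribed in turn, so the total time grows with the length of the file.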

Everything here runs in your browser. Your audio is never uploaded — that's the whole point.
