Transcribe a podcast to text
Turn any podcast episode into text by pasting the show's RSS feed URL or a direct audio link. The episode audio is streamed to your browser and transcribed on-device with a Whisper AI model, so you get plain text, timestamps, and subtitles (SRT/VTT) without uploading anything.
How it works
You give the tool a URL, it gets the audio, and Whisper turns speech into text right in your browser:
- Paste a podcast RSS feed URL and choose an episode, or paste a direct audio URL (.mp3, .m4a, .aac, .wav, …) to skip straight to transcription.
- The episode audio is streamed through a same-origin proxy (needed only to get past the CORS rules on podcast CDNs) and decoded to 16 kHz mono.
- The audio is split into ~2-minute parts and transcribed part-by-part, so the transcript streams into view instead of making you wait for the whole episode.
- Copy or download the result as plain text, timestamped lines, SRT, or WebVTT.
The transcription model runs entirely on your device. Nothing is stored on our servers.
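The splitting step above can be illustrated with a short sketch. It assumes the decoded audio is already available as a 16 kHz mono `Float32Array`; the names `AudioChunk`, `chunkCount`, and `splitIntoChunks` are hypothetical and are not taken from the tool's actual code.

```typescript
// Values taken from the page: audio is decoded to 16 kHz mono
// and split into ~2-minute parts.
const SAMPLE_RATE = 16000;
const CHUNK_SECONDS = 120;

// Hypothetical shape for one part; the tool's real data structures are not documented.
interface AudioChunk {
  index: number;
  startSec: number;
  endSec: number;
  samples: Float32Array;
}

// Number of parts needed to cover a given duration (the last part may be shorter).
function chunkCount(durationSec: number, chunkSeconds: number = CHUNK_SECONDS): number {
  return Math.ceil(durationSec / chunkSeconds);
}

// Split mono PCM samples into consecutive fixed-length parts.
function splitIntoChunks(
  pcm: Float32Array,
  sampleRate: number = SAMPLE_RATE,
  chunkSeconds: number = CHUNK_SECONDS
): AudioChunk[] {
  const chunkLen = sampleRate * chunkSeconds;
  const chunks: AudioChunk[] = [];
  for (let start = 0; start < pcm.length; start += chunkLen) {
    const end = Math.min(start + chunkLen, pcm.length);
    chunks.push({
      index: chunks.length,
      startSec: start / sampleRate,
      endSec: end / sampleRate,
      samples: pcm.subarray(start, end), // a view into the buffer, not a copy
    });
  }
  return chunks;
}
```

Each part carries its own start and end offsets, so a transcriber working part-by-part could shift its timestamps back onto the episode timeline before showing them.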
Steps
- Find the podcast's RSS feed URL (most shows and directories link it) or a direct link to the episode's audio file.
- Paste it into the tool and press Load.
- Pick the episode you want (skipped automatically for a direct audio URL).
- Choose a model and, for multilingual models, the spoken language.
- Press Transcribe and watch the text stream in.
- Switch to the SRT or VTT tab if you need subtitles, then copy or download.
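For readers curious how timestamped text becomes subtitles, here is a minimal sketch of SRT and WebVTT serialization. The `Segment` shape and the function names are assumptions for illustration; the tool's own export code may differ. The two formats differ mainly in the millisecond separator (comma in SRT, period in WebVTT), the numbered cues in SRT, and the `WEBVTT` header line.

```typescript
// Hypothetical segment shape: start/end in seconds plus the recognized text.
interface Segment {
  start: number;
  end: number;
  text: string;
}

// Left-pad a non-negative integer with zeros.
function pad(n: number, width: number): string {
  let s = String(n);
  while (s.length < width) s = "0" + s;
  return s;
}

// Format seconds as HH:MM:SS<sep>mmm.
function formatTimestamp(seconds: number, msSeparator: "," | "."): string {
  const totalMs = Math.round(seconds * 1000);
  const ms = totalMs % 1000;
  const totalSec = Math.floor(totalMs / 1000);
  const s = totalSec % 60;
  const m = Math.floor(totalSec / 60) % 60;
  const h = Math.floor(totalSec / 3600);
  return pad(h, 2) + ":" + pad(m, 2) + ":" + pad(s, 2) + msSeparator + pad(ms, 3);
}

// SRT: numbered cues, comma before milliseconds, blank line between cues.
function toSrt(segments: Segment[]): string {
  return segments
    .map((seg, i) =>
      (i + 1) + "\n" +
      formatTimestamp(seg.start, ",") + " --> " + formatTimestamp(seg.end, ",") + "\n" +
      seg.text.trim() + "\n")
    .join("\n");
}

// WebVTT: "WEBVTT" header, period before milliseconds, cue numbers omitted.
function toVtt(segments: Segment[]): string {
  const cues = segments.map((seg) =>
    formatTimestamp(seg.start, ".") + " --> " + formatTimestamp(seg.end, ".") + "\n" +
    seg.text.trim() + "\n");
  return "WEBVTT\n\n" + cues.join("\n");
}
```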
RSS feed vs. direct audio URL
| Input | What happens | Best for |
|---|---|---|
| RSS feed URL | Lists every episode with downloadable audio; you pick one | Browsing a show and choosing an episode |
| Direct audio URL (.mp3, .m4a, …) | Skips the list and transcribes that file immediately | You already have the episode's audio link |
Spotify and Apple Podcasts page links don't work: Spotify doesn't expose a downloadable audio file, and Apple page links aren't RSS feeds. Use the RSS feed URL or a direct audio URL instead.
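The distinction in the table can be sketched as a small classifier. This is only an illustration under stated assumptions: the extension list covers just the four extensions the page names (the page's own list ends in "…"), the host check for `open.spotify.com` and `podcasts.apple.com` is a guess at how page links might be recognized, and `classifyInput` is a hypothetical name. A real tool would likely also inspect the response's content type rather than trust the URL alone.

```typescript
type InputKind = "audio" | "feed" | "unsupported";

// Only the extensions named on the page; the real list may be longer.
const AUDIO_EXTENSIONS = [".mp3", ".m4a", ".aac", ".wav"];

// Assumption: page links from these hosts are neither RSS feeds nor audio files.
const UNSUPPORTED_HOSTS = ["open.spotify.com", "podcasts.apple.com"];

function classifyInput(raw: string): InputKind {
  let url: URL;
  try {
    url = new URL(raw.trim());
  } catch (e) {
    return "unsupported"; // not a parseable absolute URL
  }
  if (url.protocol !== "http:" && url.protocol !== "https:") return "unsupported";
  if (UNSUPPORTED_HOSTS.indexOf(url.hostname) !== -1) return "unsupported";

  // Compare against the path only, so query strings do not hide the extension.
  const path = url.pathname.toLowerCase();
  for (const ext of AUDIO_EXTENSIONS) {
    if (path.slice(-ext.length) === ext) return "audio";
  }
  // Anything else is treated as a candidate RSS feed and left for the feed parser to validate.
  return "feed";
}
```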
Which model should I choose?
| Model | Languages | First download | Best for |
|---|---|---|---|
| Balanced (Whisper base) | Multilingual, incl. Japanese | ~200 MB | The default for most shows |
| Accurate (Whisper large-v3-turbo) | Multilingual, incl. Japanese | ~760 MB | Highest quality; use a WebGPU browser |
| Fast (Whisper tiny.en) | English only | ~120 MB | Quick drafts of English shows |
The model downloads once on first use and is cached for later runs.
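The table can also be read as a simple decision rule. The sketch below encodes the table's rows as data and suggests a model from a language code, a stated priority, and whether WebGPU is available. The selection policy, the `Priority` values, and the name `suggestModel` are assumptions made for illustration, not the tool's actual logic; the download sizes are the approximate figures from the table.

```typescript
type ModelKey = "fast" | "balanced" | "accurate";
type Priority = "speed" | "default" | "quality";

interface ModelInfo {
  whisper: string;          // Whisper variant named in the table
  multilingual: boolean;
  approxDownloadMB: number; // approximate first-download size from the table
}

const MODELS: Record<ModelKey, ModelInfo> = {
  fast: { whisper: "tiny.en", multilingual: false, approxDownloadMB: 120 },
  balanced: { whisper: "base", multilingual: true, approxDownloadMB: 200 },
  accurate: { whisper: "large-v3-turbo", multilingual: true, approxDownloadMB: 760 },
};

// Hypothetical policy: English quick drafts get the fast model, quality runs
// get the accurate model only when WebGPU is available, everything else
// falls back to the balanced default.
function suggestModel(language: string, priority: Priority, hasWebGPU: boolean): ModelKey {
  if (priority === "speed" && language === "en") return "fast";
  if (priority === "quality" && hasWebGPU) return "accurate";
  return "balanced";
}
```

In a browser, the `hasWebGPU` flag could be derived from whether `navigator.gpu` is defined; it is passed in as a plain boolean here so the rule stays easy to test.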
Example
Paste a show's RSS feed such as https://feeds.example.com/my-show.xml, choose the latest episode, keep the Balanced model with language set to Auto-detect, and press Transcribe. A one-hour episode streams in as roughly thirty 2-minute parts (60 minutes ÷ 2 minutes per part). When it finishes, switch to the SRT tab and download subtitles ready to drop into a video editor.
Privacy
The only server involvement is a thin proxy that streams the episode file to your browser, which is required because podcast CDNs block cross-origin browser downloads. The audio is never persisted, and the transcription happens entirely on your device. A recent Chrome or Edge (WebGPU) is fastest, and a desktop browser is recommended for very long episodes.
