ChatGPT Tokenizer

Count the exact number of tokens your text uses for OpenAI models, with a colored token view. Everything runs in your browser using OpenAI's tiktoken encodings; your text is never uploaded.

The ChatGPT Tokenizer counts exactly how many tokens your text uses for OpenAI models. Paste any text and it returns the precise token count, measured in the same unit the API bills in, plus a colored, token-by-token view so you can see where the model splits your words. It runs entirely in your browser using the same byte-pair encodings as tiktoken (OpenAI's tokenizer library), so it is exact, instant, free, and private.

What is a token?

A token is the unit in which OpenAI models read text and in which OpenAI bills for it. It is usually a short chunk of a word: sometimes a whole short word, sometimes a few characters, sometimes just a space plus the start of a word. A rough rule of thumb for English is 1 token ≈ 4 characters ≈ 0.75 words, but the only accurate way to know is to run the real encoder, which is what this tool does.
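To make the rule of thumb concrete, here is a minimal Python sketch that turns it into two rough estimates. The function name `estimate_tokens` is hypothetical, and the result is only an approximation for English text; it is not what the real encoder returns.

```python
import math


def estimate_tokens(text: str) -> dict:
    """Rough English-only estimate; NOT a substitute for the real encoder."""
    chars = len(text)
    words = len(text.split())
    return {
        # Rule of thumb: 1 token is roughly 4 characters.
        "by_chars": math.ceil(chars / 4),
        # Rule of thumb: 1 token is roughly 0.75 words.
        "by_words": math.ceil(words / 0.75),
    }


print(estimate_tokens("Tokenization is fun!"))
# prints: {'by_chars': 5, 'by_words': 4}
```

The two estimates already disagree on a three-word sentence, which is why a real encoder is needed for anything that affects cost or context limits.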

Which encoding does my model use?

OpenAI models share a small number of encodings. Pick the one that matches your model:

| Encoding | Models | Typical use |
| --- | --- | --- |
| o200k_base | GPT-4o, GPT-4o mini, GPT-4.1, o1 / o3 / o4 (and newer) | Current chat and reasoning models; newest, most efficient tokenizer |
| cl100k_base | GPT-4, GPT-4 Turbo, GPT-3.5 Turbo, text-embedding-3, text-embedding-ada-002 | Previous-generation chat models and the current embedding models |

If you are unsure, use o200k_base — it powers the models most people use today (GPT-4o and GPT-4.1).
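The table above can be expressed as a small lookup. The following Python sketch is illustrative only: the helper `encoding_for` and the prefix list are assumptions built from the table, not part of any official library, and unknown models fall back to o200k_base as suggested above.

```python
# Model-name prefixes taken from the table above; the helper is hypothetical.
ENCODING_BY_PREFIX = {
    "gpt-4o": "o200k_base",
    "gpt-4.1": "o200k_base",
    "o1": "o200k_base",
    "o3": "o200k_base",
    "o4": "o200k_base",
    "gpt-4": "cl100k_base",
    "gpt-3.5-turbo": "cl100k_base",
    "text-embedding-3": "cl100k_base",
    "text-embedding-ada-002": "cl100k_base",
}


def encoding_for(model: str, default: str = "o200k_base") -> str:
    """Return the encoding for the longest matching prefix, else the default."""
    name = model.lower()
    matches = [prefix for prefix in ENCODING_BY_PREFIX if name.startswith(prefix)]
    if not matches:
        return default
    # Longest prefix wins, so "gpt-4o-mini" matches "gpt-4o" rather than "gpt-4".
    return ENCODING_BY_PREFIX[max(matches, key=len)]


print(encoding_for("gpt-4o-mini"))   # o200k_base
print(encoding_for("gpt-4-turbo"))   # cl100k_base
```

Longest-prefix matching matters here because "gpt-4o" and "gpt-4.1" both begin with "gpt-4" yet use a different encoding.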

How to count your tokens

  1. Paste or type your text into the box.
  2. Choose the model family (o200k_base for GPT-4o / GPT-4.1 / o-series, cl100k_base for GPT-4 / GPT-3.5 Turbo).
  3. Read the token count at the top, alongside characters, words, and characters-per-token.
  4. Scan the colored chips below to see exactly how the text is split into tokens; toggle Show token IDs to view the raw integer id of each token.
  5. Click Copy count to copy just the number, or Copy token IDs to copy the full id list.

Example: input → output

Input:

Tokenization is fun!

With o200k_base this encodes to 5 tokens: "Token", "ization", " is", " fun", "!". Notice that "Tokenization" splits into two tokens and that the leading space is part of the " is" and " fun" tokens, which is why token counts don't line up with word counts.
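A short Python sketch of the same example, using the token pieces listed above rather than running an encoder. It checks that the pieces join back into the input and computes the figures the tool shows next to the count; the variable names are illustrative.

```python
text = "Tokenization is fun!"
# Token pieces as listed in the example above (o200k_base), written with
# their leading spaces made explicit.
pieces = ["Token", "ization", " is", " fun", "!"]

# Joining the pieces must give back the original text exactly.
assert "".join(pieces) == text

tokens = len(pieces)       # 5
words = len(text.split())  # 3
chars = len(text)          # 20
print(f"{tokens} tokens, {words} words, {chars / tokens:.1f} chars per token")
# prints: 5 tokens, 3 words, 4.0 chars per token
```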

Why count tokens?

  • Cost: OpenAI bills per token, so the token count, not the character or word count, is what determines your cost.
  • Context limits: Every model has a maximum context window measured in tokens. Counting first tells you whether a prompt plus its expected response will fit.
  • Prompt engineering: Trimming a prompt from 1,200 to 800 tokens is a measurable win you can see live as you edit.
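The three points above come down to simple arithmetic once you have a token count. The Python sketch below is a rough illustration: the function names are hypothetical, and the prices and context window are placeholder assumptions, not real figures for any model. It also assumes input and output tokens are priced separately per million tokens.

```python
def estimate_cost(input_tokens: int, output_tokens: int,
                  input_price_per_m: float, output_price_per_m: float) -> float:
    """Cost in dollars, given prices per one million tokens."""
    return (input_tokens * input_price_per_m
            + output_tokens * output_price_per_m) / 1_000_000


def fits_context(prompt_tokens: int, max_output_tokens: int,
                 context_window: int) -> bool:
    """True if the prompt plus the reserved response budget fits the window."""
    return prompt_tokens + max_output_tokens <= context_window


# Placeholder numbers for illustration only; check current pricing and limits.
INPUT_PRICE, OUTPUT_PRICE = 2.00, 8.00  # dollars per 1M tokens (assumed)
CONTEXT_WINDOW = 128_000                # tokens (assumed)

# Trimming a prompt from 1,200 to 800 tokens, with a 500-token response.
before = estimate_cost(1_200, 500, INPUT_PRICE, OUTPUT_PRICE)
after = estimate_cost(800, 500, INPUT_PRICE, OUTPUT_PRICE)
print(f"per-request cost: ${before:.4f} -> ${after:.4f}")
print("fits:", fits_context(800, 500, CONTEXT_WINDOW))
```

The saving per request is small, but it scales linearly with the number of requests, which is where trimming a prompt pays off.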

Is it exact and private?

Yes to both. The tool uses OpenAI's real tiktoken encodings (o200k_base and cl100k_base) through the open-source gpt-tokenizer library, not a "divide by four" estimate, so the count for your text matches what the API's tokenizer produces. (A chat request adds a few formatting tokens per message on top of the text itself.) The tokenizer code is loaded once from a public CDN and then runs on your device; the text you paste is never uploaded, which makes it safe for private prompts and confidential data.

A small note on the colored view: some tokens are partial byte sequences of a multibyte character (common with emoji and Japanese or other non-Latin scripts). On its own, such a fragment shows a � replacement character, but the neighboring tokens still reconstruct the correct text and the count stays exact.
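The effect is easy to reproduce with plain UTF-8 handling. In this Python sketch the emoji is split into two byte fragments by hand to stand in for two tokens; where a real encoding places the boundary is an assumption here, but the decoding behavior is the same.

```python
emoji = "😀"
raw = emoji.encode("utf-8")       # 4 bytes: f0 9f 98 80
first, second = raw[:2], raw[2:]  # pretend each half is its own token

# Decoded alone, each fragment is invalid UTF-8 and renders as U+FFFD.
print(first.decode("utf-8", errors="replace"))
print(second.decode("utf-8", errors="replace"))

# Concatenating the byte fragments restores the original character.
restored = (first + second).decode("utf-8")
assert restored == emoji
print(restored)
```

This is why the colored view can show a replacement character for a single chip while the full text, and the token count, remain correct.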

Operated by

unbounded pioneering inc
Turnint AI

Turnint AI Tools is a suite of free tools built and operated by unbounded pioneering inc, the company behind the Turnint AI agent platform.

Ryosuke Suzuki, Founder & CEO

Founder & CEO of Unbounded Pioneering Inc., the company behind the Turnint AI agent platform, and an expert in machine learning and AI product development. He began his career in machine learning research at a university laboratory, then designed and built large-scale products as a software engineer at PLAID, Rakuten, and Recruit, while also driving new business development. Now specializing in generative AI and AI agents, he works across both engineering and business development, and is a named inventor on multiple granted patents in web technology.

Named inventor on granted patents JP6887648 & JP7480958 · Patent pending on Turnint AI technology
