Is the token count exact?

Yes. It uses OpenAI's own tiktoken byte-pair encodings (o200k_base and cl100k_base) via the open-source gpt-tokenizer library, so the count matches what the API charges — not an estimate like 'characters ÷ 4'.

Which encoding should I pick?

Pick o200k_base for GPT-4o, GPT-4o mini, GPT-4.1, and the o1/o3/o4 reasoning models. Pick cl100k_base for GPT-4, GPT-4 Turbo, GPT-3.5 Turbo, and the text-embedding-3 / ada-002 embedding models.

Is my text uploaded to a server?

No. The tokenizer is loaded once from a public CDN and then runs entirely in your browser. The text you paste is never uploaded or sent to an API, so it is safe for prompts and private data.

Why does one token sometimes show a � character?

Some tokens are partial byte sequences of a multibyte character (common with emoji and non-Latin scripts). Shown alone, that fragment decodes to a replacement character, but the surrounding tokens still combine into the correct text — the count stays exact.

How many tokens is a word or a page?

For English, roughly 1 token ≈ 0.75 words (about 4 characters). A page of English is often 400–600 tokens. Japanese and other non-Latin text usually uses more tokens per character. Use the live chars-per-token readout to gauge your own text.

Is it free?

Yes, it is completely free and instant. No sign-up and no installation required.

ChatGPT Tokenizer — count OpenAI tokens, free & exact.

The ChatGPT Tokenizer counts exactly how many tokens your text uses for OpenAI models. Paste any text and it returns the precise token count — the same number the API bills you for — plus a colored, token-by-token view so you can see where the model splits your words. It runs entirely in your browser using tiktoken (OpenAI's own byte-pair encoding), so it is exact, instant, free, and private.

What is a token?

A token is the unit OpenAI models read and are billed in. It is usually a short chunk of a word — sometimes a whole short word, sometimes a few characters, sometimes just a space plus the start of a word. A rough rule of thumb for English is 1 token ≈ 4 characters ≈ 0.75 words, but the only accurate way to know is to run the real encoder, which is what this tool does.
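To make the rule of thumb concrete, here is a minimal Python sketch of the two estimates (characters ÷ 4 and words ÷ 0.75). The function name `estimate_tokens` is hypothetical and is not part of this tool; it only illustrates the approximation that the real encoder replaces.

```python
def estimate_tokens(text: str) -> dict:
    """Rough English-only estimates; not a substitute for the real encoder."""
    chars = len(text)
    words = len(text.split())  # whitespace-separated words
    return {
        "by_chars": round(chars / 4),     # rule of thumb: 1 token ≈ 4 characters
        "by_words": round(words / 0.75),  # rule of thumb: 1 token ≈ 0.75 words
    }


sample = "Tokenization is fun!"
print(estimate_tokens(sample))  # {'by_chars': 5, 'by_words': 4}
```

The two estimates already disagree on a three-word sentence, which is why the tool runs the actual encoding instead of guessing.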

Which encoding does my model use?

OpenAI models share a small number of encodings. Pick the one that matches your model:

Encoding	Models	Typical use
`o200k_base`	GPT-4o, GPT-4o mini, GPT-4.1, o1 / o3 / o4 (and newer)	Current chat and reasoning models; newest, most efficient tokenizer
`cl100k_base`	GPT-4, GPT-4 Turbo, GPT-3.5 Turbo, text-embedding-3, text-embedding-ada-002	Previous-generation chat models and the current embedding models

If you are unsure, use o200k_base — it powers the models most people use today (GPT-4o and GPT-4.1).
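If you want the same choice in a script, the table above can be sketched as a small lookup. This is an illustration only: the function name `encoding_for` and the prefix-matching approach are assumptions, not how this tool or any OpenAI library resolves model names.

```python
# (prefix, encoding) pairs taken from the table above.
# Most specific prefixes come first so "gpt-4o" and "gpt-4.1" win over "gpt-4".
PREFIX_TO_ENCODING = [
    ("gpt-4o", "o200k_base"),
    ("gpt-4.1", "o200k_base"),
    ("o1", "o200k_base"),
    ("o3", "o200k_base"),
    ("o4", "o200k_base"),
    ("gpt-4", "cl100k_base"),
    ("gpt-3.5-turbo", "cl100k_base"),
    ("text-embedding-3", "cl100k_base"),
    ("text-embedding-ada-002", "cl100k_base"),
]


def encoding_for(model: str, default: str = "o200k_base") -> str:
    """Return the encoding name for a model id, falling back to the default."""
    name = model.strip().lower()
    for prefix, encoding in PREFIX_TO_ENCODING:
        if name.startswith(prefix):
            return encoding
    return default  # "if you are unsure, use o200k_base"


print(encoding_for("gpt-4o-mini"))  # o200k_base
print(encoding_for("gpt-4-turbo"))  # cl100k_base
```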

How to count your tokens

Paste or type your text into the box.
Choose the model family (o200k_base for GPT-4o / GPT-4.1 / o-series, cl100k_base for GPT-4 / GPT-3.5 Turbo).
Read the token count at the top, alongside characters, words, and characters-per-token.
Scan the colored chips below to see exactly how the text is split into tokens; toggle Show token IDs to view the raw integer id of each token.
Click Copy count to copy just the number, or Copy token IDs to copy the full id list.

Example: input → output

Input:

Tokenization is fun!

With o200k_base this encodes to 5 tokens: "Token", "ization", " is", " fun", "!". Notice that "Tokenization" splits into two tokens and that the leading space is part of the " is" and " fun" tokens, which is why token counts don't line up with word counts.
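A short Python sketch can show why the split matters. It takes the five-token split stated above as given (it does not run the encoder) and checks that the pieces, with their leading spaces, join back into the original text.

```python
text = "Tokenization is fun!"

# The split described above, written out by hand; note the leading spaces.
tokens = ["Token", "ization", " is", " fun", "!"]

# Concatenating the token strings reproduces the input exactly.
assert "".join(tokens) == text

token_count = len(tokens)
word_count = len(text.split())
print(token_count, word_count)  # 5 3
```

Three words become five tokens because one word is split in two and the punctuation mark is its own token.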

Why count tokens?

Cost: OpenAI bills per token, so token count is your true cost driver — far more accurate than counting characters.
Context limits: Every model has a maximum context window measured in tokens. Counting first tells you whether a prompt plus its expected response will fit.
Prompt engineering: Trimming a prompt from 1,200 to 800 tokens is a measurable win you can see live as you edit.
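The cost and context-limit points above come down to simple arithmetic once you have a token count. The sketch below uses hypothetical numbers throughout: the 8,192-token window and the price of 2.50 USD per million tokens are placeholders, not real limits or prices for any model.

```python
def fits_context(prompt_tokens: int, max_response_tokens: int, context_window: int) -> bool:
    """True if the prompt plus the expected response fits in the window."""
    return prompt_tokens + max_response_tokens <= context_window


def input_cost_usd(tokens: int, usd_per_million_tokens: float) -> float:
    """Cost of sending `tokens` input tokens at a given per-million price."""
    return tokens * usd_per_million_tokens / 1_000_000


CONTEXT_WINDOW = 8_192   # hypothetical window size
PRICE_PER_MILLION = 2.50  # hypothetical input price in USD

print(fits_context(1_200, 500, CONTEXT_WINDOW))  # True

# Savings from trimming a prompt from 1,200 to 800 tokens, per request.
before = input_cost_usd(1_200, PRICE_PER_MILLION)
after = input_cost_usd(800, PRICE_PER_MILLION)
print(f"saved per request: ${before - after:.4f}")
```

Check your model's actual context window and current pricing before relying on numbers like these.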

Is it exact and private?

Yes to both. The tool uses OpenAI's real tiktoken encodings (o200k_base and cl100k_base) through the open-source gpt-tokenizer library — not a "divide by four" estimate — so the count matches the API. The tokenizer code is loaded once from a public CDN and then runs on your device; the text you paste is never uploaded, which makes it safe for private prompts and confidential data.

A small note on the colored view: some tokens are partial byte sequences of a multibyte character (common with emoji and Japanese or other non-Latin scripts). On its own, such a fragment shows a � replacement character, but the neighboring tokens still reconstruct the correct text and the count stays exact.
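You can reproduce the effect with plain UTF-8 bytes, no tokenizer required. In this sketch the split after the second byte is an arbitrary assumption for illustration; where a real encoding cuts a character depends on its merge table.

```python
emoji = "😀"
raw = emoji.encode("utf-8")  # a single character stored as 4 bytes

# Pretend two neighboring tokens each hold half of the character's bytes.
first, second = raw[:2], raw[2:]

# Decoded alone, each fragment is invalid UTF-8 and shows U+FFFD (�).
first_alone = first.decode("utf-8", errors="replace")
second_alone = second.decode("utf-8", errors="replace")
print(first_alone, second_alone)

# Decoded together, the bytes form the original character again.
rejoined = (first + second).decode("utf-8")
print(rejoined)
```

Nothing is lost or miscounted here: the fragments are only unreadable when displayed one at a time.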

Operated by

Turnint AI Tools is a suite of free tools built and operated by Unbounded Pioneering Inc., the company behind the Turnint AI agent platform.

Ryosuke Suzuki, Founder & CEO

Founder & CEO of Unbounded Pioneering Inc., the company behind the Turnint AI agent platform, and an expert in machine learning and AI product development. He began his career in machine learning research at a university laboratory, then designed and built large-scale products as a software engineer at PLAID, Rakuten, and Recruit, while also driving new business development. Now specializing in generative AI and AI agents, he works across both engineering and business development, and is a named inventor on multiple granted patents in web technology.

Named inventor on granted patents JP6887648 & JP7480958 · Patent pending on Turnint AI technology
