It is a well-known fact that different model families ship with different tokenizers. However, limited analysis has been done on how the tokenization process itself varies across these tokenizers. Do all tokenizers produce a similar number of tokens for a given input text? If not, how different are the generated tokens? And how much do those differences matter?
In this article, we explore these questions and examine the practical implications of tokenizer variability. We present a comparative story of two frontier model families: OpenAI's ChatGPT versus Anthropic's Claude. Although their advertised cost-per-token figures are highly competitive, experiments suggest that Anthropic models can be 20-30% more expensive than GPT models.
API pricing: Claude 3.5 Sonnet vs. GPT-4o
As of June 2024, the pricing structures of these two flagship frontier models are highly competitive. Anthropic's Claude 3.5 Sonnet and OpenAI's GPT-4o have the same cost for output tokens, while Claude 3.5 Sonnet offers a 40% lower cost for input tokens.
(Figure: API pricing comparison, Claude 3.5 Sonnet vs. GPT-4o)
The hidden “tokenizer inefficiency”
Despite the Anthropic model's lower input token rate, we noticed that the total cost of running our experiments (on a given set of fixed prompts) was much cheaper with GPT-4o.
Why?
Anthropic's tokenizer tends to break the same input into more tokens than OpenAI's tokenizer. This means that, for identical prompts, Anthropic models produce considerably more tokens than their OpenAI counterparts. Consequently, while Claude 3.5 Sonnet's per-token cost for input may be lower, the increased tokenization can offset these savings, leading to higher overall costs in practical use cases.
This hidden cost stems from the way Anthropic's tokenizer encodes information, often using more tokens to represent the same content. This token count inflation has a significant impact on both cost and context window utilization.
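As a minimal sketch of how this can be measured, the snippet below counts the same prompt with OpenAI's tiktoken library and with Anthropic's token-counting endpoint (discussed later in this article). It assumes tiktoken and the anthropic Python SDK are installed and an ANTHROPIC_API_KEY is set; the prompt and model name are illustrative, not the exact inputs used in our experiments.

```python
# A minimal sketch: count the same prompt with both providers' tooling.
# Assumes `pip install tiktoken anthropic` and ANTHROPIC_API_KEY in the environment.
import tiktoken
import anthropic

prompt = "def fibonacci(n):\n    return n if n < 2 else fibonacci(n - 1) + fibonacci(n - 2)"

# GPT-4o side: tiktoken ships the o200k_base encoding used by GPT-4o.
gpt_tokens = len(tiktoken.encoding_for_model("gpt-4o").encode(prompt))

# Claude side: Anthropic's tokenizer is not published, so we query the
# token-counting endpoint. Note it counts the whole request, so a few
# tokens of message framing are included in the result.
client = anthropic.Anthropic()
claude_tokens = client.messages.count_tokens(
    model="claude-3-5-sonnet-20241022",
    messages=[{"role": "user", "content": prompt}],
).input_tokens

overhead = (claude_tokens - gpt_tokens) / gpt_tokens * 100
print(f"GPT-4o: {gpt_tokens} tokens | Claude: {claude_tokens} tokens | overhead: ~{overhead:.0f}%")
```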
Domain-dependent tokenizer inefficiency
Different types of domain content are tokenized differently by Anthropic's tokenizer, leading to varying levels of token count inflation compared to OpenAI's models. The AI research community has noted similar tokenization differences here. We tested our findings on three popular domains: English articles, code (Python) and math.
| Domain | GPT-4o tokens | Claude 3.5 Sonnet tokens | % token overhead |
| --- | --- | --- | --- |
| English articles | 77 | 89 | ~16% |
| Code (Python) | 60 | 78 | ~30% |
| Math | 114 | 138 | ~21% |
Claude 3.5 Sonnet tokenizer: % token overhead relative to GPT-4o. Source: Lavanya Gupta
When comparing Claude 3.5 Sonnet to GPT-4o, the degree of tokenizer inefficiency varies significantly across domains. For English articles, Claude's tokenizer produces about 16% more tokens than GPT-4o's for the same input text. The overhead grows rapidly with more structured or technical content: for mathematical equations it stands at 21%, and for Python code Claude produces 30% more tokens.
This variation arises because some content types, such as technical documents and code, contain patterns and symbols that Anthropic's tokenizer fragments into smaller pieces, leading to a higher token count. More natural-language content, in contrast, shows a lower token overhead.
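To see how these per-domain overheads interact with the pricing gap described earlier, here is a back-of-the-envelope sketch. The per-million prices are the June 2024 list prices and should be treated as illustrative, the workload mix is a made-up example, and the overhead factors from the table above are assumed to apply to generated output text as well as input.

```python
# Back-of-the-envelope: how token overhead can offset a cheaper input price.
# Prices in USD per million tokens (June 2024 list prices; treat as illustrative).
GPT_IN, GPT_OUT = 5.00, 15.00
CLAUDE_IN, CLAUDE_OUT = 0.6 * GPT_IN, GPT_OUT   # 40% cheaper input, same output price

# Claude 3.5 Sonnet token overhead relative to GPT-4o, per the table above.
OVERHEAD = {"english": 1.16, "math": 1.21, "python": 1.30}

def claude_to_gpt_cost_ratio(domain: str, input_millions: float, output_millions: float) -> float:
    """Claude cost / GPT-4o cost for a workload measured in millions of GPT-4o tokens.

    Assumes the overhead factor applies to both the prompt and the generated answer,
    since each provider bills in its own tokens."""
    r = OVERHEAD[domain]
    gpt_cost = GPT_IN * input_millions + GPT_OUT * output_millions
    claude_cost = CLAUDE_IN * r * input_millions + CLAUDE_OUT * r * output_millions
    return claude_cost / gpt_cost

# Example: an output-heavy coding workload of 1M input and 2M output GPT-equivalent tokens.
ratio = claude_to_gpt_cost_ratio("python", 1.0, 2.0)
print(f"Claude costs ~{(ratio - 1) * 100:.0f}% more than GPT-4o on this workload")  # ~23%
```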
Other practical implications of tokenizer inefficiency
Beyond the direct implication for costs, there is also an indirect impact on context window utilization. While Anthropic models advertise a larger context window of 200K tokens, compared to OpenAI's 128K tokens, the effectively usable token space can be smaller for Anthropic models because of their tokenizer's verbosity. There can therefore be a potentially large gap between the “advertised” and the “effective” context window sizes.
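One rough way to quantify this gap is to divide the advertised window by the measured token overhead, expressing it in GPT-equivalent tokens. A minimal sketch using the overhead figures from the table above (the adjustment is approximate and ignores message framing):

```python
# Sketch: express Claude's advertised 200K window in GPT-equivalent tokens
# by dividing out the per-domain token overhead from the table above.
ADVERTISED_CLAUDE_WINDOW = 200_000

for domain, overhead in {"english": 1.16, "math": 1.21, "python": 1.30}.items():
    effective = ADVERTISED_CLAUDE_WINDOW / overhead
    print(f"{domain:>7}: ~{effective:,.0f} GPT-equivalent tokens of effective capacity")
```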
Tokenizer implementations
GPT models use Byte Pair Encoding (BPE), which iteratively merges frequently co-occurring character pairs to form tokens. In particular, the latest GPT models use the open-source o200k_base tokenizer. The actual encodings used by GPT-4o (in the tiktoken library) can be seen here:
```python
{
    # reasoning
    "o1-xxx": "o200k_base",
    "o3-xxx": "o200k_base",
    # chat
    "chatgpt-4o-": "o200k_base",
    "gpt-4o-xxx": "o200k_base",          # e.g., gpt-4o-2024-05-13
    "gpt-4-xxx": "cl100k_base",          # e.g., gpt-4-0314, etc., plus gpt-4-32k
    "gpt-3.5-turbo-xxx": "cl100k_base",  # e.g., gpt-3.5-turbo-0301, -0401, etc.
}
```
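This mapping can be resolved programmatically with tiktoken; a minimal example (the sample sentence is arbitrary):

```python
import tiktoken

# Resolve the encoding tiktoken associates with a model name.
enc = tiktoken.encoding_for_model("gpt-4o")
print(enc.name)  # "o200k_base"

# Encode a sample sentence and inspect the individual token pieces.
tokens = enc.encode("Tokenizers differ more than you might expect.")
print(len(tokens), [enc.decode([t]) for t in tokens])
```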
Unfortunately, not much can be said about Anthropic's tokenizer, because it is not available as directly and easily as GPT's. Anthropic released its token counting API in December 2024; however, it was soon deprecated in the 2025 releases.
Latenode reports that “Anthropic uses a unique tokenizer with only 65,000 token variations, compared to OpenAI's 100,261 token variations for GPT-4.” This Colab notebook includes Python code to analyze the token count differences between the GPT and Claude models. Another tool, which provides an interface to some common, publicly available tokenizers, validates our findings.
Being able to count tokens (without invoking the actual model API) and to estimate costs upfront for budgeting is important for AI enterprises.
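As a minimal sketch of such pre-flight budgeting (the helper name, token counts and prices below are placeholders, not official figures), a request cost can be estimated from token counts alone:

```python
# Sketch: turn token counts into a dollar estimate before sending any generation request.
def estimate_request_cost_usd(input_tokens: int, expected_output_tokens: int,
                              price_in_per_million: float, price_out_per_million: float) -> float:
    """Token counts can come from tiktoken (GPT models) or from Anthropic's
    token-counting endpoint (Claude models), with no generation call needed."""
    return (input_tokens * price_in_per_million
            + expected_output_tokens * price_out_per_million) / 1_000_000

print(estimate_request_cost_usd(1_200, 800, price_in_per_million=3.00, price_out_per_million=15.00))
# -> 0.0156 (about 1.6 cents for this hypothetical request)
```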
Key takeaways
- Anthropic's competitive pricing comes with hidden costs: While Anthropic's Claude 3.5 Sonnet offers a 40% lower input token cost than OpenAI's GPT-4o, this apparent cost advantage can be misleading because of differences in how input text is tokenized.
- Hidden “tokenizer inefficiency”: Anthropic models are inherently more verbose. For businesses that process large volumes of text, understanding this discrepancy is crucial when evaluating the true cost of deploying models.
- Domain-dependent tokenizer inefficiency: When choosing between OpenAI and Anthropic models, evaluate the nature of your input text. For natural-language tasks, the cost difference may be minimal, but technical or structured domains can lead to significantly higher costs with Anthropic models.
- Effective context window: Due to the verbosity of Anthropic's tokenizer, its larger advertised 200K context window may offer less effectively usable space than OpenAI's advertised 128K window, leading to a potential gap between advertised and effective context window sizes.
Anthropic did not respond to VentureBeat's requests for comment by press time. We will update the story if they respond.