
ZDNET Highlights
- DeepSeek released its V3.2 model on Monday.
- The aim is to keep AI competitive and accessible to developers.
- V3.2 heats up the race between open and proprietary models.
Chinese AI firm DeepSeek has made another splash with the release of V3.2, the latest iteration in its V3 model series.
Launched on Monday, the model, which is based on an experimental V3.2 version announced in October, comes in two variants: “Thinking” and a more powerful “Special.” DeepSeek said V3.2 extends the capabilities of open-source AI even further. Like other DeepSeek models, it costs a fraction of what proprietary models do, and its open weights are available on Hugging Face.
Also: I tested DeepSeek’s R1 and V3 coding skills – and we’re not all ruined (yet)
DeepSeek first made headlines in January with the release of R1, an open-source reasoning AI model that outperformed OpenAI’s o1 on several key benchmarks. Considering that the performance of V3.2 rivals even powerful proprietary models, can it shake up the AI industry once again?
What can V3.2 do?
Rumors first began circulating in September that DeepSeek was planning to launch its own, more cost-effective agent to compete with companies like OpenAI and Google. Now, it looks like the competitor has finally arrived.
V3.2 is the latest version of V3, a model DeepSeek released about a year ago that also helped inform R1. According to company data published Monday, V3.2 Special outperforms industry-leading proprietary models like OpenAI’s GPT-5 High, Anthropic’s Claude Sonnet 4.5, and Google’s Gemini 3.0 Pro on some agentic benchmarks (for what it’s worth, Kimi K2, a free and open-source model from Moonshot, also claims to rival GPT-5 and Sonnet 4.5 in performance).
In terms of cost, access to Gemini 3 via the API runs up to $4.00 per 1 million tokens, while V3.2 Special costs $0.028 per 1 million tokens. According to the company, the new model also achieved gold-level performance in the International Mathematical Olympiad (IMO) and the International Olympiad in Informatics (IOI).
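To put those prices in perspective, here is a quick back-of-the-envelope comparison using the per-million-token figures quoted above. The 50-million-token workload is an illustrative assumption, not a figure from either company:

```python
# Per-million-token API prices quoted above.
GEMINI_PER_M = 4.00    # Gemini 3, upper bound
V32_PER_M = 0.028      # DeepSeek V3.2 Special

tokens_millions = 50   # hypothetical monthly agent workload (assumption)

gemini_cost = tokens_millions * GEMINI_PER_M
v32_cost = tokens_millions * V32_PER_M

print(f"Gemini 3: ${gemini_cost:.2f}, V3.2 Special: ${v32_cost:.2f}")
print(f"V3.2 Special is roughly {GEMINI_PER_M / V32_PER_M:.0f}x cheaper")
# → Gemini 3: $200.00, V3.2 Special: $1.40
# → V3.2 Special is roughly 143x cheaper
```

At those list prices, the same workload costs two orders of magnitude less on V3.2 Special.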
“DeepSeek-V3.2 emerges as a highly cost-efficient alternative in agent scenarios, narrowing the performance gap between open and frontier proprietary models with significantly lower costs,” the company wrote in its research paper. While these claims are still under debate, the sentiment continues DeepSeek’s pattern of reducing costs with each model release, which arguably threatens to undermine the immense investment that proprietary labs like OpenAI make in their models.
Problems
DeepSeek said it created V3.2 in an effort to help the open-source AI community catch up to some of the recent technical achievements of companies building closed-source models. According to the company’s paper, the agentic and reasoning capabilities demonstrated by major proprietary models have “accelerated at a significantly faster rate” than those of their open-source counterparts.
Also: Mistral’s latest open-source release bets on smaller models more than larger ones – here’s why
As engineer Charles Kettering once said, “A problem well stated is a problem half solved.” In that spirit, DeepSeek began development of its new model by attempting to diagnose the reasons behind the lagging performance of open-source models, ultimately breaking it down into three factors.
First, open-source models rely on what AI researchers call “vanilla attention” – a slow, compute-hungry mechanism for reading inputs and generating outputs that makes them struggle with long sequences of tokens. Second, they have a more computationally limited post-training phase, hindering their ability to complete more complex tasks. Third, unlike proprietary models, they have difficulty following longer instructions and generalizing across tasks, making them inefficient agents.
Solution
In response, the company introduced DeepSeek Sparse Attention (DSA), a mechanism that reduces “significant computation complexity without sacrificing long-context performance,” according to the research paper.
Also: What is sparsity? DeepSeek AI’s secret revealed by Apple researchers
With traditional vanilla attention, a model essentially generates its outputs by comparing each token in a query against every other token in its context – a laborious, power-hungry process. By analogy, imagine having to dig through a huge pile of books scattered across a lawn to find a particular sentence. It can be done, but it takes a long time and requires careful examination of an enormous number of pages.
The DSA approach strives to work smarter, not harder. It operates in two phases: first, a “Lightning Indexer” performs a cheap, high-level scan of the tokens in the context to identify the small subset most likely to be relevant to a given query; the model then drills down into that subset with its full computational power to find what it’s looking for. Instead of starting with a huge pile of books, you now walk into a well-organized library, head to the relevant section, and conduct a much shorter, less stressful search for the paragraph you want.
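The two-phase idea can be sketched in a few lines of NumPy. This is a minimal illustration of the general top-k sparse-attention pattern, not DeepSeek’s actual DSA implementation: the indexer here is just a raw dot-product scan, and the function names and the top-k size are assumptions for the example:

```python
import numpy as np

def sparse_attention(query, keys, values, k=32):
    """Attend only to the k keys a cheap index pass scores highest."""
    # Phase 1 ("Lightning Indexer" stand-in): a cheap relevance scan
    # over all keys -- one dot product each, no softmax.
    index_scores = keys @ query
    top_k = np.argsort(index_scores)[-k:]        # keep the k best candidates

    # Phase 2: full scaled softmax attention, but only over that subset.
    scores = keys[top_k] @ query / np.sqrt(query.size)
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()
    return weights @ values[top_k]

rng = np.random.default_rng(0)
seq_len, dim = 1024, 64
keys = rng.normal(size=(seq_len, dim))
values = rng.normal(size=(seq_len, dim))
query = rng.normal(size=dim)

out = sparse_attention(query, keys, values, k=32)
print(out.shape)  # (64,) -- the expensive pass scored 32 keys, not all 1024
```

The savings come from phase 2: the costly softmax attention touches only the indexer’s shortlist, so its work scales with k rather than with the full sequence length.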
The company then aimed to address the second issue – limited post-training – by building “expert” models to test and refine V3.2’s capabilities in writing, general question-answering, mathematics, programming, logical reasoning, agentic tasks, agentic coding, and agentic search. These experts act like tutors tasked with transforming the model from a generalist into a multi-domain expert.
Limitations
According to the research paper, DeepSeek V3.2 “effectively bridges the gap between computational efficiency and advanced reasoning capabilities” in open-source AI and “opens up new possibilities for robust and generalizable AI agents.”
Also: Stop saying AI hallucinates – it doesn’t. And mischaracterization is dangerous
However, there are some caveats. For one thing, the new model’s “world knowledge” – the breadth of practical understanding about the real world that can be gleaned from its training data – is much more limited than that of leading proprietary models. It also requires more tokens to generate outputs that match the quality of frontier proprietary models, and it struggles with more complex tasks. DeepSeek says it plans to keep closing the gap between its open-source model and its proprietary counterparts by increasing computation during pretraining and refining its “post-training recipe.”
Despite these limitations, however, the fact that a company – and one based in China, at that – has created an open-source model that can compete with the reasoning capabilities of some of the most advanced proprietary models on the market is a big deal. It adds to the growing evidence that the “performance gap” between open-source and closed-source models is not a fixed fact, but a technical discrepancy that can be bridged through creative approaches to pretraining, attention, and post-training.
Even more importantly, its open weights are nearly free for developers to access and build on, which could undermine the basic sales pitch the industry’s leading closed-source developers have deployed so far: that it’s worth paying to access these tools because they’re the best on the market. If open-source models eclipse proprietary ones, it won’t make sense for most people to keep paying for the latter.

