
ZDNET's key takeaways
- The botched rollout of GPT-5 does not suggest superintelligence.
- GPT-5 represents incremental technical progress.
- Scholars are debunking AI hype with detailed analysis.
About a year ago, OpenAI CEO Sam Altman declared artificial "superintelligence" to be "just around the corner."
Also: Sam Altman says the Singularity is imminent - here's why
Then, last June, he trumpeted the arrival of superintelligence, writing in a blog post: "We have recently built systems that are smarter than people in many ways." But this rhetoric clashes with what is rapidly shaping up to be a botched debut of the much-awaited GPT-5 model from Altman's AI company, OpenAI.
(Disclosure: Ziff Davis, ZDNET's parent company, filed an April 2025 lawsuit against OpenAI, alleging it infringed Ziff Davis copyrights in training and operating its AI systems.)
An underwhelming rollout
Since its release, the new AI model has received negative feedback and a fair amount of negative press, a notable contrast with the wide acclaim that greeted the company's first open-source model in six years.
"OpenAI's GPT-5 model was meant to be a world-changing upgrade to its wildly popular and precocious chatbot," writes Wired's Will Knight. "But for some users, last Thursday's release felt more like a wrenching downgrade, with the new ChatGPT presenting a diluted personality and making surprisingly dumb mistakes."
Also: OpenAI's GPT-5 is now free for all: How to access and everything else we know
There were simple technical snafus, such as a broken mechanism for switching between GPT-5 and GPT-4o, and users complained of "sluggish responses, hallucinations, and surprising errors."
As Knight points out, hype has been building for GPT-5 since its predecessor, GPT-4, debuted in March 2023. That year, Altman emphasized the scale of the technical challenge, lending GPT-5 a kind of moonshot aura.
At a press conference that year, following the company's first developer conference in San Francisco, Altman said, "Before we can make a model that we will call GPT-5, or whatever it is, there is still a lot [to do]."
Progress, but no moonshot
What has been delivered appears to be an improvement, but nothing like a moonshot.
Also: OpenAI CEO sees uphill struggle to GPT-5, potential for new kind of consumer hardware
On one of the most respected benchmark tests of artificial intelligence, the "Abstraction and Reasoning Corpus for Artificial General Intelligence," known as ARC-AGI-2, GPT-5 scored better than some predecessors but below Grok 4, developed by xAI, according to a post on X by ARC-AGI's creator, François Chollet:
Grok 4 is still state-of-the-art on ARC-AGI-2 among frontier models.
15.9% for Grok 4 vs 9.9% for GPT-5. pic.twitter.com/wsezrszsjw
– François Chollet (@fchollet) August 7, 2025
On the older version of the test, ARC-AGI-1, GPT-5 scored 65.7% correct, Chollet wrote, which is below the 76% scored in December by an older OpenAI model, o3.
GPT-5 on ARC-AGI Semi Private Eval
GPT-5
* ARC-AGI-1: 65.7%, $0.51/task
* ARC-AGI-2: 9.9%, $0.73/task
GPT-5 Mini
* ARC-AGI-1: 54.3%, $0.12/task
* ARC-AGI-2: 4.4%, $0.20/task
GPT-5 Nano
* ARC-AGI-1: 16.5%, $0.03/task
* ARC-AGI-2: 2.5%, $0.03/task pic.twitter.com/knl7tofyef
– ARC Prize (@arcprize) August 7, 2025
In coding, each new AI model usually shows some progress.
However, ZDNET's David Gewirtz found in his testing that GPT-5 actually took a step back. David conceded that GPT-5 provided "a jump" in the analysis of code repositories, but said it was not a "game changer."
What's going on here? The hype from Altman and others about superintelligence has given way to merely incremental progress.
"Overdue, overhyped and underwhelming," tireless generative-AI critic Gary Marcus wrote on his Substack. "But this time, the reaction was different. Because expectations were through the roof, a huge number of people viewed GPT-5 as a major letdown."
AI scholars are pushing back on the hype
For all the negative press, it is unlikely that Altman and others will abandon their rhetoric about superintelligence. However, the lack of a true "cognitive" breakthrough in GPT-5, after so much anticipation, may prompt closer scrutiny of terms that are often thrown around, such as "thinking" and "reasoning."
The press release for GPT-5 from OpenAI emphasizes how the model excels at what is called reasoning, where AI models generate verbose output about the process of arriving at an answer to a prompt.
"When using reasoning, GPT-5 is better than experts in almost half of cases," the company states.
Also: OpenAI returns to its open-source roots with new open-weight AI models, and it's a big deal
Research teams in the industry have recently pushed back on claims of reasoning.
In a widely cited research paper from Apple last month, researchers at the company concluded that so-called large reasoning models, or LRMs, do not consistently "reason" in any sense that one would expect from the colloquial term. Instead, the programs become erratic in how they approach increasingly complex problems.
"LRMs have limitations in exact computation: they fail to use explicit algorithms and reason inconsistently across scales and problems," wrote lead author Parshin Shojaee and team.
As a result, "frontier LRMs face a complete accuracy collapse beyond certain complexities."
Similarly, researcher Chengshuai Zhao and team at Arizona State University write in a report last week that the strings of verbose output produced by LRMs, known as "chain-of-thought," often lead "to the perception that they engage in deliberate inferential processes." But, they conclude, the reality is actually "more superficial than it appears."
Also: This free GPT-5 feature is flying under the radar - but it's a game changer for me
Such apparent reasoning is "a brittle mirage that vanishes when it is pushed beyond training distributions," Zhao and team conclude after studying the models' results and their training data.
Such technical assessments are challenging the hyperbole from Altman and others who exploit notions of intelligence with casual, unsupported claims.
The average person would also do well to push back on the hyperbole and to pay close attention to the cavalier way in which words such as superintelligence are thrown around. Whenever GPT-6 arrives, that could make for more reasonable expectations.

