I tested GPT-5's coding skills, and it was so bad that I'm sticking with GPT-4o (for now)

ZDNET's key takeaways

OpenAI's new GPT-5 flagship failed half of my programming tests.
Previous OpenAI releases produced the correct results.
Now that OpenAI has enabled fallbacks to other LLMs, there are options.

So GPT-5 happened. It's out. It's been released. It's the talk of the virtual town. And it has some problems. I'm not going to bury the lede. GPT-5 failed half of my programming tests. That's the worst that OpenAI's flagship LLM has ever done on my carefully designed tests.

Also: The best AI for coding in 2025 (and what not to use)

Before I get into the details, let's take a moment to discuss one other small feature that looked like a minor win. Check out the new Edit button at the top of the code dumps it produces.

Clicking the Edit button takes you into a nice little code editor. Here, I changed the author field, right in the chat results.

It looked good, but it ultimately proved fruitless. When I closed the editor, it asked me if I wanted to save. I did. Then this unhelpful message showed up.

Screenshot by David Gewirtz/ZDNET

I never got back to my original session. I had to submit my original prompt again and let GPT-5 do my work a second time.

But wait. There's more. Let's dig into my test results...

1. Writing a WordPress plugin

This was my very first test of coding skills for any AI. It's what first gave me that "this changes the world" feeling, and it was done using GPT-3.5.

Later tests, using the same prompt but with different AI models, produced mixed results. Some AIs did very well, some did not. Some AIs, like those from Microsoft and Google, improved over time.

Also: How I test an AI chatbot's coding ability - and you can, too

ChatGPT's models have been the gold standard for this test from the beginning. That makes the GPT-5 results all the more curious.

So, look, the actual coding with GPT-5 was partially successful. GPT-5 generated a single block of code, which I was able to paste into a file and run. It presented the expected UI.

When I pasted in the test names, it dynamically updated the line count, although it described it as "line to randomize" instead of "lines to randomize".

Screenshot by David Gewirtz/ZDNET

But then, when I clicked Randomize, it didn't. Instead, it redirected me to tools.php. What?? ChatGPT has never had any problem with this test, whether GPT-3.5, GPT-4, or GPT-4o. You mean to tell me that OpenAI's much-awaited GPT-5 is failing right out of the gate? Ouch.

I then gave GPT-5 this prompt.

When I click Randomize, I am taken to tools.php and do not get a list of randomized results. Can you fix?

The result was a single line to patch. I'm not thrilled with that approach, because it requires the user to dig through the code and replace one line without making a mistake.

So, I asked GPT-5 for a full plugin. It gave me a complete rendition of the plugin to copy and paste. This time, it worked.

Screenshot by David Gewirtz/ZDNET

This time, it randomized the lines. When it encountered duplicates, it separated them from each other, as instructed. Finally.
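For readers who want to see what that requirement looks like in code, here is a minimal Python sketch of shuffling a list of lines while keeping identical entries apart. It is not the plugin GPT-5 produced (a WordPress plugin is written in PHP), and the function name `randomize_lines` and the greedy strategy are assumptions made purely for illustration.

```python
import random
from collections import Counter


def randomize_lines(lines, rng=None):
    """Return the lines in random order, keeping identical lines apart when possible."""
    rng = rng or random.Random()
    remaining = Counter(lines)  # copies of each line still waiting to be placed
    result = []
    previous = None
    while remaining:
        # Avoid repeating the line we just placed, unless nothing else is left.
        candidates = [line for line in remaining if line != previous]
        if not candidates:
            candidates = list(remaining)
        # Place the most frequent candidate first so duplicates do not pile up
        # at the end of the list; ties are broken at random.
        most_copies = max(remaining[line] for line in candidates)
        pool = [line for line in candidates if remaining[line] == most_copies]
        choice = rng.choice(pool)
        result.append(choice)
        remaining[choice] -= 1
        if remaining[choice] == 0:
            del remaining[choice]
        previous = choice
    return result


if __name__ == "__main__":
    names = ["Ann", "Bob", "Ann", "Cy", "Bob", "Ann"]
    print("\n".join(randomize_lines(names)))
```

When every line is unique, this behaves like an ordinary shuffle. When one entry makes up more than half the list, adjacent duplicates cannot be avoided, and the sketch simply places them next to each other.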

Also: I found 5 AI content detectors that can correctly identify AI text 100% of the time

Sorry, OpenAI. I have to fail you on this test. If the only problem had been that you didn't use the plural of "line" when appropriate, you would have passed. But the fact is that it gave me back a non-functional plugin on the first attempt, even though the AI eventually got it working on the second attempt.

No matter how you spin it, it's a step backward.

2. Rewriting a string function

This second test asks the AI to rewrite a string function so that it does a better job of checking for dollars and cents. The original code that GPT-5 was asked to rewrite did not allow for cents (it only checked for integers).

Screenshot by David Gewirtz/ZDNET

GPT-5 did fine on this test. It returned a minimal result, in that it did not do any error checking. It did not check for non-string input, extra whitespace, thousands separators, or currency symbols.

But that's not what I asked for. I asked it to rewrite a function, which itself had no error checking. GPT-5 did what I asked without any embellishment. I'm happy with that, because it has no way of knowing whether code that runs before this routine has already done that work.
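To make the scope of that rewrite concrete, here is a minimal Python sketch of a dollars-and-cents check with the same deliberately narrow behavior. It is not the code from the test, which is not reproduced in this article; the function name `is_valid_amount` and the exact accepted format (digits, optionally followed by a decimal point and one or two digits) are assumptions for illustration.

```python
import re

# Assumed format: one or more digits, optionally followed by a decimal point
# and one or two digits for the cents.
_AMOUNT = re.compile(r"[0-9]+(?:\.[0-9]{1,2})?")


def is_valid_amount(text):
    """Return True if text is a plain amount such as "12" or "12.50"."""
    # fullmatch() requires the whole string to fit the pattern, so leading or
    # trailing whitespace, currency symbols, and thousands separators all fail.
    return _AMOUNT.fullmatch(text) is not None


if __name__ == "__main__":
    for sample in ["12", "12.5", "12.50", "12.345", "$12.50", "1,000.00"]:
        print(sample, is_valid_amount(sample))
```

Like the minimal result described above, this sketch does no input hardening. Passing a non-string raises a `TypeError`, and anything with extra whitespace, a currency symbol, or a thousands separator is rejected instead of being cleaned up.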

GPT-5 passed this test.

3. Finding an annoying bug

This test came about because I was struggling with a less-than-obvious bug in my code. Without going into the weeds about how the WordPress framework works, the obvious answer is not the correct answer.

You need some fairly arcane knowledge about how WordPress filters pass their information. This test has been a stumbling block for more than a few AI LLMs.

Also: According to Gartner’s 2025 Hype Cycle Report, Gen AI disillusionment

GPT-5, however, like GPT-4 and GPT-4o before it, understood the problem and articulated a clear solution.

GPT-5 passed this test.

4. Writing a script

This test asks the AI to incorporate a fairly obscure Mac scripting tool called Keyboard Maestro, as well as Apple's scripting language AppleScript, and Chrome scripting behavior.

It's really a test of the AI's reach in terms of knowledge, its understanding of how web pages are constructed, and its ability to write code across three interlinked environments.

Quite a few AIs have failed this test, but the failure point is usually a lack of knowledge about Keyboard Maestro. GPT-3.5 did not know about Keyboard Maestro. But ChatGPT has been passing this test since GPT-4. Until now.

Where should we start? Well, the good news is that GPT-5 handled the Keyboard Maestro part of the problem. But it got the coding so wrong that it even doubled down on a misunderstanding of how case works in AppleScript.

Screenshot by David Gewirtz/ZDNET

It actually invented a property. This is one of those cases where an AI confidently presents an answer that is completely wrong.

Also: ChatGPT now comes with personality presets - and other upgrades that you may remember

AppleScript is natively case-insensitive. If you want AppleScript to pay attention to case, you need to use a "considering case" block. So, this happened.

Screenshot by David Gewirtz/ZDNET

The reason the error message referred to the title of one of my articles is that it was the front window in Chrome. This function examines the front window and does stuff based on the title.

Screenshot by David Gewirtz/ZDNET

But misunderstanding how case works was not GPT-5's only AppleScript error. It also referenced a variable called searchTerm without defining it. That is pretty much an error-creating practice in any programming language.

Fail, fail, fail, McFaildypants.

The internet has spoken

OpenAI seemed to suffer from the same overconfidence its AIs do. It confidently shifted everyone to GPT-5 and burned the bridges back to GPT-4o. I'm paying $200 per month for a ChatGPT Pro account. On Friday, I could not move back to GPT-4o for coding work. Nor could anyone else.

However, there was a little bit of user pushback on the whole burning-the-bridges thing. And by little, I mean the entire internet. So, by Saturday, ChatGPT had a new option.

Screenshot by David Gewirtz/ZDNET

To get to it, go to your ChatGPT settings and turn on the "legacy models" option. Then, as it has always been, just drop down the model menu and choose the one you want. Note: This option is available only to those on paid tiers. If you're using ChatGPT for free, you'll take what you're given, and you'll like it.

Ever since the whole generative AI thing kicked off in early 2023, ChatGPT's programming tools have been the gold standard, at least according to my LLM tests.

Also: Microsoft rolls out GPT-5 across its Copilot suite - here's where you'll find it

Now? I'm really not sure. This is only a day or so after GPT-5 was released, so its results will probably get better over time. But for now, I'm sticking with GPT-4o for coding, although I do like the deep reasoning abilities in GPT-5.

What about you? Have you tried GPT-5 for programming work yet? Did it perform better or worse than previous versions like GPT-4o or GPT-3.5? Were you able to get working code on the first attempt, or did you have to guide it through fixes? Are you going to use GPT-5 for coding or stick with older models? Let us know in the comments below.

You can follow my day-to-day project updates on social media. Be sure to subscribe to my weekly update newsletter, and follow me on Twitter/X at @Davidgewirtz, on Facebook at Facebook.com/davidgewirtz, on Instagram at Instagram.com/davidgewirtz, on Bluesky at @Davidgewirtz.com, and on YouTube at Youtube.com/davidgewirtztv.
