I retested Microsoft Copilot's AI coding skills in 2025 and now it's got serious game

There has been a ton of debate about how AIs can help with programming, but in the first year or two of generative AI, much of it was hype. Microsoft held huge events celebrating how Copilot could help you code, but when I put it to the test in April 2024, it failed all four of my standardized tests. It completely struck out. Crashed and burned. Fell off the cliff. It performed the worst of any AI I tested.

Mixed metaphors aside, let’s stick with baseball. Copilot traded its cleats for a bus pass. It was not worthy.

But time spent in the bullpen of life has helped Copilot. This time, when it showed up for tryouts, it was warmed up and ready to step into the box. It was throwing heat in the bullpen. When it was time to play, it had its eye on the ball and its swing dialed in. Clearly, it was game-ready and looking for a pitch to drive.

But could it stand up to my tests? With a squint in my eye, I stepped onto the pitcher’s mound and started with an easy lob. Back in 2024, you could feel the wind as Copilot swung and missed. But now, in April 2025, Copilot connected squarely with the ball and hit it straight and true.

We had to send Copilot down, but it fought its way back to the show. Here’s the play-by-play.

1. Writing a WordPress plugin

Well, Copilot has certainly improved since its first run in April 2024. The first time, it did not provide code to actually display the randomized lines. It stored them in a value, but it did not retrieve and display them. In other words, it swung and missed. It produced no output.

This is the result of the latest run:

Screenshot by David Gewirtz/ZDNET

This time, the code worked. It did leave a random extra blank line at the end, but since it completed the programming assignment, we’ll call it good.
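To picture what the assignment boils down to, here is a minimal Python sketch. It is not the PHP plugin Copilot generated, and the function name `randomize_lines` and its optional `seed` parameter are hypothetical, chosen only for illustration. It shuffles a block of lines and, unlike the 2024 attempt, actually returns the result for display. Dropping blank lines is an assumption made here so that no stray empty line shows up in the output.

```python
import random


def randomize_lines(text, seed=None):
    """Return the non-empty lines of `text` in random order, one per line."""
    # Assumption: blank lines are dropped so no empty entries reach the output.
    lines = [line for line in text.splitlines() if line.strip()]
    # A private Random instance makes the shuffle repeatable when a seed is given.
    rng = random.Random(seed)
    rng.shuffle(lines)
    # Join without a trailing newline, so there is no extra blank line at the end.
    return "\n".join(lines)


if __name__ == "__main__":
    sample = "apple\nbanana\n\ncherry\n"
    # The step the 2024 run skipped: hand the shuffled lines back for display.
    print(randomize_lines(sample))
```

The point of the sketch is the last step: storing the shuffled lines is not enough, the routine also has to hand them back so they can be shown.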

Copilot’s unbroken streak of completely unmitigated programming failures has been broken. Let’s see how it does in the other tests.

2. Rewriting a string function

This test is designed to test dollars and cents conversions. In my first test in April 2024, the Copilot-generated code properly flagged an error if a value containing a letter or more than one decimal point was sent to it, but it did not perform a complete validation. It allowed results through that could cause subsequent routines to fail.

This run, however, did quite well. It performs most of the tests properly. It returns false for numbers with more than two digits to the right of the decimal point, like 1.234 and 1.230. It also returns false for numbers with extra leading zeros. So 0.01 is allowed, but 00.01 is not.

Technically, those values could be converted into usable currency values, but it is never bad for a validation routine to be strict in its tests. The main goal is that the validation does not let a value through that could cause a subsequent routine to crash. Copilot did well here.
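As a rough illustration of that kind of strictness, here is a short Python sketch. It is not the code Copilot produced. The name `is_valid_amount` and the exact pattern are assumptions made for this example. It rejects letters, multiple decimal points, more than two decimal places, and extra leading zeros. Whether whole numbers like 10 or single-decimal values like 5.5 should pass is a choice this sketch makes on its own; the test described above does not settle it.

```python
import re

# Hypothetical pattern: a lone 0 or a number without leading zeros,
# optionally followed by a decimal point and one or two digits.
_AMOUNT_RE = re.compile(r"(0|[1-9][0-9]*)(\.[0-9]{1,2})?")


def is_valid_amount(value):
    """Return True only for strictly formatted dollars-and-cents strings."""
    # Anything that is not a string is rejected outright.
    if not isinstance(value, str):
        return False
    # fullmatch requires the whole string to fit the pattern, so stray
    # letters, a second decimal point, or a third decimal digit all fail.
    return _AMOUNT_RE.fullmatch(value) is not None


if __name__ == "__main__":
    for candidate in ["0.01", "00.01", "1.234", "1.230", "1.2.3", "12a"]:
        print(candidate, is_valid_amount(candidate))
```

Rejecting input that could arguably be cleaned up, such as 1.230, is deliberate: a validator that says no too often is an annoyance, while one that lets a bad value through can crash whatever runs next.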

Now we’re at two for two, a major improvement over the results of its first run.

3. Finding an annoying bug

Let me tell you how Copilot first responded in April 2024, because it’s just too good.

This test checks the AI’s ability to think a few chess moves ahead. The answer that seems obvious is not the correct answer. I got caught by that myself when I was originally debugging the issue that eventually became this test.

On Copilot’s first run, it suggested that I check the spelling of my function name and of the WordPress hook name. The WordPress hook is a published thing, so Copilot should have been able to confirm the spelling. And my function is my function, so I can spell it however I want. If I had misspelled it somewhere in the code, the IDE would have pointed it out very clearly.

And it got better. Back then, Copilot also quite happily repeated the problem statement back to me, suggesting that I solve the problem myself. Yes, its entire recommendation was that I debug it. Well, duh. Then it ended with “consider seeking support from the plugin developer or community forums,” followed by an emoji. And yes, that emoji was part of the AI’s response.

It was a spectacular, enthusiastic, emoji-laden failure. See what I mean? Early AI answers, no matter how useless, should be immortalized.

Especially since Copilot was not nearly as much fun this time. It just solved it. Quickly, cleanly, clearly. Done and done. Solved.

Screenshot by David Gewirtz/ZDNET

That puts Copilot at three for three and decisively takes it out of the “Do not use this tool” category. The bases are loaded. Let’s see if Copilot can score a home run.

4. Writing a script

The idea with this test is that it asks about a fairly obscure Mac scripting tool called Keyboard Maestro, as well as Apple’s scripting language AppleScript, and Chrome scripting behavior. For the record, Keyboard Maestro is one of the biggest reasons I use Macs over Windows for my daily productivity, because it allows the entire OS and the various applications to be reshaped to fit my needs. It’s that powerful.

In any case, to pass the test, the AI has to properly describe how to solve the problem using a mix of Keyboard Maestro code, AppleScript code, and Chrome API functionality.

Back in the day, Copilot did not get it right. It ignored Keyboard Maestro completely (at the time, it probably was not in its knowledge base). In the generated AppleScript, where I asked it to scan only the current window, Copilot repeated the process for all windows, returning results for the wrong window (the last one in the series).

But not now. This time, Copilot got it right. It did what was asked, found the right window and tab, properly talked to Keyboard Maestro and Chrome, and used actual AppleScript syntax for the AppleScript.

Bases loaded. Home run.

Overall results

Last year, I said I was not impressed. In fact, I found the results a little demoralizing. But I also said:

Ah well, Microsoft does improve its products over time. Maybe by next year.

In the past year, Copilot went from strikeout king to scoreboard shaker. It went from batting cleanup in the basement to chasing a pennant under the lights.

What about you? Have you taken Copilot or another AI coding assistant out to the field lately? Do you think it is finally ready for the big leagues, or is it still riding the bench? Have you had any strikeouts or home runs using AI for development? And what would one of these tools have to do to earn a spot in your starting lineup? Let us know in the comments below.

You can follow my day-to-day project updates on social media. Be sure to subscribe to my weekly update newsletter, and follow me on Twitter/X at @Davidgewirtz, on Facebook at Facebook.com/davidgewirtz, on Instagram at Instagram.com/davidgewirtz, and on YouTube at Youtube.com/davidgewirtztv.
