AI companies are battling to dominate the industry, but sometimes they are also battling in Pokémon gyms.
Google and Anthropic are both studying how their latest AI models navigate early Pokémon games, and the results can be as entertaining as they are enlightening. This time, Google DeepMind has written in a report that Gemini 2.5 Pro resorts to panic when its Pokémon are close to death. According to the report, this can cause a qualitatively observable decline in the model’s reasoning capability.
AI benchmarking, the process of comparing the performance of different AI models, is a dubious art that often provides little context for the real capabilities of a given model. But some researchers think that studying how AI models play video games could be useful (or, at the very least, kind of funny).
Over the last several months, two developers unaffiliated with Google and Anthropic have set up their respective Twitch streams, called “Gemini Plays Pokémon” and “Claude Plays Pokémon,” where anyone can watch in real time as an AI tries to navigate a children’s video game from over 25 years ago.
Each stream displays the AI’s “reasoning” process (a natural language translation of how the AI evaluates a problem and arrives at a response), giving us insight into how these models work.

While the progress of these AI models is impressive, they are still not very good at playing Pokémon. Gemini takes hundreds of hours to reason through a game that a child could complete in a fraction of the time.
What is interesting about watching an AI navigate a Pokémon game is not so much how long it takes to finish, but how it behaves along the way.
Over the course of its playthrough, the report states, Gemini 2.5 Pro gets into various situations that cause the model to simulate “panic.”
This state of “panic” can cause the model’s performance to deteriorate, as the AI may suddenly stop using some of the tools at its disposal for a stretch of gameplay. While AI does not think or experience emotions, its actions mimic the way a human might make hasty, poor decisions under stress: a fascinating, yet unsettling response.
According to the report, this behavior has occurred in enough separate instances that members of the Twitch chat have actively noticed when it is happening.
Claude has also exhibited some curious behaviors on its journey through Kanto. In one instance, the AI picked up on the pattern that when all of its Pokémon run out of health, the player character will “white out” and return to a Pokémon Center.
When Claude got stuck in the Mt. Moon cave, it wrongly hypothesized that if it deliberately got all of its Pokémon to faint, it would be transported across the cave to the Pokémon Center in the next town.
However, that is not how the game works. When all of your Pokémon die, you return to the Pokémon Center you used most recently, not the one that is geographically closest. Viewers watched in horror as the AI essentially tried to kill itself in the game.
Despite its shortcomings, there are some ways in which the AI can outperform human players. As of the release of Gemini 2.5 Pro, the AI is able to solve puzzles with impressive accuracy.
With some human assistance, the AI created agentic tools (prompted instances of Gemini 2.5 Pro geared toward specific tasks) to solve the game’s boulder puzzles and find efficient routes to reach a destination.
According to the report, given only a prompt describing boulder physics and a description of how to verify a valid path, Gemini 2.5 Pro is able to solve some of these complex boulder puzzles in one shot. These puzzles must be solved to progress through Victory Road.
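For readers curious what “finding an efficient route” and “verifying a valid path” can look like in practice, here is a minimal sketch. It is not the tooling from the report, which relies on prompted instances of Gemini rather than hand-written code. The grid, the `shortest_path` and `verify_path` functions, and the move letters are all hypothetical, and boulder physics is left out entirely. The sketch uses a plain breadth-first search on a small tile map.

```python
from collections import deque

# Hypothetical tile map (not taken from the game): '#' is a wall, anything else is walkable.
GRID = [
    "S.#.",
    ".##.",
    "...G",
]

# Move letters and their (row, column) offsets; an assumption made for this sketch.
MOVES = {"U": (-1, 0), "D": (1, 0), "L": (0, -1), "R": (0, 1)}


def is_walkable(grid, r, c):
    """Return True if (r, c) is inside the map and not a wall."""
    return 0 <= r < len(grid) and 0 <= c < len(grid[0]) and grid[r][c] != "#"


def shortest_path(grid, start, goal):
    """Breadth-first search; returns a shortest move string, or None if unreachable."""
    queue = deque([(start, "")])
    seen = {start}
    while queue:
        (r, c), path = queue.popleft()
        if (r, c) == goal:
            return path
        for name, (dr, dc) in MOVES.items():
            nxt = (r + dr, c + dc)
            if is_walkable(grid, *nxt) and nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, path + name))
    return None


def verify_path(grid, start, goal, path):
    """Replay a move string and check that it stays on walkable tiles and ends at the goal."""
    r, c = start
    for step in path:
        if step not in MOVES:
            return False
        dr, dc = MOVES[step]
        r, c = r + dr, c + dc
        if not is_walkable(grid, r, c):
            return False
    return (r, c) == goal


if __name__ == "__main__":
    route = shortest_path(GRID, (0, 0), (2, 3))
    print(route, verify_path(GRID, (0, 0), (2, 3), route))
```

Keeping the verifier separate from the search means a route proposed by any source, including a language model, can be checked independently before it is acted on.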
Since Gemini 2.5 Pro did much of the work of creating these tools on its own, Google suggests that the current model may be capable of creating them without human intervention. Who knows, perhaps Gemini will prompt itself into creating a “don’t panic” module.