Anthropic's AI Claude plays Pokémon on Twitch and surprises with its reasoning ability

Last update: 28/02/2025

  • Anthropic's Claude 3.7 Sonnet has been tested playing Pokémon Red on Twitch.
  • The AI model has demonstrated significant progress in reasoning and decision making.
  • It managed to defeat the first three Gym Leaders in the game, something that previous versions failed to achieve.
  • Anthropic highlights the use of video games as an evaluation method for artificial intelligence.

Anthropic has surprised the world of artificial intelligence by demonstrating how far its new Claude 3.7 Sonnet model can go in complex tasks. This time, as part of an innovative capability test, the AI system was set to play Pokémon Red on Twitch, where viewers could follow its progress live.

The experiment seeks to show how artificial intelligence can make strategic decisions and learn to navigate a dynamic environment without human intervention. This marks a milestone compared to previous versions of the model, which had failed to overcome early barriers within the game.

Claude 3.7 Sonnet demonstrates advances in reasoning

To evaluate improvements to the AI model, Anthropic provided it with certain key tools: screen pixel input, basic memory and button control. Thanks to these elements, Claude was able to interpret what was happening in the game and make decisions based on its internal logic.
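
The article does not describe how Anthropic wired these three tools together, so the following is only a minimal sketch of the general idea: an agent reads a frame, keeps a short memory and presses a button. Everything in it is hypothetical, including the `ToyGame` stand-in for the emulator, the `Agent` class, the button names and the hard-coded policy that takes the place of the model.

```python
from collections import deque
from dataclasses import dataclass, field

# Hypothetical button set; the real control scheme is not given in the article.
BUTTONS = ("up", "down", "left", "right", "a", "b")


@dataclass
class ToyGame:
    """Stand-in for an emulator: the player walks a corridor to reach the exit."""
    width: int = 5
    position: int = 0

    def screen(self) -> list[str]:
        # "Pixels" are just characters here: P = player, E = exit, . = floor.
        row = ["."] * self.width
        row[-1] = "E"
        row[self.position] = "P"
        return row

    def press(self, button: str) -> None:
        if button not in BUTTONS:
            raise ValueError(f"unknown button: {button}")
        if button == "right":
            self.position = min(self.position + 1, self.width - 1)
        elif button == "left":
            self.position = max(self.position - 1, 0)

    def done(self) -> bool:
        return self.position == self.width - 1


@dataclass
class Agent:
    """Keeps a small, bounded memory of its most recent steps."""
    memory: deque = field(default_factory=lambda: deque(maxlen=3))

    def choose(self, frame: list[str]) -> str:
        # Placeholder policy: a real system would ask a model at this point.
        player = frame.index("P")
        button = "right" if "E" in frame[player:] else "left"
        self.memory.append(f"at {player}, pressed {button}")
        return button


def run_episode(game: ToyGame, agent: Agent, max_actions: int = 100) -> int:
    """Loop: observe the screen, choose a button, press it. Returns actions used."""
    actions = 0
    while not game.done() and actions < max_actions:
        game.press(agent.choose(game.screen()))
        actions += 1
    return actions


if __name__ == "__main__":
    game, agent = ToyGame(), Agent()
    print(run_episode(game, agent), list(agent.memory))
```

The bounded `deque` stands in for the "basic memory" mentioned above: only the most recent notes are kept, so the agent has to act on a limited view of its own history.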

Earlier models, such as Claude 3.0 Sonnet, did not even manage to leave the main character's house. In this new iteration, however, the system has advanced considerably, beating Brock, Misty, and Lt. Surge, the first three Gym Leaders in the game.

A journey of 35,000 actions within the Pokémon world

Claude's journey through Pokémon Red was not easy. According to data provided by Anthropic, the AI performed around 35,000 actions before it got past the Vermilion City stage. Anthropic did not specify how long this took, but the run showed the model's capacity to adapt to changes and learn patterns as it played.

Using video games to assess artificial intelligence is not new. However, this experiment reinforces the idea that these environments can become fundamental tools for measuring progress in AI models capable of reasoning and adapting.

Beyond the game: Claude 3.7 Sonnet and its real-world applications

Beyond the model's performance in Pokémon Red, Anthropic has highlighted that it can solve complex problems in fields such as mathematics, programming and coding. As part of its improvements, a feature called Claude Code has been added, which allows the AI to search for and edit code, run tests and even work with tools such as GitHub.

For those interested in testing the model's capabilities, Claude 3.7 Sonnet is now available on a variety of platforms, including the Claude app, the Anthropic API, Amazon Bedrock and Google Cloud, at the same price as the previous version.

The fact that Claude 3.7 Sonnet has managed to overcome key stages of Pokémon Red reinforces the idea that artificial intelligence is advancing by leaps and bounds in reasoning and learning. This type of testing opens the door to new real-world applications, from automating tasks to solving complex problems without human intervention.
