- OpenAI’s o3 model won a five-day poker tournament between nine AI chatbots
- The o3 model won by playing the most consistent game
- Most top language models handled poker well, but struggled with bluffing, position, and basic math
In a digital showdown unlike any other, nine of the world’s most powerful large language models spent five days locked in a high-stakes poker match.
OpenAI’s o3, Anthropic’s Claude Sonnet 4.5, xAI’s Grok, Google’s Gemini 2.5 Pro, Meta’s Llama 4, DeepSeek R1, Kimi K2 from Moonshot AI, Magistral from Mistral AI, and Z.AI’s GLM 4.6 played thousands of hands of Texas Hold’em at $10 tables, with $100,000 bankrolls apiece.
When OpenAI’s o3 model walked away from the five-day game $36,691 richer, there was no trophy, just bragging rights.
The experimental PokerBattle.ai tournament was entirely AI-run, with the same initial prompt issued to each player. It was pure strategy, if strategy is the word for thousands of micro-decisions made by machines that don’t really understand winning, losing, or how humiliating it is to be dealt seven-deuce.
For a technical stunt, it was unusually telling. The best-performing AIs didn’t just bluff and bet; they adapted, modeled their opponents, and learned in real time how to navigate ambiguity. Although they did not play flawless poker, they came impressively close to imitating the judgment of experienced players.
OpenAI’s o3 quickly proved to have the steadiest hand, taking down three of the five largest pots and sticking close to textbook pre-flop theory. Anthropic’s Claude and xAI’s Grok rounded out the top three with significant profits of $33,641 and $28,796, respectively.
Meanwhile, Meta’s Llama 4 lost its full stack and flamed out early. The rest of the pack landed somewhere in between, with Google’s Gemini making a modest profit and Moonshot’s Kimi K2 bleeding chips down to an $86,030 finish.
Gambling AI
Poker has long been one of the best analogs for testing general-purpose AI. Unlike chess or Go, which rely on perfect information, poker requires players to reason under uncertainty. It mirrors real-world decision-making in everything from business negotiations to military strategy and now, apparently, chatbot development.
A consistent takeaway from the tournament was that the bots were often too aggressive. Most favored action-heavy strategies, even in situations where folding would have been wiser. They tried to win big pots more than they tried to avoid losing them. And they were terrible at bluffing, not because they didn’t try, but because their bluffs often stemmed from misread hands rather than clever deception.
Still, AI tools are improving in ways that go far beyond surface-level smarts. They are not just repeating what they have read; they make probability judgments under pressure and learn to read the room. It’s also a reminder that even powerful models still have flaws. Misreading situations, jumping to shaky conclusions, and forgetting their own “position” aren’t just poker problems.
You may never sit across from a language model in a real poker room, but odds are you’ll interact with one that’s trying to make decisions that matter. This game was just a glimpse of what that could look like.