Publication: PokéChamp: A Human-Expert-Level Language Agent for Competitive Pokémon
Abstract
We introduce PokéChamp, a minimax agent powered by Large Language Models (LLMs) for Pokémon battles. Built on a general framework for two-player competitive games, PokéChamp leverages the generalist capabilities of LLMs to enhance minimax tree search. Specifically, LLMs replace three key components: (1) player action sampling, (2) opponent modeling, and (3) value function estimation, enabling the agent to effectively utilize gameplay history and human knowledge to reduce the search space and address partial observability. In the second phase of our research, we develop a ReAct-style framework and incorporate retrieval-augmented generation (RAG) to evaluate the efficacy of LLMs in the specialized task of competitive team generation. Notably, our framework requires no additional LLM training. We evaluate PokéChamp in the popular Gen 9 OU format. When powered by GPT-4o, the battling agent achieves a 76% win rate against the best existing LLM-based bot and an 84% win rate against the strongest rule-based bot. Even with an open-source 8-billion-parameter Llama 3.1 model, PokéChamp consistently outperforms the previous best LLM-based bot, PokéLLMon powered by GPT-4o, with a 64% win rate. For the team generation task, the LLM agent produced high-performing teams on par with a heuristic approach that specifically utilized statistical metagame usage data. These specialized tasks show the efficacy of LLMs trained only on generalized prior data, especially when given the same tools as current heuristic-based approaches and real human players. This work led to a publication (Karten et al., 2025).
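The abstract's core idea, LLMs replacing action sampling, opponent modeling, and value estimation inside minimax tree search, can be illustrated with a minimal sketch. Everything below is assumed for illustration: the three `llm_*` stubs stand in for LLM queries, and the toy state/transition bear no relation to the actual PokéChamp implementation.

```python
# Sketch of LLM-guided minimax as described in the abstract.
# The three llm_* functions are placeholders for LLM calls; in the
# described system they would prune the search space and score states.

def llm_sample_actions(state):
    # Stub: an LLM would propose a few promising player actions,
    # shrinking the branching factor of the search tree.
    return ["move0", "move1"]

def llm_opponent_actions(state):
    # Stub: an LLM would model the opponent and predict likely replies,
    # which helps under partial observability.
    return ["opp_move0", "opp_move1"]

def llm_value(state):
    # Stub: an LLM would estimate the value of a leaf state;
    # here, a deterministic toy score.
    return len(state) % 5

def apply_joint(state, a, b):
    # Toy transition: record the joint action in the state string.
    return state + a + b

def minimax(state, depth):
    """Depth-limited minimax over LLM-proposed candidate actions."""
    if depth == 0:
        return llm_value(state)
    # Max over our sampled actions, min over predicted opponent replies.
    return max(
        min(minimax(apply_joint(state, a, b), depth - 1)
            for b in llm_opponent_actions(state))
        for a in llm_sample_actions(state)
    )
```

Because the LLM stubs return short candidate lists rather than the full move set, the tree stays small enough to search to a useful depth, which is the role the abstract assigns to the LLM components.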