 

Publication:

PokéChamp: A Human-Expert-Level Language Agent for Competitive Pokémon

datacite.rights: restricted
dc.contributor.advisor: Jin, Chi
dc.contributor.author: Nguyen, Andy L.
dc.date.accessioned: 2025-08-12T16:25:53Z
dc.date.available: 2025-08-12T16:25:53Z
dc.date.issued: 2025-04-28
dc.description.abstract: We introduce PokéChamp, a minimax agent powered by Large Language Models (LLMs) for Pokémon battles. Built on a general framework for two-player competitive games, PokéChamp leverages the generalist capabilities of LLMs to enhance minimax tree search. Specifically, LLMs replace three key components: (1) player action sampling, (2) opponent modeling, and (3) value function estimation, enabling the agent to effectively use gameplay history and human knowledge to reduce the search space and address partial observability. In the second phase of our research, we develop a ReAct-like framework and incorporate retrieval-augmented generation (RAG) to evaluate the efficacy of LLMs on the specialized task of competitive team generation. Notably, our frameworks require no additional LLM training. We evaluate PokéChamp in the popular Gen 9 OU format. When powered by GPT-4o, the battling agent achieves a win rate of 76% against the best existing LLM-based bot and 84% against the strongest rule-based bot, demonstrating its superior performance. Even with an open-source 8-billion-parameter Llama 3.1 model, PokéChamp consistently outperforms the previous best LLM-based bot, PokéLLMon powered by GPT-4o, with a 64% win rate. For the team generation task, the LLM agent produced high-performing teams on par with a heuristic approach that specifically utilized statistical metagame usage data. These specialized tasks demonstrate the efficacy of LLMs trained only on generalized prior data, especially when given the same tools as current heuristic approaches and real human players. This work led to a publication (Karten et al., 2025).
dc.identifier.uri: https://theses-dissertations.princeton.edu/handle/88435/dsp019w0326500
dc.language.iso: en_US
dc.title: PokéChamp: A Human-Expert-Level Language Agent for Competitive Pokémon
dc.type: Princeton University Senior Theses
dspace.entity.type: Publication
dspace.workflow.startDateTime: 2025-04-29T03:59:14.446Z
pu.contributor.authorid: 920306009
pu.date.classyear: 2025
pu.department: Electrical and Computer Engineering
pu.minor: Computer Science
pu.minor: Robotics
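
The abstract's three LLM-replaced components of minimax search (action sampling, opponent modeling, and value estimation) can be illustrated with a minimal, self-contained sketch. Everything below is hypothetical: the toy running-total game and the `llm_*` stubs stand in for the thesis's actual Pokémon battle states and real LLM calls.

```python
# Hypothetical sketch of LLM-augmented minimax on a toy running-total game.
# The llm_* stubs stand in for the three LLM-replaced components named in
# the abstract; in the real agent, each would query a language model.

def llm_sample_actions(state, actions, k=2):
    """Stand-in for LLM player action sampling: prune to k candidate moves."""
    return sorted(actions, key=lambda a: -abs(a))[:k]

def llm_opponent_model(state, actions, k=2):
    """Stand-in for LLM opponent modeling: predict the opponent's likely moves."""
    return sorted(actions, key=lambda a: abs(a))[:k]

def llm_value(state):
    """Stand-in for LLM value estimation at the search horizon."""
    return state  # toy heuristic: a higher running total favors the player

def minimax(state, depth, maximizing, actions=(-1, 1, 2)):
    if depth == 0:
        return llm_value(state)
    # The LLM components shrink the branching factor at each node.
    cand = (llm_sample_actions if maximizing else llm_opponent_model)(state, actions)
    vals = [minimax(state + a, depth - 1, not maximizing) for a in cand]
    return max(vals) if maximizing else min(vals)

# Pick the root move with the best minimax value for the maximizing player.
best = max([-1, 1, 2], key=lambda a: minimax(0 + a, 2, False))
```

Because the LLM stubs prune both players' action sets, the tree stays small even at deeper search horizons, which is the point of replacing exhaustive expansion with learned priors.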

Files

Original bundle

Name: Nguyen_Andy.pdf
Size: 7.8 MB
Format: Adobe Portable Document Format

License bundle

Name: license.txt
Size: 100 B
Format: Item-specific license agreed to upon submission