Optimizing for Interpretable Phutball Policies

Sixkiller, Kalen S.

Files

498sixkiller-final.pdf (748.16 KB)

Date

2025-04-17

Authors

Sixkiller, Kalen S.

Abstract

Phutball is an impartial, rules-light game with an arbitrarily scalable board, making it an appealing testbed for human-inspectable multi-step reasoning. This thesis introduces PhutballEnv, a turn-based Markov game environment that is fully compatible with AlphaZero-style self-play. The system ships with a logging and visual-diagnostic stack that records every board position and action, while simultaneously producing gradient-based saliency maps that highlight the board features driving each decision. These rich traces can be automatically exported as text corpora, enabling language models to be fine-tuned on plain moves, saliency-tagged positions, or synthetic rationales generated post hoc.

The document also describes a lightweight evaluation protocol that uses relative-Elo ladders against frozen checkpoints, along with a small user study assessing explanation clarity. Because full training was beyond the project’s time budget, the emphasis is on providing reliable implementations of the environment and interface, a data pipeline, and validation utilities that future work can build on at the intersection of reinforcement learning, language modeling, and interpretability.
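
As a rough illustration of what a relative-Elo ladder against frozen checkpoints could look like, the sketch below applies the standard Elo update to a candidate agent while holding the checkpoint ratings fixed. It is not taken from the thesis: the function names, the starting rating of 1500, the K-factor of 32, and the 400-point scale are conventional Elo defaults assumed here for the example.

```python
from typing import Iterable, Tuple


def expected_score(rating_a: float, rating_b: float) -> float:
    """Standard Elo expected score of player A against player B."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))


def update_rating(rating: float, opponent_rating: float,
                  score: float, k: float = 32.0) -> float:
    """Return the new rating after one game (score: 1 win, 0.5 draw, 0 loss)."""
    return rating + k * (score - expected_score(rating, opponent_rating))


def ladder_rating(results: Iterable[Tuple[float, float]],
                  start: float = 1500.0, k: float = 32.0) -> float:
    """Rate a candidate from games against frozen checkpoints.

    `results` holds (checkpoint_rating, score) pairs. Checkpoint ratings
    are treated as fixed anchors (an assumption of this sketch), so only
    the candidate's rating moves.
    """
    rating = start
    for checkpoint_rating, score in results:
        rating = update_rating(rating, checkpoint_rating, score, k)
    return rating
```

Because the checkpoints never update, the resulting number is meaningful only relative to that fixed pool, which is the sense of "relative" in a ladder of this kind.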

URI

https://theses-dissertations.princeton.edu/handle/88435/dsp018w32r907s

Collections

Electrical and Computer Engineering, 1932-2025
