Title: SpectraLDS: Distilling Spectral Filters into Constant-Time Recurrent Models
Authors: Hazan, Elad; Fortgang, Shlomo T.
Collection: Princeton University Senior Theses
Dates: 2025-04-10; 2025-08-06
Language: en-US
URL: https://theses-dissertations.princeton.edu/handle/88435/dsp01z890rx705

Abstract: We introduce the first provable method for learning a symmetric linear dynamical system of arbitrarily high effective memory. This allows us to distill the convolutional layers in a leading hybrid state space model, FlashSTU, into O(1) linear dynamical systems, merging Transformer and RNN architectures in a manner suitable for scaling, with applications to language modeling and other sequential processing tasks.