
Publication:

SpectraLDS: Distilling Spectral Filters into Constant-Time Recurrent Models


Files

written_final_report.pdf (1.07 MB)

Date

2025-04-10


Abstract

We introduce the first provable method for learning a symmetric linear dynamical system of arbitrarily high effective memory. This allows us to distill the convolutional layers of a leading hybrid state space model, FlashSTU, into linear dynamical systems with O(1) inference cost per token, merging Transformer and RNN architectures in a manner suitable for scaling, with applications to language modeling and other sequence-processing tasks.
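The core idea behind this kind of distillation can be illustrated with a toy example (this is an illustrative sketch, not the thesis's actual method or code): a symmetric linear dynamical system with diagonal transition matrix A has impulse response C Aᵏ B, which is exactly a convolutional filter. Running the recurrence therefore reproduces the convolution's output while doing O(1) work per token instead of O(T).

```python
import numpy as np

# Illustrative sketch (hypothetical dimensions and values): a symmetric LDS
#   h_t = A h_{t-1} + B u_t,   y_t = C h_t,   with diagonal A,
# whose impulse response phi[k] = C @ diag(A)^k @ B equals a length-T
# convolutional filter. The recurrent view costs O(1) per token.

rng = np.random.default_rng(0)
d = 8    # hidden-state dimension (chosen for illustration)
T = 16   # sequence length

A = rng.uniform(-0.9, 0.9, size=d)   # diagonal of symmetric A: real eigenvalues, stable
B = rng.standard_normal(d)
C = rng.standard_normal(d)

# The convolutional filter induced by the LDS: phi[k] = sum_i C_i * A_i^k * B_i
phi = np.array([(C * A**k * B).sum() for k in range(T)])

u = rng.standard_normal(T)

# Convolutional view: y[t] = sum_{k<=t} phi[k] * u[t-k]  (O(T) work per token)
y_conv = np.array([sum(phi[k] * u[t - k] for k in range(t + 1)) for t in range(T)])

# Recurrent view: a single O(1) state update per token
h = np.zeros(d)
y_rec = np.zeros(T)
for t in range(T):
    h = A * h + B * u[t]   # elementwise, since A is diagonal
    y_rec[t] = C @ h

# Both views produce identical outputs
assert np.allclose(y_conv, y_rec)
```

The distillation problem the thesis addresses runs this correspondence in reverse: given a learned spectral/convolutional filter, recover an LDS (A, B, C) whose impulse response matches it, so inference can use the cheap recurrent form.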
