Princeton University users: to view a senior thesis while away from campus, connect to the campus network via the GlobalProtect virtual private network (VPN). Unaffiliated researchers: please note that requests for copies are handled manually by staff and require time to process.
 

Publication:

Compact, Fast, and Low-Energy Language Modeling with Differentiable Logic Network Transformers

datacite.rights: restricted
dc.contributor.advisor: Jha, Niraj Kumar
dc.contributor.author: Warren, Conor
dc.date.accessioned: 2025-08-12T13:19:03Z
dc.date.available: 2025-08-12T13:19:03Z
dc.date.issued: 2025-04-14
dc.description.abstract: Deep learning has experienced widespread adoption across various disciplines and applications because of its versatile problem-solving capabilities. Such versatility arises from the diverse set of deep learning architectures that have been proposed and optimized for different settings. The transformer is one such architecture: especially effective at learning the long-range dependencies that characterize natural language, it has achieved state-of-the-art performance on language-related tasks. Its aptitude, however, is scale-dependent, and the scale required to achieve such striking performance leads to three significant inefficiencies in transformer-based language models: large memory footprints, high inference latencies, and high energy consumption, all of which render the deployment of transformers prohibitively expensive in general and entirely infeasible in resource-constrained environments. The recent introduction of efficient and performant differentiable logic networks (DLNs) as an alternative to standard neural networks may help alleviate these limitations when other techniques such as pruning, quantization, parameter-efficient fine-tuning, knowledge distillation, and architecture modification fall short. The present work explores this possibility, replacing the feedforward neural networks of a pretrained transformer model with highly efficient DLNs to produce DLN-transformers (DLN-Ts). The DLN-Ts synthesized here demonstrate performance similar to that of the baseline transformer model on the GLUE benchmark, with inferred improvements in memory use, inference latency, and energy consumption. The DLN-T, therefore, may be a viable precursor to a compact, fast, and low-energy language model. (An illustrative sketch of this feedforward-to-DLN substitution follows the record metadata below.)
dc.identifier.uri: https://theses-dissertations.princeton.edu/handle/88435/dsp011n79h7759
dc.language.iso: en_US
dc.title: Compact, Fast, and Low-Energy Language Modeling with Differentiable Logic Network Transformers
dc.type: Princeton University Senior Theses
dspace.entity.type: Publication
dspace.workflow.startDateTime: 2025-04-15T02:29:02.284Z
dspace.workflow.startDateTime: 2025-04-28T15:22:03.863Z
pu.contributor.authorid: 920305832
pu.date.classyear: 2025
pu.department: Electrical & Computer Engr
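
The abstract above describes swapping the feedforward sublayers of a pretrained transformer for differentiable logic networks. Since the thesis PDF itself is access-restricted, the following is only a minimal, hypothetical PyTorch sketch of that kind of substitution: the SoftLogicLayer and DLNFeedForward names, the three-operator set, the random input wiring, and the dimensions are illustrative assumptions, not details taken from the thesis.

import torch
import torch.nn as nn

class SoftLogicLayer(nn.Module):
    # Toy differentiable-logic-style layer (assumption, not the thesis design):
    # each output node applies a learned soft mixture of three logic operations
    # (AND, OR, NOT-A) to a fixed random pair of input coordinates.
    def __init__(self, in_features, out_features):
        super().__init__()
        self.register_buffer("idx_a", torch.randint(0, in_features, (out_features,)))
        self.register_buffer("idx_b", torch.randint(0, in_features, (out_features,)))
        self.op_logits = nn.Parameter(torch.zeros(out_features, 3))  # learned per-node operator choice

    def forward(self, x):
        a = torch.sigmoid(x[..., self.idx_a])          # squash activations into [0, 1]
        b = torch.sigmoid(x[..., self.idx_b])
        ops = torch.stack([a * b,                      # soft AND
                           a + b - a * b,              # soft OR
                           1.0 - a], dim=-1)           # soft NOT-A
        weights = torch.softmax(self.op_logits, dim=-1)
        return (ops * weights).sum(dim=-1)

class DLNFeedForward(nn.Module):
    # Drop-in stand-in for a transformer block's two-layer feedforward network.
    def __init__(self, d_model, d_hidden):
        super().__init__()
        self.logic = SoftLogicLayer(d_model, d_hidden)
        self.proj = nn.Linear(d_hidden, d_model)       # map back to the model width

    def forward(self, x):
        return self.proj(self.logic(x))

# Usage sketch: run a standard encoder layer's attention sublayer, then the
# substituted logic-based feedforward sublayer, with residual connections.
layer = nn.TransformerEncoderLayer(d_model=64, nhead=4, dim_feedforward=128, batch_first=True)
ffn = DLNFeedForward(d_model=64, d_hidden=128)
x = torch.randn(2, 10, 64)                             # (batch, sequence, features)
attn_out, _ = layer.self_attn(x, x, x, need_weights=False)
h = layer.norm1(x + attn_out)
out = layer.norm2(h + ffn(h))
print(out.shape)                                       # torch.Size([2, 10, 64])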

Files

Original bundle

Name: Warren_Conor.pdf
Size: 1.51 MB
Format: Adobe Portable Document Format

License bundle

Name: license.txt
Size: 100 B
Description: Item-specific license agreed to upon submission