Publication: Uncertainty-Aware Transformers: Conformal Prediction for LLMs
Abstract
This study extends the CONFINE algorithm, a framework for uncertainty quantification, to transformer-based language models. CONFIDE (CONformal prediction for FIne-tuned DEep language models) applies conformal prediction to the internal embeddings of BERT and RoBERTa architectures, introducing new hyperparameters such as the choice of distance metric and the use of PCA dimensionality reduction. CONFIDE uses either [CLS] token embeddings or flattened hidden states to construct class-conditional nonconformity scores, enabling statistically valid prediction sets with instance-level explanations.
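The class-conditional procedure described above can be illustrated with a minimal split-conformal sketch over fixed embeddings. This is not the authors' implementation: it substitutes a simple distance-to-class-centroid nonconformity score (a hypothetical stand-in for CONFIDE's embedding-based scores) and calibrates one quantile threshold per class.

```python
import numpy as np

def conformal_prediction_sets(cal_emb, cal_labels, test_emb, alpha=0.1):
    """Split conformal prediction over fixed embeddings.

    Nonconformity here is Euclidean distance to the class centroid --
    a simplified stand-in for CONFIDE's nonconformity measures.
    """
    classes = np.unique(cal_labels)
    centroids = np.stack(
        [cal_emb[cal_labels == c].mean(axis=0) for c in classes]
    )

    # Class-conditional calibration: one quantile threshold per class,
    # using the standard (n + 1)-adjusted conformal quantile.
    thresholds = {}
    for i, c in enumerate(classes):
        scores = np.linalg.norm(cal_emb[cal_labels == c] - centroids[i], axis=1)
        n = len(scores)
        level = min(1.0, np.ceil((n + 1) * (1 - alpha)) / n)
        thresholds[c] = np.quantile(scores, level)

    # A test point's prediction set contains every class whose
    # nonconformity score falls below that class's threshold.
    sets = []
    for x in test_emb:
        s = {c for i, c in enumerate(classes)
             if np.linalg.norm(x - centroids[i]) <= thresholds[c]}
        sets.append(s)
    return sets
```

A prediction set may contain several labels (the model is uncertain) or be empty (the input conforms to no class), which is the source of the instance-level explanations the abstract refers to.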
Empirically, CONFIDE improves test accuracy by up to 4.09% on BERT-TINY and achieves greater correct efficiency than prior methods, including NM2 and VanillaNN. We show that early and intermediate transformer layers often yield better-calibrated and more semantically meaningful representations for conformal prediction. For resource-constrained models and high-stakes tasks with ambiguous labels, CONFIDE offers robustness and interpretability where softmax-based uncertainty fails.