Revisiting Softmax for Uncertainty Approximation in Text Classification

doi:10.3390/info14070420

Andreas Nugaard Holm, Dustin Wright, Isabelle Augenstein

Information Systems

Uncertainty approximation in text classification is an important area with applications in domain adaptation and interpretability. One of the most widely used uncertainty approximation methods is Monte Carlo (MC) dropout, which is computationally expensive as it requires multiple forward passes through the model. A cheaper alternative is to simply use a softmax based on a single forward pass without dropout to estimate model uncertainty. However, prior work has indicated that these predictions tend to be overconfident. In this paper, we perform a thorough empirical analysis of these methods on five datasets with two base neural architectures in order to identify the trade-offs between the two. We compare both softmax and an efficient version of MC dropout on their uncertainty approximations and downstream text classification performance, while weighing their runtime (cost) against performance (benefit). We find that, while MC dropout produces the best uncertainty approximations, using a simple softmax leads to competitive, and in some cases better, uncertainty estimation for text classification at a much lower computational cost, suggesting that softmax can in fact be a sufficient uncertainty estimate when computational resources are a concern.
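The contrast between the two approaches can be illustrated with a minimal sketch. This is not the authors' implementation or their efficient MC dropout variant: the toy linear classifier, the `WEIGHTS` and `FEATURES` values, the use of dropout on the input features, and the choice of predictive entropy as the uncertainty score are all assumptions made for illustration. The point is only the cost difference: the softmax estimate needs one forward pass, while MC dropout averages the softmax outputs of many stochastic passes.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def entropy(probs):
    """Predictive entropy (in nats); higher means more uncertain."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def forward(features, weights, drop_prob=0.0, rng=None):
    """One pass of a toy linear classifier, with optional input dropout."""
    if drop_prob > 0.0:
        keep = 1.0 - drop_prob
        # Inverted dropout: zero a feature with probability drop_prob,
        # rescale the survivors so the expected input is unchanged.
        features = [x / keep if rng.random() < keep else 0.0 for x in features]
    logits = [sum(w * x for w, x in zip(row, features)) for row in weights]
    return softmax(logits)


def softmax_uncertainty(features, weights):
    """Single deterministic pass: the cheap estimate."""
    probs = forward(features, weights)
    return probs, entropy(probs)


def mc_dropout_uncertainty(features, weights, passes=50, drop_prob=0.1, seed=0):
    """Average the softmax outputs of `passes` stochastic forward passes."""
    rng = random.Random(seed)
    samples = [forward(features, weights, drop_prob, rng) for _ in range(passes)]
    mean = [sum(col) / passes for col in zip(*samples)]
    return mean, entropy(mean)


# Hypothetical two-class model with three input features (illustrative values).
WEIGHTS = [[1.0, -0.5, 0.2], [-0.3, 0.8, 0.1]]
FEATURES = [0.9, 0.4, 1.5]

single_probs, single_entropy = softmax_uncertainty(FEATURES, WEIGHTS)
mc_probs, mc_entropy = mc_dropout_uncertainty(FEATURES, WEIGHTS)
print("softmax   :", single_probs, single_entropy)
print("MC dropout:", mc_probs, mc_entropy)
```

In this sketch the MC dropout estimate costs `passes` times as many forward passes as the softmax estimate, which is the runtime side of the trade-off the abstract describes.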
