Tlemcani, R., & Sohn, C. CS 224N: MinBERT and Downstream Tasks.
Abstract
This paper develops a multi-task model that leverages BERT embeddings for several downstream tasks, demonstrating the benefits of multi-task learning and achieving strong performance. Generalizing Natural Language Processing (NLP) models to multiple tasks offers advantages including robustness, improved real-world applicability, and greater data efficiency, because multi-task models can utilize diverse types of input data during training and develop a better understanding of patterns in language [1]. The BERT language embedding model achieves high accuracy on downstream language tasks, although separate fine-tuning is necessary for each individual task [2]. We focus on sentiment classification, paraphrase detection, and semantic textual similarity, exploring different model architectures, loss functions, optimizers, and hyperparameters [Sec 3], and we present our results for the three downstream tasks [Tab 4].
Notes
[Online; accessed 1. Jun. 2024]