Sproul, D., & Lee, H. Pretraining, Feature Engineering, and More: Fine-tuning BERT for Multitask Capabilities.
Abstract
The development of large pre-trained language models such as BERT has led to significant advances in Natural Language Processing (NLP) tasks, including Sentiment Analysis (SA), Paraphrase Detection (PD), and Semantic Textual Similarity (STS). However, these tasks are often addressed independently, which yields task-specific models that are inefficient in multitask settings. In this project, we propose three extensions to BERT to create a single multitask model that handles SA, PD, and STS simultaneously: incorporating a cosine embedding loss for STS, additional pre-training on Stanford Sentiment Treebank (SST) data with a Masked Language Modeling (MLM) objective, and adding Part-of-Speech (POS) tags to the word embeddings. Our experimental results show that the proposed extensions slightly improve the multitasking ability of the BERT model, with better performance on the SA, PD, and STS tasks.
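For readers unfamiliar with the cosine embedding loss mentioned in the abstract, the following is a minimal sketch of its commonly used definition: for a pair labelled similar (target 1) the loss is 1 - cos(u, v), and for a pair labelled dissimilar (target -1) it is max(0, cos(u, v) - margin). The function names, the plain-list vectors, and the default margin are illustrative assumptions. The abstract does not say how the authors mapped STS scores to targets or which margin they used, so this should not be read as their implementation.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine similarity is undefined for zero vectors")
    return dot / (norm_u * norm_v)


def cosine_embedding_loss(u, v, target, margin=0.0):
    """Loss for one sentence-embedding pair.

    target = 1  : pair should be similar, loss is 1 - cos(u, v)
    target = -1 : pair should be dissimilar, loss is max(0, cos(u, v) - margin)
    The margin value is an assumption; the source does not specify one.
    """
    if target not in (1, -1):
        raise ValueError("target must be 1 or -1")
    cos = cosine_similarity(u, v)
    if target == 1:
        return 1.0 - cos
    return max(0.0, cos - margin)
```

In a multitask setup, `u` and `v` would typically be the pooled BERT embeddings of the two sentences in an STS pair, and this term would be combined with the SA and PD losses; the combination scheme is not described in the abstract.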
Notes
[Online; accessed 1. Jun. 2024]