Jeon, H. Extending BERT for General Task Applicability.
Abstract
While Bidirectional Encoder Representations from Transformers (BERT) provides a widely applicable baseline for natural language processing tasks, there remains room to improve performance on various downstream tasks through further fine-tuning and extension. Specifically, while it is possible to achieve good performance by fine-tuning the model on a single task, it may be harder to fine-tune the weights on multiple tasks simultaneously so that the updated weights are generally applicable to more than one task. In this project, we explored a diverse set of approaches to extend an implementation of BERT to achieve better performance in sentiment analysis, paraphrase detection, and semantic textual similarity (STS) evaluation at once. We suggest that fine-tuning the model with a combination of previously proposed methods like “gradient surgery”, together with small changes to the output of the model, can raise the model performance to a sufficient level for multiple tasks.
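As an illustration of the idea behind “gradient surgery”, the following is a minimal sketch of the commonly used projection step (often called PCGrad): when two task gradients point in conflicting directions (negative dot product), one is projected onto the normal plane of the other before the gradients are summed. The abstract does not state which variant was used or how it was implemented, so the function names `dot` and `pcgrad`, the plain-list gradient representation, and the `seed` parameter are all assumptions made for this example.

```python
import random


def dot(u, v):
    """Dot product of two equal-length gradient vectors."""
    return sum(a * b for a, b in zip(u, v))


def pcgrad(task_grads, seed=0):
    """Combine per-task gradients with conflict projection (illustrative sketch).

    task_grads: list of flat gradient vectors, one per task (hypothetical layout).
    Returns the sum of the projected per-task gradients.
    """
    rng = random.Random(seed)
    projected = []
    for i, g in enumerate(task_grads):
        g_i = list(g)
        # Visit the other tasks in random order.
        others = [j for j in range(len(task_grads)) if j != i]
        rng.shuffle(others)
        for j in others:
            g_j = task_grads[j]
            overlap = dot(g_i, g_j)
            norm_sq = dot(g_j, g_j)
            # Only conflicting gradients (negative dot product) are altered.
            if overlap < 0 and norm_sq > 0:
                scale = overlap / norm_sq
                g_i = [a - scale * b for a, b in zip(g_i, g_j)]
        projected.append(g_i)
    # Sum the projected gradients component-wise.
    return [sum(column) for column in zip(*projected)]


# Two toy "task gradients" that conflict along the first axis.
combined = pcgrad([[1.0, 0.0], [-1.0, 1.0]])
print(combined)
```

In a real multi-task setup the three task losses (sentiment, paraphrase, STS) would each yield a gradient over the shared BERT weights, and a step like this would run before the optimizer update; that wiring is omitted here.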
Notes
[Online; accessed 1. Jun. 2024]