Kumbong, H. MT-BERT: Fine-tuning BERT for Downstream Tasks Using Multi-Task Learning. Stanford CS224N Default Project.
Abstract
The goal of this project is to fine-tune the contextualized embeddings from BERT to perform well simultaneously on three downstream tasks: Sentiment Analysis (SST), Paraphrase Detection (PD), and Semantic Textual Similarity (STS). Our work uses four main techniques to improve the baseline BERT implementation: additional pretraining on the target-domain data using Masked Language Modelling, multi-task fine-tuning using gradient surgery, single-task fine-tuning, and feature augmentation. We achieve accuracies of 0.539 and 0.877 on SST and PD respectively, and a Pearson correlation of 0.863 on the STS test set, using a single model without ensembling.
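As a rough illustration of the gradient-surgery step mentioned in the abstract, the sketch below shows a PCGrad-style projection that removes the conflicting component between per-task gradients before summing them. The function name, arguments, and details are assumptions for illustration, not the authors' exact implementation.

```python
import torch

def pcgrad_project(task_grads):
    """PCGrad-style gradient surgery (illustrative sketch).

    task_grads: list of flattened gradient tensors, one per task
    (e.g. SST, PD, STS). Returns a single combined gradient in which
    each task gradient has had its component conflicting with the
    other tasks' gradients projected out.
    """
    projected = [g.clone() for g in task_grads]
    for i, g_i in enumerate(projected):
        for j, g_j in enumerate(task_grads):
            if i == j:
                continue
            dot = torch.dot(g_i, g_j)
            if dot < 0:  # gradients conflict: remove the conflicting component
                g_i -= dot / (g_j.norm() ** 2 + 1e-12) * g_j
    return torch.stack(projected).sum(dim=0)

# Hypothetical usage: g_sst, g_pd, g_sts are flattened gradients of the
# three task losses w.r.t. the shared BERT parameters.
# combined = pcgrad_project([g_sst, g_pd, g_sts])
```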
|
Notes: [Online; accessed 1 Jun. 2024]