WIKINDX

WIKINDX Resources  

Ramos, S., & Dalmia, Y. Multi-Task Learning for Robust Contextualized Sentence Embedding Generation. Stanford CS224N Default Project.
Resource type: Journal Article
BibTeX citation key: anon.47
View all bibliographic details
Categories: General
Creators: Dalmia, Ramos
Attachments   URLs   https://www.semant ... f140061e4ef5a9e690
Abstract
Robust contextualized sentence embeddings are essential for natural language processing (NLP) tasks because they enable machines to analyze and process human language in a mathematical format. However, pre-trained language models that generate embeddings optimized for a single language task in isolation tend to yield weaker results when applied to multiple tasks simultaneously. Multi-task learning aims to improve performance on multiple tasks by learning a shared representation that can generalize to new tasks, but it is challenging because the model must balance the objectives of multiple tasks and avoid interference between them. In this project, we generated baseline results for a sentence classification task on the Stanford Sentiment Treebank (SST) and CFIMDB datasets using pre-trained weights loaded into a BERT architecture. We then examined different strategies for fine-tuning and adjusting contextual BERT embeddings to perform well simultaneously on multiple sentence-level tasks: Sentiment Analysis, Paraphrase Detection, and Semantic Textual Similarity (STS). After evaluating several techniques inspired by recent research for fine-tuning and extending the BERT model, we introduced cosine-similarity fine-tuning, an additive loss function (a sum of equally weighted task-specific losses) for multi-task fine-tuning, and gradient surgery. In addition, we experimented with custom prediction heads to better capture the learnable nuances of each task.
These contributions resulted in a significant improvement over baseline, yielding competitive leaderboard scores (top 30% of submissions as of March 17th) of 0.501, 0.778, and 0.662 on the SST, Paraphrase Detection, and STS tasks, respectively.
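To illustrate two of the fine-tuning ideas named in the abstract, here is a minimal pure-Python sketch, not taken from the project's code: an equally weighted additive multi-task loss, and PCGrad-style gradient surgery, where a task gradient that conflicts with another (negative dot product) has the conflicting component projected out. All function names here are illustrative assumptions.

```python
def dot(u, v):
    """Dot product of two gradient vectors given as lists of floats."""
    return sum(x * y for x, y in zip(u, v))

def additive_loss(task_losses):
    """Equally weighted additive multi-task loss: each task-specific
    loss contributes with weight 1."""
    return sum(task_losses)

def pcgrad_project(g_task, g_other):
    """Gradient surgery in the PCGrad style: if g_task conflicts with
    g_other (negative dot product), subtract g_task's projection onto
    g_other so the surviving update no longer opposes the other task."""
    d = dot(g_task, g_other)
    if d < 0:
        scale = d / dot(g_other, g_other)
        g_task = [x - scale * y for x, y in zip(g_task, g_other)]
    return g_task
```

After surgery, the adjusted gradient is orthogonal to the conflicting task's gradient, while non-conflicting gradients pass through unchanged.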
  
Notes
[Online; accessed 1. Jun. 2024]
  
WIKINDX 6.11.0 | Total resources: 209 | Username: -- | Bibliography: WIKINDX Master Bibliography | Style: American Psychological Association (APA)