Mejia, J., Harvill, M., Xue, M., & Huang, D. Techniques for Extracting Meaningful BERT Sentence Embeddings for Downstream Tasks.
Abstract
In this project, we first implement key components of the BERT transformer model to gain a better understanding of the architecture. We then focus the majority of our effort on finetuning and building on top of the base BERT model in order to extract richer sentence embeddings and succeed at multiple downstream tasks. Our tasks of interest are sentiment analysis, paraphrase detection, and semantic textual similarity. We find that the combination of using Jaccard similarity for sentence comparison tasks, weighting the losses of the three tasks, sharing network weights across the paraphrase and textual similarity tasks, and representing sentences by the average of their token embeddings gives the best performance on our tasks of interest. We also test other methods that do not improve performance across tasks and discuss the implications.
Notes
[Online; accessed 1. Jun. 2024]