WIKINDX

WIKINDX Resources

Huang, X., Peng, H., Zou, D., Liu, Z., Li, J., Liu, K., et al. CoSENT: Consistent Sentence Embedding via Similarity Ranking.
Resource type: Journal Article
BibTeX citation key: anon.74
Categories: General
Keywords: Fontos!, RAG
Creators: Huang, Li, Liu, Liu, Peng, Su, Wu, Yu, Zou
URL: https://www.semant ... tm_medium=34898166
Abstract
Learning sentence representations is a fundamental task in Natural Language Processing. Although BERT-like transformers have set new SOTAs for sentence embedding on many tasks, they have proven unable to capture semantic similarity reliably without proper fine-tuning. A common approach to measuring Semantic Textual Similarity (STS) is to take the distance between two text embeddings, defined by the dot product or the cosine function. However, the semantic embedding spaces induced by pretrained transformers are generally non-smooth and tend to deviate from a normal distribution, which makes traditional distance metrics imprecise. In this paper, we first empirically explain the failure of cosine similarity in semantic textual similarity measurement, and then present CoSENT, a novel Consistent SENTence embedding framework. Concretely, a supervised objective function is designed to optimize the Siamese BERT network by exploiting ranked similarity labels of sample pairs. The loss function applies the same cosine-similarity-based optimization in both the training and prediction phases, improving the consistency of the learned semantic space. Additionally, the unified objective function can be adapted to datasets with different annotation types and different STS comparison schemes, requiring only sortable labels. Empirical evaluations on 14 common textual similarity benchmarks demonstrate that the proposed CoSENT achieves superior performance while reducing training time.
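The ranked-label objective described in the abstract can be sketched as a pairwise ranking loss over cosine similarities: for any two sentence pairs whose labels say one should be more similar than the other, the loss penalizes the batch when the ordering of their cosine scores disagrees. This is a minimal illustrative sketch, not the paper's exact formulation; the `scale` hyperparameter, the exhaustive pairing over the batch, and the function names are assumptions.

```python
import math

def cosent_style_loss(sims, labels, scale=20.0):
    """Sketch of a CoSENT-style ranking loss.

    sims:   cosine similarities of sentence pairs in a batch
    labels: sortable similarity labels (higher = more similar);
            any pair whose label is strictly greater should also
            receive a strictly greater cosine score.
    scale:  temperature-like factor (assumed hyperparameter).
    """
    terms = []
    for s_i, y_i in zip(sims, labels):
        for s_j, y_j in zip(sims, labels):
            if y_i > y_j:
                # Pair i is labeled more similar than pair j,
                # so s_i should exceed s_j; the exponential term
                # grows when the ordering is violated.
                terms.append(math.exp(scale * (s_j - s_i)))
    return math.log1p(sum(terms))  # log(1 + sum), 0 when no violations possible
```

Because only the relative order of the labels matters, the same objective applies unchanged to binary, graded, or ranked STS annotations, which is the "sortable labels" property the abstract emphasizes.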
  
WIKINDX 6.11.0 | Total resources: 209 | Username: -- | Bibliography: WIKINDX Master Bibliography | Style: American Psychological Association (APA)