WIKINDX

WIKINDX Resources

Kaya, Y. B., & Tantuğ, C. A. (2024). BERT2D: Two dimensional positional embeddings for efficient Turkish NLP. IEEE Access, 12, 77429–77441.
Resource type: Journal Article
DOI: 10.1109/ACCESS.2024.3407983
ID no. (ISBN etc.): 2169-3536
BibTeX citation key: anon.87
Categories: General
Creators: Kaya, Tantuğ
URLs: https://doi.org/10 ... CCESS.2024.3407983
Abstract
This study addresses the challenge of improving the downstream performance of pretrained language models for morphologically rich languages, with a focus on Turkish. Traditional BERT models use one-dimensional absolute positional embeddings, which, while effective, have limitations when dealing with complex languages. We propose BERT2D, which is a novel BERT-based model that contributes to positional embedding systems. This approach introduces a dual embedding system that targets all the words and their subwords. Remarkably, this modification, coupled with whole word masking, resulted in a significant increase in performance despite a negligible increase in the parameters. Our experiments showed that BERT2D consistently outperformed the leading Turkish-focused BERT model, BERTurk, in terms of various performance metrics in text classification, token classification, and question-answering downstream tasks. For a fair comparison, we pretrained our BERT2D language model on the same dataset as that of BERTurk. The results demonstrate that two-dimensional positional embeddings can significantly improve the performance of encoder-only models in Turkish and other morphologically rich languages, suggesting a promising direction for future research in natural language processing.
  