Maity, K., Chaulwar, A. T., Vala, V., & Guntur, R. S. NanoBERT: An Extremely Compact Language Model.
Abstract
Language model pre-training, as in BERT, has led to significant improvements on natural language processing tasks. Although many compression approaches such as quantization and knowledge distillation have been proposed, the resulting language models are still not suitable for deployment on resource-constrained edge devices such as mobile phones. In this work, we propose to replace the token embedding matrix, an expensive layer in the Transformer model, with trainable rank decomposition matrices. Building upon this approach, we introduce NanoBERT, a lightweight BERT model that is nearly 17× smaller than BERT-Tiny (the smallest open-source pre-trained BERT model), yet attains comparable performance on various NLP tasks such as text classification and named entity recognition. We further combine this model with the parameter-efficient fine-tuning technique LoRA for additional compression in multi-task scenarios.
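The core idea described in the abstract, factoring the token embedding matrix into trainable rank decomposition matrices, can be illustrated with a minimal PyTorch sketch. This is an assumption-based illustration of a two-matrix factorization, not the paper's exact implementation; the class name FactorizedEmbedding, the chosen rank, and the example sizes are hypothetical.

```python
import torch
import torch.nn as nn

class FactorizedEmbedding(nn.Module):
    """Token embedding factored into two low-rank matrices.

    Instead of a full (vocab_size x hidden_size) embedding table,
    store a (vocab_size x rank) lookup plus a (rank x hidden_size)
    projection, which cuts parameters when rank << hidden_size.
    """

    def __init__(self, vocab_size: int, hidden_size: int, rank: int):
        super().__init__()
        self.low_rank_lookup = nn.Embedding(vocab_size, rank)          # vocab_size x rank
        self.up_projection = nn.Linear(rank, hidden_size, bias=False)  # rank x hidden_size

    def forward(self, token_ids: torch.Tensor) -> torch.Tensor:
        # (batch, seq_len) -> (batch, seq_len, rank) -> (batch, seq_len, hidden_size)
        return self.up_projection(self.low_rank_lookup(token_ids))


# Rough parameter comparison for a BERT-Tiny-like configuration (illustrative numbers only)
vocab_size, hidden_size, rank = 30522, 128, 16
full_params = vocab_size * hidden_size                    # ~3.9M parameters
factored_params = vocab_size * rank + rank * hidden_size  # ~0.49M parameters
print(full_params, factored_params)
```

The parameter count of the factorized layer scales with the rank rather than the hidden size, which is the mechanism by which the embedding layer, typically the largest single component of a small BERT, can be compressed.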