Giarelis, N., & Karacapilidis, N. LMRank: Utilizing Pre-Trained Language Models and Dependency Parsing for Keyphrase Extraction.
Abstract
Keyphrase extraction is a Natural Language Processing task pertaining to the automatic extraction of salient terms that semantically encapsulate the major theme and topics of a document. In this article, we present LMRank, a novel approach that utilizes dependency parsing and the sentence embeddings of pre-trained language models to improve the accuracy of the keyphrase extraction task. In addition, we conduct a benchmark analysis of our approach, which showcases that it scales far better than similar ones in terms of execution time. The contribution of this work is threefold: (i) we propose a novel approach that significantly outperforms the state-of-the-art keyphrase extraction approaches in terms of time performance and accuracy on selected datasets; (ii) we provide a comparative evaluation of our approach against previous ones, by utilizing broadly used datasets in the literature and established evaluation metrics (e.g., the F1 and pF1 scores); (iii) we make the datasets and code used in our experiments public, aiming to further increase the reproducibility of this work and facilitate future research in the field.
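To make the evaluation metric mentioned above concrete, the following is a minimal sketch of an exact-match F1 score computed over the top-k extracted keyphrases. It is an illustration under stated assumptions, not the evaluation code used in this work: the function name `f1_at_k`, the lowercase and whitespace normalization, and the sample phrases are all hypothetical. The pF1 score relaxes the matching criterion to partial matches; since its exact rule is not specified in this abstract, it is not sketched here.

```python
def f1_at_k(predicted, gold, k):
    """Return (precision, recall, F1) for the top-k predicted keyphrases.

    Assumption: a prediction counts as correct only if it equals a gold
    keyphrase after lowercasing and collapsing whitespace.
    """
    def normalize(phrase):
        return " ".join(phrase.lower().split())

    # Keep the first k distinct predictions, preserving their ranking.
    top_k = []
    for phrase in predicted:
        if len(top_k) >= k:
            break
        norm = normalize(phrase)
        if norm not in top_k:
            top_k.append(norm)

    gold_set = {normalize(phrase) for phrase in gold}
    matches = sum(1 for phrase in top_k if phrase in gold_set)
    if matches == 0:
        return 0.0, 0.0, 0.0

    precision = matches / len(top_k)
    recall = matches / len(gold_set)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical ranked output of an extractor and hypothetical gold labels.
predicted = [
    "keyphrase extraction",
    "Dependency Parsing",
    "neural networks",
    "sentence embeddings",
]
gold = [
    "keyphrase extraction",
    "dependency parsing",
    "pre-trained language models",
]
precision, recall, f1 = f1_at_k(predicted, gold, k=3)
```

In this sample, two of the top three predictions match the three gold keyphrases, so precision, recall, and F1 all equal 2/3.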