Utilize Pre-Trained PhoBERT to Compute Text Similarity and Rerank Documents for Question-Answering Task

Phan, D.; Le, Huong Thanh

Phan, D., & Le, H. T. Utilize Pre-Trained PhoBERT to Compute Text Similarity and Rerank Documents for Question-Answering Task.

Resource type: Journal Article
BibTeX citation key: anon.133

Categories: General
Creators: Le, Phan

URLs https://www.semant ... tm_medium=27634953

Abstract

Open-domain Question Answering (QA) is a crucial task in natural language processing. QA systems typically follow two main steps: (i) identifying relevant passages and (ii) generating answer sentences from these passages. Of the two, identifying relevant passages is the more challenging and requires further refinement. In this paper, we introduce two novel strategies to improve the performance of this step: (i) a new method for computing the similarity between questions and text passages, and (ii) the integration of pretrained and fine-tuned models. Empirical evaluations on the Zalo 2022 dataset demonstrate the efficacy of the proposed methods, showing a 10% increase in recall compared to using the BM25 method alone and a 6% increase in recall compared to relying solely on a fine-tuned cross-encoder model.
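
The abstract does not say how the passage scores are combined or how recall is computed, so the following is only an illustrative sketch of a generic two-stage rerank and a recall-at-k measurement, not the authors' method. It assumes a weighted sum of min-max normalized scores. The function names, the `weight` parameter, and the toy scores are all hypothetical.

```python
from typing import Dict, List, Set


def min_max(scores: Dict[str, float]) -> Dict[str, float]:
    """Rescale scores to [0, 1]; all-equal scores map to 0.0."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {pid: 0.0 for pid in scores}
    return {pid: (s - lo) / (hi - lo) for pid, s in scores.items()}


def rerank(first_stage: Dict[str, float],
           second_stage: Dict[str, float],
           weight: float = 0.5) -> List[str]:
    """Order passage ids by a weighted sum of normalized scores.

    first_stage:  lexical scores per passage id (for example BM25).
    second_stage: model similarity scores for the same passage ids.
    weight:       hypothetical mixing weight given to first_stage.
    """
    a, b = min_max(first_stage), min_max(second_stage)
    fused = {pid: weight * a[pid] + (1.0 - weight) * b[pid] for pid in a}
    return sorted(fused, key=fused.get, reverse=True)


def recall_at_k(ranked: List[str], relevant: Set[str], k: int) -> float:
    """Fraction of relevant passages that appear in the top k results."""
    if not relevant:
        return 0.0
    return len(set(ranked[:k]) & relevant) / len(relevant)


# Toy scores, invented for illustration only.
bm25_scores = {"p1": 12.0, "p2": 9.0, "p3": 3.0}
model_scores = {"p1": 0.2, "p2": 0.9, "p3": 0.8}
ranked = rerank(bm25_scores, model_scores)
```

With these toy numbers, the lexical ranking alone puts `p1` first, while the fused ranking moves `p2` to the top. If `p2` is the only relevant passage, recall at 1 rises from 0 to 1. This is the kind of effect a reranking stage is meant to produce.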
