WIKINDX Resources

Ma, X., Wang, L., Yang, N., Wei, F., & Lin, J. Fine-tuning LLaMA for multi-stage text retrieval.
Resource type: Journal Article
BibTeX citation key: anon.110
Categories: General
Creators: Lin, Ma, Wang, Wei, Yang
URL: https://www.semant ... 8c15d5bf30bfb90c78
Abstract
The effectiveness of multi-stage text retrieval has been solidly demonstrated since before the era of pre-trained language models. However, most existing studies utilize models that predate recent advances in large language models (LLMs). This study seeks to explore potential improvements that state-of-the-art LLMs can bring. We conduct a comprehensive study, fine-tuning the latest LLaMA model both as a dense retriever (RepLLaMA) and as a pointwise reranker (RankLLaMA) for both passage retrieval and document retrieval using the MS MARCO datasets. Our findings demonstrate that the effectiveness of large language models indeed surpasses that of smaller models. Additionally, since LLMs can inherently handle longer contexts, they can represent entire documents holistically, obviating the need for traditional segmenting and pooling strategies. Furthermore, evaluations on BEIR demonstrate that our RepLLaMA-RankLLaMA pipeline exhibits strong zero-shot effectiveness. Model checkpoints from this study are available on HuggingFace.
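
The abstract describes a retrieve-then-rerank pipeline: a decoder-only LM used as a dense bi-encoder (RepLLaMA) for first-stage retrieval, followed by a pointwise reranker (RankLLaMA) that scores each query-document pair. The sketch below illustrates that wiring only, under loose assumptions: "gpt2" stands in for the paper's fine-tuned LLaMA checkpoints, the end-of-sequence pooling mirrors the RepLLaMA recipe described in the paper, and the randomly initialized linear scoring head is a placeholder for a trained reranking head, not the released model.

```python
# Minimal sketch of a RepLLaMA/RankLLaMA-style two-stage pipeline.
# "gpt2" is a stand-in base model; the linear head is untrained and
# only demonstrates the pointwise-reranking interface.
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModel.from_pretrained("gpt2")
model.eval()

def embed(text: str) -> torch.Tensor:
    # RepLLaMA-style pooling: append an end-of-sequence token and take
    # the hidden state of that final token as the sequence embedding,
    # so the whole input (even a long document) is represented at once.
    ids = tokenizer(text + tokenizer.eos_token, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**ids).last_hidden_state
    vec = hidden[0, -1]
    return vec / vec.norm()  # normalize for inner-product retrieval

query = "what is multi-stage text retrieval?"
docs = [
    "Multi-stage retrieval first retrieves candidates, then reranks them.",
    "LLaMA is a family of open large language models.",
]

# Stage 1: dense retrieval, ranking documents by inner product.
q = embed(query)
scores = torch.stack([embed(d) @ q for d in docs])
candidates = scores.argsort(descending=True)

# Stage 2: pointwise reranking, scoring each query-document pair with
# a linear head over the final token's hidden state (placeholder head).
head = torch.nn.Linear(model.config.hidden_size, 1)

def rerank_score(query: str, doc: str) -> float:
    text = f"query: {query} document: {doc}" + tokenizer.eos_token
    ids = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**ids).last_hidden_state
    return head(hidden[0, -1]).item()

reranked = sorted(candidates.tolist(),
                  key=lambda i: rerank_score(query, docs[i]),
                  reverse=True)
print([docs[i] for i in reranked])
```

The two-stage split reflects the cost trade-off the paper exploits: the bi-encoder embeds documents once offline so retrieval is a fast vector search, while the more expensive pair-wise scoring runs only over the small candidate set.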