Prompt Compression for Large Language Models: A Survey

Li, Zongqian; Liu, Yinhong; Su, Yixuan; Collier, Nigel

Li, Z., Liu, Y., Su, Y., & Collier, N. Prompt Compression for Large Language Models: A Survey.

Resource type: Journal Article
BibTeX citation key: anon.106

Categories: General
Creators: Collier, Li, Liu, Su

URLs https://www.semant ... ?email_index=0-2-2

Abstract

Leveraging large language models (LLMs) for complex natural language tasks typically requires long-form prompts to convey detailed requirements and information, which results in increased memory usage and inference costs. To mitigate these challenges, multiple efficient methods have been proposed, with prompt compression gaining significant research interest. This survey provides an overview of prompt compression techniques, categorized into hard prompt methods and soft prompt methods. First, the technical approaches of these methods are compared, followed by an exploration of various ways to understand their mechanisms, including the perspectives of attention optimization, Parameter-Efficient Fine-Tuning (PEFT), modality integration, and new synthetic language. We also examine the downstream adaptations of various prompt compression techniques. Finally, the limitations of current prompt compression methods are analyzed, and several future directions are outlined, such as optimizing the compression encoder, combining hard and soft prompt methods, and leveraging insights from multimodality.

Notes

[Online; accessed 24. Nov. 2024]
