Rath, M., Banerjee, S., & Swain, T. Fine Tuning Auto Regressive LLMs for Long Document Abstractive Summarization.
Abstract
Generating a short summary from a long document is a challenging task, for which new language models are still being designed and trained on the available data. Because deep learning models underpin NLP and NLG applications, training them demands high computational power. Moreover, fine-tuning the weights for a given context is an important task that requires additional computation space and time. In this paper, we use Cerebras’ wafer-scale cluster, which aims to provide an efficient software and hardware infrastructure that enhances the capabilities of pre-existing models and empowers them to handle lengthy documents well. In addition to analyzing common models along with their pros and cons, we also analyze factors such as context length and model size so as to accommodate documents that are as long as possible.