WIKINDX Resources

Zhang, C., Peng, B., Sun, X., Niu, Q., Liu, J., Chen, K., et al. From Word Vectors to Multimodal Embeddings: Techniques, Applications, and Future Directions for Large Language Models. 
Resource type: Journal Article
BibTeX citation key: anon.194
Categories: General
Creators: Bi, Chen, Fei, Feng, Li, Liu, Liu, Niu, Peng, Sun, Wang, Yan, Yin, Zhang, Zhang
Attachments   URLs: https://www.semant ... ?email_index=0-0-0
Abstract
Word embeddings and language models have transformed natural language processing (NLP) by facilitating the representation of linguistic elements in continuous vector spaces. This review visits foundational concepts such as the distributional hypothesis and contextual similarity, tracing the evolution from sparse representations like one-hot encoding to dense embeddings including Word2Vec, GloVe, and fastText. We examine both static and contextualized embeddings, underscoring advancements in models such as ELMo, BERT, and GPT and their adaptations for cross-lingual and personalized applications. The discussion extends to sentence and document embeddings, covering aggregation methods and generative topic models, along with the application of embeddings in multimodal domains, including vision, robotics, and cognitive science. Advanced topics such as model compression, interpretability, numerical encoding, and bias mitigation are analyzed, addressing both technical challenges and ethical implications. Additionally, we identify future research directions, emphasizing the need for scalable training techniques, enhanced interpretability, and robust grounding in non-textual modalities. By synthesizing current methodologies and emerging trends, this survey offers researchers and practitioners an in-depth resource to push the boundaries of embedding-based language models.
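
As an illustration of the sparse-to-dense shift the abstract traces (not code from the paper), the following minimal Python/NumPy sketch contrasts orthogonal one-hot vectors with dense vectors obtained by factorizing a word co-occurrence matrix. The toy corpus, the context window of 2, and the 2-dimensional embeddings are arbitrary choices made for the example, and the SVD step stands in for count-based embedding methods rather than Word2Vec's predictive training.

# Toy sketch: sparse one-hot vectors vs. dense co-occurrence-based embeddings.
import numpy as np

corpus = [
    "the cat sat on the mat",
    "the dog sat on the rug",
    "a cat chased a dog",
]
tokens = [sentence.split() for sentence in corpus]
vocab = sorted({w for sent in tokens for w in sent})
index = {w: i for i, w in enumerate(vocab)}
V = len(vocab)

# Sparse representation: one-hot vectors are mutually orthogonal, so every pair
# of distinct words has cosine similarity 0 and carries no notion of relatedness.
one_hot = np.eye(V)

# Count co-occurrences within a +/-2 word window (distributional hypothesis:
# words are characterized by the contexts they appear in).
window = 2
counts = np.zeros((V, V))
for sent in tokens:
    for i, w in enumerate(sent):
        for j in range(max(0, i - window), min(len(sent), i + window + 1)):
            if i != j:
                counts[index[w], index[sent[j]]] += 1

# Dense representation: truncated SVD of the log-scaled count matrix yields
# low-dimensional vectors in which words sharing contexts tend to end up nearby.
U, S, _ = np.linalg.svd(np.log1p(counts), full_matrices=False)
dense = U[:, :2] * S[:2]  # 2-dimensional embeddings

def cos(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

print("one-hot  cos(cat, dog):", cos(one_hot[index["cat"]], one_hot[index["dog"]]))
print("dense    cos(cat, dog):", cos(dense[index["cat"]], dense[index["dog"]]))

Every pair of distinct one-hot vectors has cosine similarity 0, while the dense vectors for "cat" and "dog", which share near-identical contexts in this toy corpus, come out substantially more similar.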
  
Notes
[Online; accessed 24. Nov. 2024]
  