...
The training of large language models typically relies on unsupervised methods applied to extensive datasets. Despite their impressive performance on various tasks, these models often lack the practical, real-world knowledge required for specific applications. Furthermore, since domain-specific data is typically not included in the publicly available datasets used for pre-training or fine-tuning large language models (LLMs), integrating knowledge graphs (KGs) becomes fundamental for injecting proprietary knowledge into LLMs, especially for enterprise solutions. To infuse this knowledge into LLMs during training, many techniques have been researched in recent years, resulting in three main state-of-the-art methods (Pan et al., 2024):
- Integration of KGs into training objectives (See answer 1)
- Verbalization of KGs into LLM inputs (See answer 2)
- Integration of KGs via fusion modules: joint training of graph and language models (See answer 3)
...
The first method focuses on extending the pre-training procedure. The term pre-training objectives describes the techniques that guide a model's learning process from its training data. In the context of pre-training large language models, various methods have been employed depending on the model's architecture. Decoder-only models such as GPT-4 usually use Causal Language Modelling (CLM), where the model is presented with a sequence of tokens and learns to predict the next token in the sequence based solely on the preceding tokens (Wang et al., 2022). Integrating KGs into training objectives means extending this standard pre-training objective of generating coherent and contextually relevant text by designing a knowledge-aware pre-training objective, as sketched below.
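The following is a minimal PyTorch sketch of the idea: a standard CLM loss alongside a knowledge-aware variant that additionally up-weights target tokens belonging to KG entities. The names (`entity_mask`, `kg_loss_weight`) and the specific weighting scheme are illustrative assumptions, not the objective of any particular paper.

```python
# Minimal sketch of a causal language modelling (CLM) loss and a
# knowledge-aware extension. `entity_mask` and `kg_loss_weight` are
# illustrative assumptions, not a fixed API from any library or paper.
import torch
import torch.nn.functional as F


def clm_loss(logits: torch.Tensor, input_ids: torch.Tensor) -> torch.Tensor:
    """Standard next-token prediction: position t is scored against token t+1."""
    shift_logits = logits[:, :-1, :].contiguous()
    shift_labels = input_ids[:, 1:].contiguous()
    return F.cross_entropy(
        shift_logits.view(-1, shift_logits.size(-1)),
        shift_labels.view(-1),
    )


def knowledge_aware_loss(
    logits: torch.Tensor,
    input_ids: torch.Tensor,
    entity_mask: torch.Tensor,    # 1 where the target token verbalises a KG entity (assumed input)
    kg_loss_weight: float = 0.5,  # assumed trade-off between the two terms
) -> torch.Tensor:
    """CLM loss plus an extra penalty on KG-entity tokens, so the model is
    pushed harder to reproduce the injected knowledge."""
    shift_logits = logits[:, :-1, :].contiguous()
    shift_labels = input_ids[:, 1:].contiguous()
    shift_mask = entity_mask[:, 1:].contiguous().float()

    # Per-token losses, kept unreduced so entity positions can be re-weighted.
    token_losses = F.cross_entropy(
        shift_logits.view(-1, shift_logits.size(-1)),
        shift_labels.view(-1),
        reduction="none",
    ).view_as(shift_labels).float()

    base = token_losses.mean()
    entity_term = (token_losses * shift_mask).sum() / shift_mask.sum().clamp(min=1.0)
    return base + kg_loss_weight * entity_term
```

In this sketch the knowledge-aware objective is simply a re-weighted CLM loss; published approaches differ in how the knowledge signal is defined (e.g. masked entity prediction or auxiliary link-prediction tasks), but the common pattern is a standard language-modelling term combined with a knowledge-specific term.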
...