...

  • Diego Collarana (FIT)
  • Daniel Baldassare (doctima) – Lead
  • Michael Wetzel (Coreon)
  • Rene Pietzsch (ECC)
  • ... 


Description:

Verbalizing knowledge graphs for LLMs is the task of representing knowledge graphs as text so that they can be written directly into the prompt, the main input source of LLMs. Verbalization consists of finding textual representations for nodes, for the relationships between nodes, and for their metadata. It can take place at different stages of the LLM lifecycle: during training (pre-training, instruction fine-tuning) or during inference (in-context learning).
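A minimal sketch of what verbalization can look like in practice, assuming a toy (subject, relation, object) triple format; the relation names and template strings are illustrative, not taken from any particular vocabulary:

# Minimal sketch: verbalize KG triples into prompt text.
# Relation names and templates below are illustrative only.

TEMPLATES = {
    "bornIn": "{s} was born in {o}.",
    "capitalOf": "{s} is the capital of {o}.",
}

def verbalize(triples):
    """Turn (subject, relation, object) triples into natural-language sentences."""
    sentences = []
    for s, r, o in triples:
        # Fall back to raw concatenation when no template is defined.
        template = TEMPLATES.get(r, "{s} {r} {o}.")
        sentences.append(template.format(s=s, r=r, o=o))
    return " ".join(sentences)

triples = [("Marie Curie", "bornIn", "Warsaw"), ("Warsaw", "capitalOf", "Poland")]
context = verbalize(triples)

# In-context learning: the verbalized graph becomes part of the prompt.
prompt = f"Context: {context}\nQuestion: Where was Marie Curie born?\nAnswer:"
print(prompt)

Template-based verbalization like this is the simplest option; more elaborate pipelines use a dedicated graph-to-text model to produce fluent text.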

Considerations:

...

Answer 1: ...

  • Simple concatenation of KG triples with text (see the sketch after this list)
  • Entity/Token alignment prediction
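A minimal sketch of the concatenation approach, assuming an illustrative triple format; the "[SEP]" string is a stand-in for whatever separator the tokenizer actually uses:

# Minimal sketch of the "simple concatenation" approach: KG triples are
# linearized and appended to the training text. Triple format is illustrative.

def concat_triples(text, triples, sep=" [SEP] "):
    """Append linearized (s, r, o) triples to a text sequence."""
    linearized = sep.join(f"{s} {r} {o}" for s, r, o in triples)
    return text + sep + linearized

sample = "Bob Dylan wrote Blowin' in the Wind in 1962."
triples = [
    ("Bob Dylan", "occupation", "songwriter"),
    ("Blowin' in the Wind", "genre", "folk"),
]
print(concat_triples(sample, triples))
# Every appended triple competes with the original tokens for attention;
# irrelevant or excessive triples are the source of "knowledge noise".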

Considerations:

  • Simple concatenation of tokens and triples from the KG can cause "knowledge noise": injected triples that are irrelevant to the sentence can divert it from its original meaning

Standards:

  • Predicting alignment links between tokens and entities
  • Adding entity embeddings and an additional entity-prediction task to the token-only pre-training objective (a toy sketch follows this list)
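As a rough illustration of these two standards, the following toy PyTorch sketch adds a token-to-entity alignment head next to the usual token-prediction head and sums the two losses. All names, sizes, and labels are invented; ERNIE-style models additionally feed pretrained KG entity embeddings into the encoder, which this sketch omits for brevity:

# Toy sketch: a token-only language-modeling loss plus an auxiliary task that
# predicts which KG entity each token mention aligns to. Sizes are illustrative.
import torch
import torch.nn as nn

class TokenEntityPretrainHead(nn.Module):
    def __init__(self, hidden=256, vocab_size=1000, num_entities=500):
        super().__init__()
        self.token_head = nn.Linear(hidden, vocab_size)     # masked-token prediction
        self.entity_head = nn.Linear(hidden, num_entities)  # token-to-entity alignment

    def forward(self, hidden_states, token_labels, entity_labels):
        # hidden_states: (batch, seq, hidden) from any encoder
        token_logits = self.token_head(hidden_states)
        entity_logits = self.entity_head(hidden_states)
        ce = nn.CrossEntropyLoss(ignore_index=-100)  # -100 marks unlabeled positions
        token_loss = ce(token_logits.flatten(0, 1), token_labels.flatten())
        entity_loss = ce(entity_logits.flatten(0, 1), entity_labels.flatten())
        return token_loss + entity_loss  # joint pre-training objective

# Usage with random stand-in data:
head = TokenEntityPretrainHead()
h = torch.randn(2, 8, 256)
tok = torch.randint(0, 1000, (2, 8))
ent = torch.full((2, 8), -100)
ent[0, 3] = 42  # only one token position aligns to a KG entity here
print(head(h, tok, ent))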


Answer 2: Integrate KGs during pre-training

...