Weitere Themen
KI-Interessierte auf DIN.ONE

Page History

Versions Compared

Key

This line was added.
This line was removed.
Formatting was changed.

ADD NEW TOP LEVEL SECTION: LLM TRAINING

How do I enhance/augment/extend LLM training through KGs? (LLM TRAINING) – length: up to one page

Lead: Daniel Baldassare

Answer 1: integrate KGs into the LLM

KG-Enhanced LLM Training

...

Training Objective

Contributors:

Diego Collarana (FIT)
Please add yourself if you want to contribute ...
Please add yourself if you want to contribute ...
Please add yourself if you want to contribute ...
...

Short definition/description of this topic: please fill in ...

Content ...
Content ...
Content ...

...

Answer 2: integrate KGs into LLM Inputs (verbalize KG for LLM training)

Contributors:

Diego Collarana (FIT)
Daniel Baldassare (doctima) – Lead
Michael Wetzel (Coreon)Sabine
Mahr (word b signRene Pietzsch (ECC)
...

Draft from Daniel Baldassare :

...

Mark boundaries of graph data using special tokens, like already for SQL-Queries: Improving Generalization in Language Model-Based Text-to-SQL
Semantic Parsing: Two Simple Semantic Boundary-Based Techniques
Encoding strategies for nodes, relationship between nodes, nodes communities and metadata Talk like a graph: Encoding graphs for large language models (research.google)
What needs to be verbalized and where? System prompt for static information like KG-schema, user prompt for data instances

...

Answer 3: Integrate KGs by Fusion Modules

Contributors:

Diego Collarana (FIT)
Please add yourself if you want to contribute ...
Please add yourself if you want to contribute ...
Please add yourself if you want to contribute ...
...

Short definition/description of this topic: please fill in ...

Content ...
Content ...
Content ...

ADD NEW TOP LEVEL SECTION: ENHANCING LLMs AT INFERENCE TIME

How do I use KGs for Retrieval-Augmented Generation (RAG)? (2.1 – Prompt Enhancement)– length: up to one page

Lead: Diego

Draft Daniel Burkhardt :

Short definition/description of this topic: Retrieval-Augmented Generation (RAG) is a method that combines retrieval mechanisms with generative models to enhance the output of language models by incorporating external knowledge. This approach retrieves relevant information from a database or corpus and uses it to inform the generation process, leading to more accurate and contextually relevant outputs.

Definition of RAG
Types of RAG
- Standard RAG: Utilizes vector databases to retrieve documents based on semantic similarity, which are then used to augment the generative process of language models.
- Graph RAG: Integrates knowledge graphs into the RAG framework, allowing for the retrieval of structured data that can provide additional context and factual accuracy to the generative model
Applications for RAG
- RAG is used in various natural language processing tasks, including question answering, information extraction, sentiment analysis, and summarization. It is particularly beneficial in scenarios requiring domain-specific knowledge, as it reduces the tendency of language models to generate hallucinated or incorrect information by grounding responses in retrieved facts.

Answer 1: KG-Guided Retrieval Mechanisms

Contributors:

Daniel Burkhardt (FSTI)
Robert David (SWC)
Diego Collarana (FIT)
Daniel Baldassare (doctima)
Michael Wetzel (Coreon)

...

Initial RAG idea: Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks
RAG is commonly used with vector databases.
- can only grasp semantic similarity represented in the document content
- only unstructured data
- vector distance instead of a DB search limits the retrieval capabilities
Graph RAG uses knowledge graphs as part of the RAG system
- KGs for retrieval (directly), meaning the database is storing KG data
- KGs for retrieval via a semantic layer, potentially retrieving over different data sources of structured and unstructured data
- KGs for augmenting the retrieval, meaning the queries to some database is modified via KG data
Via Graph RAG, we can
- ingest additional semantic background knowledge (knowledge model) not represented in the data itself
  - additional related knowledge based on defined paths (rule-based inference)
  - focus on certain aspects of a data set for the retrieval (search configuration)
  - personalization: represent different roles for retrieval via ingesting role description data into the retrieval (especially important in an enterprise environment)
- reasoning
- linked data makes factual knowledge related to the LLM-generated knowledge and thereby provide a means to check for correctness
- explainable AI: provide justifications via KG
- consolidate different data sources: unstructured, semi-structured, structured (enterprise knowledge graph scenario)
- doing the actual retrieval via KG queries: SPARQL
- hybrid retrieval: combine KG-based retrieval with vector databases or search indexes

Answer 2: Hybrid Retrieval Combining KGs and Dense Vectors

Contributors:

Daniel Burkhardt (FSTI)
Diego Collarana (FIT)
Daniel Baldassare (doctima)
Please add yourself if you want to contribute ...
...

Draft from Daniel Burkhardt:

...

Dense and sparse vectors (https://infiniflow.org/blog/best-hybrid-search-solution, https://aclanthology.org/2023.findings-acl.679.pdf)
Hybrid Retrieval (https://arxiv.org/html/2408.05141v1, https://haystack.deepset.ai/blog/hybrid-retrieval, https://arxiv.org/pdf/1905.07129)
Graph Emeddings (https://www.dfki.de/~declerck/semdeep-4/papers/SemDeep-4_paper_2.pdf, https://arxiv.org/pdf/1711.11231)
Re-ranking, scoring, and filtering by fusion (https://www.elastic.co/blog/improving-information-retrieval-elastic-stack-hybrid, https://arxiv.org/pdf/2004.12832, https://arxiv.org/pdf/2009.07258)
Integration of KG with dense vectors (https://github.com/InternLM/HuixiangDou)
Benefits (enhance semantic understanding, contextual and structure insights, improve retrieval accuracy)
Challenges (scalability, integration complexity) https://ragaboutit.com/how-to-build-a-jit-hybrid-graph-rag-with-code-tutorial/

...

How do I enhance LLM explainability by using KGs? (2.2 – Answer Verification) – length: up to one page

Lead: Daniel Burkhardt

Draft from Daniel Burkhardt:

Short definition/description of this topic: KG-Enhanced LLM Interpretability refers to the use of knowledge graphs to improve the transparency and explainability of large LLMs. By integrating structured knowledge from KGs, LLMs can generate more interpretable outputs, providing justifications and factual accuracy checks for their responses. This integration helps in aligning LLM-generated knowledge with factual data, enhancing trust and reliability.

Definition
KG + LLM for Interpretability https://arxiv.org/html/2306.08302v3
Analysis of https://github.com/zjukg/KG-LLM-Papers?tab=readme-ov-file#resources-and-benchmarking
Overview of methods for LLM probing https://ar5iv.labs.arxiv.org/html/2309.01029
- KG Alignment
- KG-guided Explanation Generation
- Factuality and Verification https://arxiv.org/abs/2404.00942

Answer 1: Measuring KG Alignment in LLM Representations

Draft from Daniel Burkhardt:

...

Contributors:

Daniel Burkhardt (FSTI)
Please add yourself if you want to contribute ...
Please add yourself if you want to contribute ..
Content ...
Content ...
Content ...

Answer 2: KG-Guided Explanation Generation

Draft from Daniel Burkhardt:

...

Contributors:

Daniel Burkhardt (FSTI)
Please add yourself if you want to contribute ...
Rene Pietzsch (ECC)
Please add yourself if you want to contribute ......
Content ...
Content ...
Content ...

Answer 3: KG-Based Fact-Checking and Verification

Contributors:

Daniel Burkhardt (FSTI)
Please add yourself if you want to contribute ...
Please add yourself if you want to contribute ...
...

Draft from Daniel Burkhardt:

...

Content ...
Content ...
Content ...

...

How do I enhance LLM reasoning through KGs? (2.3 – Answer Augmentation) – length: up to one page

Lead: Daniel Burkhardt

Draft from Daniel Burkhardt:

...

Content ...
Content ...
Content ...

Answer 1: KG-Guided Multi-hop Reasoning

Contributors:

Daniel Burkhardt (FSTI)
Daniel Baldassare (doctima)
Please add yourself if you want to contribute ...
...

...

Content ...
Content ...
Content ...

Answer 2: KG-Based Consistency Checking in LLM Outputs

Contributors:

Daniel Burkhardt (FSTI)
Daniel Baldassare (doctima)
Michael Wetzel (Coreon)
...

...

Content ...
Content ...
Content ...

KGs for LLM Analysis

How do I evaluate LLMs through KGs? (3) – length: up to one page

Answer 1: Using KGs to Evaluate LLM Knowledge Coverage

Maybe add additional properties such as factuality, correctness, precision etc. or perhaps keep these that we have right now and call them "selected properties" ...

Lead: Fabio

Contributors:

Daniel Burkhardt (FSTI)
Daniel Baldassare (doctima)
Please add yourself if you want to contribute ...Fabio Barth (DFKI)
...

Draft from Daniel Burkhardt:

...

Content ...
Content ...
Content ...

Answer 2: Analyzing LLM Biases through KG Comparisons

Contributors:

Daniel Burkhardt (FSTI)
Daniel Baldassare (doctima)
Please add yourself if you want to contribute ...Fabio Barth (DFKI)
...

Draft from Daniel Burkhardt:

...

Content

Space Tools

Versions Compared

Old Version 18

New Version 19

Key

How do I enhance/augment/extend LLM training through KGs? (LLM TRAINING) – length: up to one page

Answer 1: integrate KGs into the LLM

KG-Enhanced LLM Training

Training Objective

Answer 2: integrate KGs into LLM Inputs (verbalize KG for LLM training)

Answer 3: Integrate KGs by Fusion Modules

How do I use KGs for Retrieval-Augmented Generation (RAG)? (2.1 – Prompt Enhancement)– length: up to one page

Answer 1: KG-Guided Retrieval Mechanisms

Answer 2: Hybrid Retrieval Combining KGs and Dense Vectors

How do I enhance LLM explainability by using KGs? (2.2 – Answer Verification) – length: up to one page

Answer 1: Measuring KG Alignment in LLM Representations

Answer 2: KG-Guided Explanation Generation

Answer 3: KG-Based Fact-Checking and Verification

How do I enhance LLM reasoning through KGs? (2.3 – Answer Augmentation) – length: up to one page

Answer 1: KG-Guided Multi-hop Reasoning

Answer 2: KG-Based Consistency Checking in LLM Outputs

KGs for LLM Analysis

How do I evaluate LLMs through KGs? (3) – length: up to one page

Answer 1: Using KGs to Evaluate LLM Knowledge Coverage

Answer 2: Analyzing LLM Biases through KG Comparisons