KG-Enhanced LLM Training
Integrating KGs into Training Objective
Contributors:
- Diego Collarana (FIT)
- Please add yourself if you want to contribute ...
- ...
Integrating KGs into LLM Inputs (verbalize KG for LLM training)
Contributors:
- Diego Collarana (FIT)
- Daniel Baldassare (doctima)
- Michael Wetzel (Coreon)
- Sabine Mahr (word b sign)
- ...
Draft from Daniel Baldassare:
- Encoding Graphs in Prompt: Talk like a graph: Encoding graphs for large language models (research.google)
- System prompt vs user prompt
- Train or fine-tune a tokenizer with dedicated special tokens for graph data
- Join graph embeddings with text embeddings: Joint Embeddings for Graph Instruction Tuning (arxiv.org)
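The first and third points above can be illustrated with a minimal sketch. It assumes the KG is available as (subject, predicate, object) triples and shows two encodings: plain sentences for a prompt, and a linearization with marker tokens. The predicate naming convention, the prompt layout, and the marker tokens `<head>`, `<rel>`, `<tail>` are hypothetical choices for illustration, not taken from the cited work.

```python
from typing import List, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)


def verbalize_triple(triple: Triple) -> str:
    """Turn one triple into a plain sentence.

    Assumption: predicates are snake_case labels that read naturally
    once underscores are replaced by spaces.
    """
    subject, predicate, obj = triple
    return f"{subject} {predicate.replace('_', ' ')} {obj}."


def encode_graph_as_prompt(triples: List[Triple], question: str) -> str:
    """Place the verbalized facts in front of the question (hypothetical layout)."""
    facts = "\n".join(f"- {verbalize_triple(t)}" for t in triples)
    return f"Known facts:\n{facts}\n\nQuestion: {question}\nAnswer:"


def linearize_with_markers(triple: Triple) -> str:
    """Alternative encoding with hypothetical marker tokens.

    A tokenizer would need these markers registered as special tokens
    so that each one maps to a single token id.
    """
    subject, predicate, obj = triple
    return f"<head> {subject} <rel> {predicate} <tail> {obj}"


triples = [
    ("Marie Curie", "born_in", "Warsaw"),
    ("Warsaw", "capital_of", "Poland"),
]
prompt = encode_graph_as_prompt(triples, "In which country was Marie Curie born?")
print(prompt)
print(linearize_with_markers(triples[0]))
```

Whether such a prompt belongs in the system prompt or the user prompt is left open here, as in the list above.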
Integrating KGs by Fusion Modules
Contributors:
- Diego Collarana (FIT)
- Please add yourself if you want to contribute ...
- ...
Retrieval-Augmented Generation (RAG)
Draft from Daniel Burkhardt:
- Definition of RAG
- Types of RAG
- Applications for RAG
KG-Guided Retrieval Mechanisms
Contributors:
- Daniel Burkhardt (FSTI)
- Robert David (SWC)
- Diego Collarana (FIT)
- Daniel Baldassare (doctima)
- Michael Wetzel (Coreon)
Draft from Robert David:
- Initial RAG idea: Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks
- RAG is commonly used with vector databases.
- can only grasp semantic similarity represented in the document content
- only unstructured data
- vector distance instead of a DB search limits the retrieval capabilities
- Graph RAG uses knowledge graphs as part of the RAG system
- KGs for retrieval (directly), meaning the database is storing KG data
- KGs for retrieval via a semantic layer, potentially retrieving over different data sources of structured and unstructured data
- KGs for augmenting the retrieval, meaning the queries to some database are modified via KG data
- Via Graph RAG, we can
- ingest additional semantic background knowledge (knowledge model) not represented in the data itself
- retrieve additional related knowledge based on defined paths (rule-based inference)
- focus on certain aspects of a data set for the retrieval (search configuration)
- personalization: represent different roles for retrieval via ingesting role description data into the retrieval (especially important in an enterprise environment)
- reasoning
- linked data relates factual knowledge to the LLM-generated knowledge and thereby provides a means to check for correctness
- explainable AI: provide justifications via KG
- consolidate different data sources: unstructured, semi-structured, structured (enterprise knowledge graph scenario)
- do the actual retrieval via KG queries (SPARQL)
- hybrid retrieval: combine KG-based retrieval with vector databases or search indexes
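As a rough illustration of "KGs for augmenting the retrieval", the sketch below expands a keyword query with related labels from a knowledge model before matching documents. The toy knowledge model, the documents, and the function names are hypothetical; a real system would query a graph store (for example via SPARQL) and a proper search index instead of in-memory dictionaries.

```python
from typing import Dict, List, Set

# Hypothetical knowledge model: concept -> related labels
# (synonyms or broader terms) that do not appear in the query itself.
KG_RELATED: Dict[str, Set[str]] = {
    "car": {"automobile", "vehicle"},
    "laptop": {"notebook", "computer"},
}

# Hypothetical document collection.
DOCUMENTS: List[str] = [
    "The automobile plant increased output last year.",
    "A new notebook model ships with more memory.",
    "Quarterly revenue grew in all regions.",
]


def expand_query(query: str, kg: Dict[str, Set[str]]) -> Set[str]:
    """Add KG-related labels to the query terms."""
    terms = set(query.lower().split())
    expanded = set(terms)
    for term in terms:
        expanded |= kg.get(term, set())
    return expanded


def retrieve(query: str, documents: List[str], kg: Dict[str, Set[str]]) -> List[str]:
    """Return documents sharing at least one term with the expanded query."""
    terms = expand_query(query, kg)
    hits = []
    for doc in documents:
        words = {w.strip(".,").lower() for w in doc.split()}
        if terms & words:
            hits.append(doc)
    return hits


# Without the KG the query finds nothing; with it, the first document matches.
print(retrieve("car production", DOCUMENTS, {}))
print(retrieve("car production", DOCUMENTS, KG_RELATED))
```

The same pattern extends to the personalization point: role descriptions stored in the KG could contribute additional terms or filters to the expanded query.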
Hybrid Retrieval Combining KGs and Dense Vectors
Contributors:
- Daniel Burkhardt (FSTI)
- Diego Collarana (FIT)
- Daniel Baldassare (doctima)
- Please add yourself if you want to contribute ...
- ...
Draft from Daniel Burkhardt:
- Dense and sparse vectors (https://infiniflow.org/blog/best-hybrid-search-solution)
- Hybrid Retrieval (https://arxiv.org/html/2408.05141v1, https://haystack.deepset.ai/blog/hybrid-retrieval)
- Graph Embeddings (https://www.dfki.de/~declerck/semdeep-4/papers/SemDeep-4_paper_2.pdf)
- Re-ranking, scoring, and filtering by fusion (https://www.elastic.co/blog/improving-information-retrieval-elastic-stack-hybrid)
- Integration of KG with dense vectors (https://github.com/InternLM/HuixiangDou)
- Benefits (enhanced semantic understanding, contextual and structural insights, improved retrieval accuracy)
- Challenges (scalability, integration complexity) https://ragaboutit.com/how-to-build-a-jit-hybrid-graph-rag-with-code-tutorial/
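The fusion step in the list above can be sketched with reciprocal rank fusion (RRF), one common way to merge ranked lists from different retrievers. RRF is chosen here as an assumption for illustration; the draft does not commit to a specific fusion method. The document ids and the two input rankings are hypothetical, and `k = 60` is the constant commonly used for RRF.

```python
from collections import defaultdict
from typing import Dict, List


def reciprocal_rank_fusion(rankings: List[List[str]], k: int = 60) -> List[str]:
    """Merge several ranked lists of document ids into one ranking.

    Each document scores 1 / (k + rank) per list it appears in,
    with ranks starting at 1. Ties are broken alphabetically by id.
    """
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical results from a KG-based retriever and a dense-vector retriever.
kg_ranking = ["doc_b", "doc_a", "doc_d"]
dense_ranking = ["doc_a", "doc_c", "doc_b"]

fused = reciprocal_rank_fusion([kg_ranking, dense_ranking])
print(fused)
```

RRF only needs ranks, not comparable scores, which is convenient when the KG retriever returns query matches and the vector index returns distances.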
KG-Enhanced LLM Interpretability
Draft from Daniel Burkhardt:
- Definition
- KG + LLM for Interpretability https://arxiv.org/html/2306.08302v3
- Analysis of https://github.com/zjukg/KG-LLM-Papers?tab=readme-ov-file#resources-and-benchmarking
- Overview of methods for LLM probing
- KG Alignment
- KG-guided Explanation Generation
- Factuality and Verification https://arxiv.org/abs/2404.00942
Measuring KG Alignment in LLM Representations
Draft from Daniel Burkhardt:
literature: https://arxiv.org/abs/2311.06503, https://arxiv.org/abs/2406.03746, https://arxiv.org/abs/2402.06764
Contributors:
- Daniel Burkhardt (FSTI)
- Please add yourself if you want to contribute ...
- ...
KG-Guided Explanation Generation
Draft from Daniel Burkhardt:
literature: https://arxiv.org/abs/2312.00353, https://arxiv.org/abs/2403.03008
Contributors:
- Daniel Burkhardt (FSTI)
- Please add yourself if you want to contribute ...
- ...
KG-Based Fact-Checking and Verification
Contributors:
- Daniel Burkhardt (FSTI)
- Please add yourself if you want to contribute ...
- ...
Draft from Daniel Burkhardt:
literature: https://arxiv.org/abs/2404.00942, https://aclanthology.org/2023.acl-long.895.pdf, https://arxiv.org/pdf/2406.01311
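A minimal sketch of KG-based verification, assuming claims have already been extracted from the LLM output as (subject, predicate, object) triples. The toy KG, the set of single-valued predicates, and the three verdict labels are hypothetical and do not reproduce the methods of the papers listed above.

```python
from typing import Set, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)

# Hypothetical reference KG.
KG: Set[Triple] = {
    ("Berlin", "capital_of", "Germany"),
    ("Paris", "capital_of", "France"),
}

# Assumption: these predicates allow only one object per subject,
# so a different object for a known subject counts as a contradiction.
FUNCTIONAL_PREDICATES: Set[str] = {"capital_of"}


def check_claim(claim: Triple, kg: Set[Triple]) -> str:
    """Classify a claimed triple as supported, contradicted, or unknown."""
    subject, predicate, _ = claim
    if claim in kg:
        return "supported"
    has_other_value = any(s == subject and p == predicate for s, p, _ in kg)
    if predicate in FUNCTIONAL_PREDICATES and has_other_value:
        return "contradicted"
    # The KG is incomplete, so a missing triple is not evidence of falsehood.
    return "unknown"


for claim in [
    ("Berlin", "capital_of", "Germany"),
    ("Berlin", "capital_of", "France"),
    ("Rome", "capital_of", "Italy"),
]:
    print(claim, "->", check_claim(claim, KG))
```

Keeping "unknown" separate from "contradicted" reflects the open-world nature of most KGs: absence of a fact is not a refutation.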
KG-Enhanced LLM Reasoning
Draft from Daniel Burkhardt:
- Reasoning https://ieeexplore.ieee.org/abstract/document/10387715
- Domain focus https://arxiv.org/html/2404.10384v1
KG-Guided Multi-hop Reasoning
Contributors:
- Daniel Burkhardt (FSTI)
- Daniel Baldassare (doctima)
- Please add yourself if you want to contribute ...
- ...
Draft from Daniel Burkhardt:
literature: https://neo4j.com/developer-blog/knowledge-graphs-llms-multi-hop-question-answering/, https://link.springer.com/article/10.1007/s11280-021-00911-5
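Multi-hop reasoning over a KG can be sketched as a bounded path search: starting from the entity in the question, follow relations until the target entity is reached, and hand the resulting chain of triples to the LLM as evidence. The toy graph, the `max_hops` limit, and the function name below are hypothetical; the breadth-first search is a generic technique, not the specific approach of the linked sources.

```python
from collections import deque
from typing import Dict, List, Optional, Tuple

Edge = Tuple[str, str]          # (relation, target entity)
Triple = Tuple[str, str, str]   # (source, relation, target)

# Hypothetical toy graph stored as an adjacency list.
GRAPH: Dict[str, List[Edge]] = {
    "Marie Curie": [("born_in", "Warsaw")],
    "Warsaw": [("capital_of", "Poland")],
    "Poland": [("member_of", "European Union")],
}


def find_path(
    graph: Dict[str, List[Edge]], start: str, goal: str, max_hops: int = 3
) -> Optional[List[Triple]]:
    """Breadth-first search for a shortest relation path of at most max_hops."""
    queue = deque([(start, [])])
    visited = {start}
    while queue:
        node, path = queue.popleft()
        if node == goal:
            return path
        if len(path) >= max_hops:
            continue  # do not expand beyond the hop limit
        for relation, target in graph.get(node, []):
            if target not in visited:
                visited.add(target)
                queue.append((target, path + [(node, relation, target)]))
    return None


path = find_path(GRAPH, "Marie Curie", "Poland")
print(path)
```

The returned triples form an explicit reasoning chain, which can also be shown to the user as a justification for the answer.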
KG-Based Consistency Checking in LLM Outputs
Contributors:
- Daniel Burkhardt (FSTI)
- Daniel Baldassare (doctima)
- Michael Wetzel (Coreon)
- ...
Draft from Daniel Burkhardt:
KGs for LLM Analysis
Using KGs to Evaluate LLM Knowledge Coverage
Contributors:
- Daniel Burkhardt (FSTI)
- Daniel Baldassare (doctima)
- Please add yourself if you want to contribute ...
- ...
Draft from Daniel Burkhardt:
Analyzing LLM Biases through KG Comparisons
Contributors:
- Daniel Burkhardt (FSTI)
- Daniel Baldassare (doctima)
- Please add yourself if you want to contribute ...
- ...
Draft from Daniel Burkhardt:
literature: https://arxiv.org/abs/2405.04756