...

  • Daniel Burkhardt (FSTI)
  • Robert David (SWC)
  • Diego Collarana (FIT)
  • Daniel Baldassare (doctima)
  • Michael Wetzel (Coreon)

Explanation of concepts

...

Problem statement

RAG methods aim to enhance the

...

capabilities of LLMs by

...

Conventional RAG Limitations

...

providing real-time information and domain-specific knowledge that may not be present in their training data. Despite its advantages over standalone LLMs, Conventional RAG has the following limitations:

  1. It struggles to answer queries that depend on the interconnectedness of information and on global context, both of which are crucial for generating comprehensive summaries.
  2. It cannot integrate structured and unstructured data, a use case typically required in industrial applications.
  3. Its accuracy is limited by context loss during text chunking and by its reliance on text similarity search.
  4. It has limited reasoning capabilities, especially for abstract questions that require inference or the synthesis of new information not explicitly stated in the source material.
  5. Its answers cannot be traced back to the information sources (factual grounding).
  6. The external knowledge, even when consistent in itself, can still lead to inconsistencies in the generated answer.
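
Limitation 3 can be illustrated with a minimal sketch. It assumes a naive fixed-size word chunker and a made-up two-sentence text; both `chunk_words` and the sample text are hypothetical and only serve to show how a chunk boundary can separate a statement from the entity it refers to.

```python
def chunk_words(text: str, size: int) -> list[str]:
    """Split text into chunks of at most `size` words (naive fixed-size chunking)."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]


# Hypothetical sample text: the second sentence refers to the pump only as "It".
text = (
    "The Model X pump was introduced in 2019. "
    "It requires maintenance every six months."
)

chunks = chunk_words(text, size=8)
for index, chunk in enumerate(chunks):
    print(index, repr(chunk))

# The chunk that mentions maintenance no longer names the entity it is about,
# so a similarity search for "Model X maintenance" has little to match on.
maintenance_chunk = next(c for c in chunks if "maintenance" in c)
print("Model X" in maintenance_chunk)
```

Real chunkers usually overlap chunks or split on sentence boundaries, which reduces but does not remove this effect.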

Explanation of concepts

  • Retrieval-augmented generation (RAG) methods combine retrieval mechanisms with generative models to enhance the output of LLMs by incorporating external knowledge. By grounding the generated output in specific and relevant information, RAG methods improve the quality and accuracy of the generated output.
  • Types of RAG:

    • Conventional RAG has three components: 1) Knowledge Base, typically created by chunking text documents, transforming them into embeddings, and storing them in a vector store. 2) Retriever searches the vector database for chunks that exhibit high similarity to the query. 3) Generator feeds the retrieved chunks, alongside the original query, to an LLM to generate the final response.
    • Graph RAG integrates knowledge graphs into the RAG framework, allowing for the retrieval of structured data that can provide additional context and factual accuracy to the generative model. 
      The retrieval can be done on any source with a semantic representation, e.g., documents with semantic annotations or relational data via OBDA or R2RML, thereby ingesting structured and unstructured source information into the Graph RAG.
  • RAG is used in various natural language processing tasks, including question-answering, information extraction, sentiment analysis, and summarization. It is particularly beneficial in scenarios requiring domain-specific knowledge
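
The three components of Conventional RAG described above can be sketched as follows. This is a toy illustration under stated assumptions, not a reference implementation: the bag-of-words `embed` function stands in for a real embedding model, the in-memory list stands in for a vector store, and the generator step only builds the prompt instead of calling an LLM. All names and sample documents are hypothetical.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy embedding: term-frequency vector (assumption: replaces a real embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def chunk(document: str, size: int = 20) -> list[str]:
    """Split a document into chunks of at most `size` words."""
    words = document.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]


# 1) Knowledge base: chunk the documents, embed each chunk, store the pairs.
documents = [
    "A knowledge graph stores entities and the relationships between them.",
    "A vector store indexes embeddings so that similar chunks can be found quickly.",
]
vector_store = [(c, embed(c)) for d in documents for c in chunk(d)]


# 2) Retriever: rank stored chunks by similarity to the query.
def retrieve(query: str, k: int = 1) -> list[str]:
    q = embed(query)
    ranked = sorted(vector_store, key=lambda item: cosine(q, item[1]), reverse=True)
    return [text for text, _ in ranked[:k]]


# 3) Generator: combine retrieved chunks and the query into an LLM prompt.
def build_prompt(query: str, chunks: list[str]) -> str:
    context = "\n".join(f"- {c}" for c in chunks)
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"


query = "What does a vector store index?"
prompt = build_prompt(query, retrieve(query))
print(prompt)
```

In a real system the prompt would then be sent to an LLM; that call is left out here because it depends on the chosen model and client library.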

...

We describe various solutions for integrating knowledge graphs into RAG systems to improve accuracy, reliability, and explainability. 

...

  • Accurate Query Mapping: Requires advanced NLP techniques to map natural language queries accurately to graph queries. Entity linking and relationship extraction must be precise to ensure correct query formulation.
  • Performance Efficiency: Executing complex graph queries may impact performance, especially with large-scale knowledge graphs. Optimization of graph databases and queries is necessary for real-time applications.
  • Scalability: The system should handle growing knowledge graphs without significant performance loss. Scalable graph database solutions are essential.
  • User Experience: The system must effectively interpret user intent from natural language inputs. Providing clear and concise answers enhances usability and trust.

...

  • First, the user's question is processed with entity linking and relationship extraction techniques to extract its key entities and relationships, yielding a (semantic) graph representation of the question. (Natural Language Understanding)
  • Next, the graph representation is executed as a query against the knowledge graph database: information is first retrieved from the knowledge graph, and then the associated mapped data sources are retrieved.
    Data sources can be of different kinds:
    • Knowledge graph data
    • Non-knowledge graph data with a graph representation:
      • Tabular and relational data, e.g., via OBDA or R2RML.
      • Semi-structured data, e.g., XML or DITA.
      • Unstructured natural language, e.g., via semantic annotations.
  • Then, the retrieved results, which may be of different kinds, are consolidated (preprocessed) so that they can be ingested into the LLM prompt. (Data consolidation)
  • Finally, the consolidated results are passed to the LLM for summarization or further processing to generate the final answer. (Response generation)
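
The four steps above can be sketched end to end as follows. This is a minimal illustration under stated assumptions: the triples, the table rows, the document, and the lexicons are all hypothetical, entity linking is reduced to a dictionary lookup, the row-to-triple mapping only imitates the idea behind OBDA or R2RML (it is not R2RML syntax), and the final LLM call is omitted.

```python
# Hypothetical knowledge graph as (subject, predicate, object) triples.
TRIPLES = [
    ("PumpA", "type", "Pump"),
    ("PumpA", "manufacturedBy", "Acme"),
    ("PumpA", "describedIn", "doc:manual-1"),
]

# Hypothetical relational rows, exposed as triples in the spirit of OBDA/R2RML.
MAINTENANCE_ROWS = [{"asset": "PumpA", "interval_months": 6}]

# Hypothetical unstructured source linked from the graph.
DOCUMENTS = {"doc:manual-1": "PumpA must be shut down before any maintenance work."}

# Toy lexicons: surface form -> graph identifier (assumption: replaces real NLU).
ENTITY_LEXICON = {"pump a": "PumpA", "pumpa": "PumpA"}
RELATION_LEXICON = {"manufactur": "manufacturedBy", "maintenance": "maintenanceIntervalMonths"}


def rows_to_triples(rows):
    """Map each table row to one triple."""
    return [(r["asset"], "maintenanceIntervalMonths", str(r["interval_months"])) for r in rows]


def understand(question):
    """Step 1: link the question to an entity and a predicate (graph pattern)."""
    q = question.lower()
    entity = next((uri for form, uri in ENTITY_LEXICON.items() if form in q), None)
    predicate = next((p for cue, p in RELATION_LEXICON.items() if cue in q), None)
    return entity, predicate


def retrieve(entity, predicate):
    """Step 2: query the graph first, then follow links to mapped sources."""
    graph = TRIPLES + rows_to_triples(MAINTENANCE_ROWS)
    facts = [t for t in graph if t[0] == entity and t[1] == predicate]
    docs = [DOCUMENTS[o] for s, p, o in graph
            if s == entity and p == "describedIn" and o in DOCUMENTS]
    return facts, docs


def consolidate(facts, docs):
    """Step 3: flatten heterogeneous results into plain text lines."""
    lines = [f"{s} {p} {o}" for s, p, o in facts]
    lines += [f"Source text: {d}" for d in docs]
    return "\n".join(lines)


def build_prompt(question, context):
    """Step 4: assemble the prompt that would be passed to the LLM."""
    return f"Context:\n{context}\n\nQuestion: {question}\nAnswer using only the context."


question = "How often does Pump A need maintenance?"
entity, predicate = understand(question)
facts, docs = retrieve(entity, predicate)
prompt = build_prompt(question, consolidate(facts, docs))
print(prompt)
```

Because every line of the context originates from a specific triple or document, each part of the answer can in principle be traced back to its source.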

...