
4.6 Knowledge Base Q&A Logic (RAG)

Pop’s knowledge‑base answering capability is built on RAG (Retrieval‑Augmented Generation).
With RAG, when answering a user question, the AI does not rely solely on its internal knowledge but also uses your uploaded documents, enabling:

  • More accurate and controllable answers
  • Fully traceable content
  • No more “hallucinating” unsupported statements

This chapter explains in detail how Pop retrieves relevant segments from your knowledge base and uses them in AI responses.


🧠 1. What Is RAG?

In simple terms:

RAG = Retrieval + LLM Generation

Flow:

User Query → Embedding → Retrieve Document Chunks → Select Relevant Snippets → LLM → Answer

RAG offers several benefits:

  • The LLM does not need to memorize all of your content
  • Your knowledge base can be updated in real time
  • Every answer is grounded in verifiable documentation
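The flow above can be sketched end to end. Everything in this sketch is a stand-in for illustration — the class name, the keyword-overlap "retrieval", and the result shape are assumptions, not Pop's actual API:

```python
# Minimal end-to-end sketch of the RAG flow. The keyword-overlap scoring
# below stands in for real embedding + BM25 retrieval.
class KnowledgeBase:
    def __init__(self, chunks):
        self.chunks = chunks

    def retrieve(self, query, top_k=2):
        # Stand-in retrieval: rank chunks by how many query terms they share.
        q_terms = set(query.lower().split())
        ranked = sorted(
            self.chunks,
            key=lambda c: len(q_terms & set(c.lower().split())),
            reverse=True,
        )
        return ranked[:top_k]

def answer(query, kb):
    evidence = kb.retrieve(query)  # Retrieve + select relevant snippets
    # A real system would now send query + evidence to the LLM.
    return {"question": query, "evidence": evidence}

kb = KnowledgeBase([
    "Pop splits documents into chunks.",
    "BM25 ranks chunks by term frequency.",
    "Unrelated marketing text.",
])
result = answer("how are document chunks ranked", kb)
print(result["evidence"])
```

The point is the shape of the pipeline — retrieve, select, then generate — not the scoring, which the following stages describe properly.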

🔎 2. Pop’s Three‑Stage RAG Pipeline

Pop performs RAG in three major stages:

Stage 1: Retrieval
Stage 2: Ranking
Stage 3: Generation

📍 Stage 1: Retrieval

Pop retrieves content using two methods:

  1. Vector Search (KNN)
  2. BM25 Text Search

Both methods return their Top‑K (typically 10–30) candidate chunks.

Illustration:

Query → Embedding → Vector DB → Top K
Query → Tokenization → BM25 → Top K

Using both methods increases recall coverage and avoids missing important information.
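The two retrieval paths can be sketched side by side. The BM25 formula below is the standard Okapi variant (Pop's parameters may differ), and the token-hashing "embedding" is only a stand-in for a real embedding model:

```python
import math
import re
from collections import Counter

# Toy chunk store; in Pop these would be chunks from your knowledge base.
chunks = [
    "Pop stores documents as chunks with BM25 and vector indexes.",
    "Embeddings map each chunk to a dense vector for KNN search.",
    "BM25 scores chunks by term frequency and inverse document frequency.",
]

def tokenize(text):
    return re.findall(r"[a-z0-9]+", text.lower())

def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Standard Okapi BM25 score of each doc against the query terms."""
    toks = [tokenize(d) for d in docs]
    avg_len = sum(len(t) for t in toks) / len(toks)
    n = len(docs)
    scores = []
    for t in toks:
        tf = Counter(t)
        score = 0.0
        for term in tokenize(query):
            df = sum(1 for d in toks if term in d)
            if df == 0:
                continue
            idf = math.log(1 + (n - df + 0.5) / (df + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(t) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores

def toy_embed(text, dim=32):
    """Stand-in for a real embedding model: hash tokens into a dense vector."""
    v = [0.0] * dim
    for tok in tokenize(text):
        v[hash(tok) % dim] += 1.0
    return v

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

query = "How does BM25 index chunks?"
bm25 = bm25_scores(query, chunks)                 # lexical signal
qv = toy_embed(query)
knn = [cosine(qv, toy_embed(c)) for c in chunks]  # vector (KNN) signal

top_bm25 = sorted(range(len(chunks)), key=lambda i: -bm25[i])[:2]
top_knn = sorted(range(len(chunks)), key=lambda i: -knn[i])[:2]
print("BM25 top:", top_bm25, "KNN top:", top_knn)
```

Note how each path produces its own ranked candidate list; the next stage fuses them.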


📚 Stage 2: Ranking

After retrieval, Pop performs a sequence of ranking steps to ensure that only the best content is passed to the LLM.

Ranking Steps:

  1. Deduplication (remove identical or similar chunks)
  2. Similarity weighting
  3. Fusion of BM25 + Embedding ranking signals
  4. Document‑coherence ordering (chunks from the same document appear consecutively)
  5. Chunk length control (handle overly short or long segments)

Finally, 3–6 top‑ranked chunks are selected as “evidence blocks”.
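The fusion step (item 3 above) is commonly implemented with Reciprocal Rank Fusion (RRF); this section does not specify Pop's exact formula, so the constant k=60 and the chunk IDs below are illustrative:

```python
# Hypothetical ranked results from the two retrievers (chunk IDs, best first).
bm25_ranking = ["c17", "c03", "c21", "c08"]
vector_ranking = ["c17", "c21", "c05", "c03"]

def reciprocal_rank_fusion(rankings, k=60):
    """Fuse ranked lists: each list contributes 1/(k + rank) per item."""
    scores = {}
    for ranking in rankings:
        for rank, cid in enumerate(ranking, start=1):
            scores[cid] = scores.get(cid, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

fused = reciprocal_rank_fusion([bm25_ranking, vector_ranking])
evidence = fused[:3]  # keep the 3-6 top-ranked chunks as "evidence blocks"
print(evidence)       # → ['c17', 'c21', 'c03']
```

Chunks that rank well in both lists (here c17) float to the top, which is exactly why hybrid retrieval outperforms either signal alone.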


✨ Stage 3: Generation

Selected chunks are inserted into a structured prompt, including:

  • User question
  • Original document excerpts
  • Citation indicators
  • Answer constraints

Example prompt:

You are a professional assistant. Answer the user's question based on the documents below.
If the documents do not contain the answer, clearly respond: “No relevant information found in the documents.”

【Chunk 1】
...
【Chunk 2】
...
【Chunk 3】
...

Question: {User_Query}

The model strictly references the provided content and avoids fabrication.
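Assembling that prompt is plain string templating. The field names and template text below mirror the example above but are otherwise assumptions, not Pop's internal schema:

```python
# Hypothetical evidence blocks selected by the ranking stage.
evidence = [
    {"id": 17, "text": "Pop stores each document as indexed chunks."},
    {"id": 18, "text": "Chunks are indexed with both BM25 and vectors."},
]

PROMPT_TEMPLATE = """You are a professional assistant. Answer the user's question based on the documents below.
If the documents do not contain the answer, clearly respond: "No relevant information found in the documents."

{chunks}

Question: {query}"""

def build_prompt(query, evidence):
    blocks = "\n\n".join(
        f"【Chunk {e['id']}】\n{e['text']}" for e in evidence
    )
    return PROMPT_TEMPLATE.format(chunks=blocks, query=query)

prompt = build_prompt("What is Pop's knowledge-base storage structure?", evidence)
print(prompt)
```

Keeping the refusal instruction and the evidence blocks in one fixed template is what makes the anti-hallucination constraints enforceable.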


📌 3. Pop’s RAG Features

1. Traceability

Every answer can be traced to:

  • Document source
  • Specific paragraph
  • Chunk ID

2. Multi‑chunk merging

Pop automatically:

  • Merges semantically related chunks
  • Removes redundancy
  • Restores missing context

3. Enhanced processing of unstructured text

For PDFs, webpages, or complex structures, Pop preserves:

  • Heading hierarchy
  • Page numbers
  • Semantic context
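Preserving that structure means each chunk carries metadata alongside its text. The record shape below is an assumption for illustration, not Pop's actual storage schema:

```python
# Illustrative chunk record; field names are hypothetical.
chunk = {
    "chunk_id": 17,
    "doc_id": "product-manual.pdf",
    "page": 4,
    "headings": ["4. Knowledge Base", "4.6 Q&A Logic (RAG)"],
    "text": "Pop retrieves relevant segments from your knowledge base...",
}

# The preserved heading hierarchy lets an answer cite where a chunk came from.
breadcrumb = " > ".join(chunk["headings"])
citation = f"{chunk['doc_id']}, p.{chunk['page']} ({breadcrumb})"
print(citation)  # → product-manual.pdf, p.4 (4. Knowledge Base > 4.6 Q&A Logic (RAG))
```

This is also what powers the traceability feature above: document source, paragraph, and chunk ID are all one lookup away.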

4. Anti‑hallucination rules

Pop’s RAG template includes strict constraints:

  • “Do not answer if information is missing”
  • “Respond only based on the provided documents”

📊 4. Example of RAG in Action

User question:

What is Pop’s knowledge‑base storage structure?

Retrieved chunks:

  • Chunk 17: Overview of knowledge‑base structure
  • Chunk 18: Explanation of BM25 and vector indexing

LLM answer:

Pop’s knowledge‑base structure consists of documents, document chunks, and dual‑indexing (BM25 + vector)…
(omitted)

All references are traceable.


💡 5. When Is RAG Useful?

RAG is ideal for:

  • Product documentation Q&A
  • API reference lookup
  • Customer‑service FAQ bots
  • Enterprise internal knowledge bases
  • Policy, regulation, or legal text interpretation
  • Technical documentation and engineering manuals

Essentially, anything that requires answering questions based on real documents.


⚠️ 6. Limitations of RAG

Despite its power, RAG has limitations:

  • Poor‑quality documents → poor chunking
  • Image‑dominant files → need OCR
  • Overly abstract questions → no direct answer
  • Weak LLM models → limited reasoning and synthesis

Hence:

High‑quality source documents = high‑quality RAG answers.


✅ Summary

RAG is Pop’s core capability for knowledge‑base Q&A:

Stage        Responsibility
Retrieval    Find potentially relevant content
Ranking      Select the most relevant, useful chunks
Generation   Produce grounded answers using retrieved chunks

RAG enables Pop to deliver answers that are:

  • Precise
  • Reliable
  • Traceable
  • Business‑aligned