4.6 Knowledge Base Q&A Logic (RAG)
Pop’s knowledge‑base answering capability is built on RAG (Retrieval‑Augmented Generation).
With RAG, when answering a user question, the AI does not rely solely on its internal knowledge but also uses your uploaded documents, enabling:
- More accurate and controllable answers
- Fully traceable content
- Far less “hallucination” of unsupported statements
This section explains in detail how Pop retrieves relevant segments from your knowledge base and uses them in AI responses.
🧠 1. What Is RAG?
In simple terms:
RAG = Retrieval + LLM Generation
Flow:
User Query → Embedding → Retrieve Document Chunks → Select Relevant Snippets → LLM → Answer
With RAG:
- No need for the LLM to memorize all content
- Your knowledge base can be updated in real time
- All answers come from verifiable documentation
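To make this flow concrete, here is a minimal Python skeleton of the pipeline. The three helpers (`embed`, `search_vectors`, `generate`) are hypothetical placeholders, not Pop’s actual API; the stage sections below sketch concrete versions of each step.

```python
# Minimal RAG skeleton. Every helper below is a hypothetical placeholder,
# not Pop's actual internal API.

def embed(text: str) -> list[float]:
    """Placeholder: call an embedding model here."""
    raise NotImplementedError

def search_vectors(query_vec: list[float], top_n: int) -> list[str]:
    """Placeholder: KNN lookup against the vector index."""
    raise NotImplementedError

def generate(prompt: str) -> str:
    """Placeholder: call the LLM here."""
    raise NotImplementedError

def answer(query: str, top_n: int = 20) -> str:
    query_vec = embed(query)                       # User Query -> Embedding
    candidates = search_vectors(query_vec, top_n)  # -> Retrieve Document Chunks
    evidence = candidates[:5]                      # -> Select Relevant Snippets
    prompt = ("Answer only from the documents below.\n\n"
              + "\n\n".join(evidence)
              + f"\n\nQuestion: {query}")
    return generate(prompt)                        # -> LLM -> Answer
```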
🔎 2. Pop’s Three‑Stage RAG Pipeline
Pop performs RAG in three major stages:
Stage 1: Retrieval
Stage 2: Re‑ranking
Stage 3: Generation
📍 Stage 1: Retrieval
Pop retrieves content using two methods:
- Vector Search (KNN)
- BM25 Text Search
Each method returns its Top‑N (usually 10–30) candidate chunks.
Illustration:
Query → Embedding → Vector DB → Top N
Query → Tokenization → BM25 → Top N
Using both methods increases recall coverage and avoids missing important information.
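As an illustration of dual retrieval (not Pop’s internal implementation), the sketch below uses the open-source `rank_bm25` package for BM25 and plain NumPy cosine similarity for vector KNN. The three-chunk corpus and random embeddings are toy stand-ins for a real index and embedding model.

```python
import numpy as np
from rank_bm25 import BM25Okapi  # pip install rank-bm25

# Toy corpus standing in for real knowledge-base chunks.
chunks = [
    "Pop stores documents as chunks with BM25 and vector indexes.",
    "Embeddings map each chunk into a dense vector space.",
    "BM25 ranks chunks by token overlap with the query.",
]

# --- BM25 text search ---
bm25 = BM25Okapi([c.lower().split() for c in chunks])

def bm25_top_n(query: str, n: int) -> list[int]:
    scores = bm25.get_scores(query.lower().split())
    return list(np.argsort(scores)[::-1][:n])

# --- Vector search (KNN via cosine similarity) ---
# Random unit vectors stand in for a real embedding model, so the
# example runs without external services.
rng = np.random.default_rng(0)
embeddings = rng.normal(size=(len(chunks), 8))
embeddings /= np.linalg.norm(embeddings, axis=1, keepdims=True)

def vector_top_n(query_vec: np.ndarray, n: int) -> list[int]:
    sims = embeddings @ (query_vec / np.linalg.norm(query_vec))
    return list(np.argsort(sims)[::-1][:n])

query = "how does Pop index chunks"
print("BM25 candidates:  ", bm25_top_n(query, 2))
print("Vector candidates:", vector_top_n(rng.normal(size=8), 2))
```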
📚 Stage 2: Re‑ranking
After retrieval, Pop performs a sequence of ranking steps to ensure that only the best content is passed to the LLM.
Re‑ranking Steps:
- Deduplication (remove identical or similar chunks)
- Similarity weighting
- Fusion of BM25 + Embedding ranking signals
- Document‑coherence ordering (chunks from the same document appear consecutively)
- Chunk length control (handle overly short or long segments)
Finally, 3–6 top‑ranked chunks are selected as “evidence blocks”.
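One common way to fuse BM25 and embedding signals is reciprocal rank fusion (RRF). Whether Pop uses RRF specifically is not documented here, so treat the following as an illustrative sketch of the fusion and deduplication steps only.

```python
def rrf_fuse(rankings: list[list[int]], k: int = 60) -> list[int]:
    """Reciprocal rank fusion: score(chunk) = sum of 1 / (k + rank) per list."""
    scores: dict[int, float] = {}
    for ranking in rankings:
        for rank, chunk_id in enumerate(ranking, start=1):
            scores[chunk_id] = scores.get(chunk_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

def dedupe(ranked: list[int], texts: dict[int, str]) -> list[int]:
    """Drop chunks whose text is identical to an earlier-ranked chunk."""
    seen: set[str] = set()
    kept = []
    for cid in ranked:
        key = texts[cid].strip().lower()
        if key not in seen:
            seen.add(key)
            kept.append(cid)
    return kept

bm25_ranking   = [2, 0, 1]   # chunk IDs from BM25, best first
vector_ranking = [0, 2, 1]   # chunk IDs from vector search, best first
fused = rrf_fuse([bm25_ranking, vector_ranking])
texts = {0: "dual indexing", 1: "chunk storage", 2: "dual indexing"}  # 0 and 2 duplicate
print(dedupe(fused, texts)[:6])  # the final 3-6 evidence blocks would come from here
```

A production deduplication step would also catch near-duplicates (for example via embedding similarity), not just exact text matches.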
✨ Stage 3: Generation
Selected chunks are inserted into a structured prompt, including:
- User question
- Original document excerpts
- Citation indicators
- Answer constraints
Example prompt:
You are a professional assistant. Answer the user's question based on the documents below.
If the documents do not contain the answer, clearly respond: “No relevant information found in the documents.”
【Chunk 1】
...
【Chunk 2】
...
【Chunk 3】
...
Question: {User_Query}
The model is instructed to reference only the provided content and to avoid fabrication.
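For illustration, a prompt like the one above could be assembled as follows; `build_rag_prompt` is a hypothetical helper, not part of Pop’s API.

```python
def build_rag_prompt(query: str, chunks: list[str]) -> str:
    # Mirrors the template above: instructions, numbered evidence, question.
    header = (
        "You are a professional assistant. Answer the user's question "
        "based on the documents below.\n"
        "If the documents do not contain the answer, clearly respond: "
        "“No relevant information found in the documents.”\n\n"
    )
    evidence = "\n\n".join(
        f"【Chunk {i}】\n{text}" for i, text in enumerate(chunks, start=1)
    )
    return f"{header}{evidence}\n\nQuestion: {query}"

print(build_rag_prompt(
    "What is Pop's knowledge-base storage structure?",
    ["Documents are split into chunks...", "Each chunk is indexed twice..."],
))
```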
📌 3. Pop’s RAG Features
1. Traceability
Every answer can be traced to:
- Document source
- Specific paragraph
- Chunk ID
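As a sketch, a traceable evidence record might look like the following; the field names here are hypothetical, not Pop’s actual schema.

```python
from dataclasses import dataclass

@dataclass
class EvidenceChunk:
    chunk_id: int      # stable ID within the knowledge base
    doc_source: str    # originating document (file name or URL)
    paragraph: str     # paragraph/section locator inside the document
    text: str          # the content shown to the LLM

def format_citation(c: EvidenceChunk) -> str:
    return f"[{c.doc_source} · {c.paragraph} · chunk {c.chunk_id}]"

chunk = EvidenceChunk(17, "kb-design.pdf", "§2.1", "Pop stores documents as chunks...")
print(format_citation(chunk))  # [kb-design.pdf · §2.1 · chunk 17]
```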
2. Multi‑chunk merging
Pop automatically:
- Merges semantically related chunks
- Removes redundancy
- Restores missing context
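Below is a minimal sketch of one possible merging strategy, assuming each chunk records its source document and position; Pop’s actual heuristics are not detailed here. Concatenating chunks that sit next to each other in the same document restores context lost at chunk boundaries.

```python
def merge_adjacent(chunks: list[dict]) -> list[dict]:
    """Merge chunks that share a doc_id and have consecutive positions."""
    merged: list[dict] = []
    for c in sorted(chunks, key=lambda c: (c["doc_id"], c["pos"])):
        last = merged[-1] if merged else None
        if last and last["doc_id"] == c["doc_id"] and last["pos"] + 1 == c["pos"]:
            last["text"] += " " + c["text"]   # restore missing context
            last["pos"] = c["pos"]
        else:
            merged.append(dict(c))
    return merged

chunks = [
    {"doc_id": "guide", "pos": 4, "text": "dual indexing uses"},
    {"doc_id": "guide", "pos": 5, "text": "BM25 plus a vector index."},
    {"doc_id": "faq",   "pos": 1, "text": "Unrelated chunk."},
]
print(merge_adjacent(chunks))  # the two "guide" chunks merge into one
```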
3. Enhanced processing of unstructured text
For PDFs, webpages, or complex structures, Pop preserves:
- Heading hierarchy
- Page numbers
- Semantic context
4. Anti‑hallucination rules
Pop’s RAG template includes strict constraints:
- “Do not answer if information is missing”
- “Respond only based on the provided documents”
📊 4. Example of RAG in Action
User question:
What is Pop’s knowledge‑base storage structure?
Retrieved chunks:
- Chunk 17: Overview of knowledge‑base structure
- Chunk 18: Explanation of BM25 and vector indexing
LLM answer:
Pop’s knowledge‑base structure consists of documents, document chunks, and dual‑indexing (BM25 + vector)…
(omitted)
All references are traceable.
💡 5. When Is RAG Useful?
RAG is ideal for:
- Product documentation Q&A
- API reference lookup
- Customer‑service FAQ bots
- Enterprise internal knowledge bases
- Policy, regulation, or legal text interpretation
- Technical documentation and engineering manuals
Essentially, anything that requires answering questions based on real documents.
⚠️ 6. Limitations of RAG
Despite its power, RAG has limitations:
- Poor‑quality documents → poor chunking
- Image‑dominant files → need OCR
- Overly abstract questions → no direct answer
- Weaker LLMs → limited reasoning and synthesis
Hence:
High‑quality source documents = high‑quality RAG answers.
✅ Summary
RAG is Pop’s core capability for knowledge‑base Q&A:
| Stage | Responsibility |
|---|---|
| Retrieval | Find potentially relevant content |
| Re‑ranking | Select the most relevant, useful chunks |
| Generation | Produce grounded answers using retrieved chunks |
RAG enables Pop to deliver answers that are:
- Precise
- Reliable
- Traceable
- Business‑aligned