
# Database & Knowledge

<figure><img src="/files/kdHp2zmPAUKe4GEJPMO9" alt=""><figcaption></figcaption></figure>

## Table of Contents

- [Introduction to RAG](#introduction-to-rag)
- [Practical Use Cases](#practical-use-cases)
- [Creating a Knowledge Base](#creating-a-knowledge-base)
- [Management & Optimization](#management--optimization)
- [Integrating into Applications](#integrating-into-applications)
- [Retrieval Strategies](#retrieval-strategies)

## Introduction to RAG

RAG (Retrieval-Augmented Generation) is the core architecture of the Knowledge system in ClickAI. It works in three steps:

1. **Retrieval**: When a user asks a question, the system first retrieves the most relevant passages from the Knowledge Base.
2. **Augmentation**: The retrieved passages are combined with the user's original query and sent to the LLM as additional context.
3. **Generation**: The LLM uses this context to generate a more accurate, grounded answer.

💡 TIP: RAG enables AI to answer based on your up-to-date data rather than relying solely on pre-training knowledge, significantly reducing hallucination.
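
The three steps can be sketched in a few lines of Python. This is a toy illustration, not ClickAI's implementation: the `KNOWLEDGE` list, the word-overlap scoring, and the `fake_llm` stub are all assumptions made so the example runs on its own.

```python
from collections import Counter

# Toy stand-in for indexed Knowledge Base chunks (illustrative data only).
KNOWLEDGE = [
    "The warranty covers manufacturing defects for 12 months.",
    "Refunds are processed within 5 business days.",
    "Support is available by email on weekdays.",
]

def tokenize(text: str) -> list[str]:
    """Lowercase words with surrounding punctuation removed."""
    return [w.strip(".,?!").lower() for w in text.split()]

def retrieve(query: str, top_k: int = 2) -> list[str]:
    """Retrieval: rank chunks by how many query words they share."""
    query_words = Counter(tokenize(query))
    scored = []
    for chunk in KNOWLEDGE:
        overlap = sum((query_words & Counter(tokenize(chunk))).values())
        scored.append((overlap, chunk))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [chunk for score, chunk in scored[:top_k] if score > 0]

def augment(query: str, chunks: list[str]) -> str:
    """Augmentation: combine retrieved context with the original question."""
    context = "\n".join(f"- {c}" for c in chunks)
    return f"Context:\n{context}\n\nQuestion: {query}"

def fake_llm(prompt: str) -> str:
    """Hypothetical stub standing in for a real model call."""
    return f"(stub answer, prompt length {len(prompt)})"

def generate(prompt: str, llm=fake_llm) -> str:
    """Generation: hand the augmented prompt to the model."""
    return llm(prompt)

question = "How long is the warranty?"
chunks = retrieve(question)
prompt = augment(question, chunks)
print(generate(prompt))
```

A real deployment would replace the word-overlap scoring with the Knowledge Base's index and `fake_llm` with an actual model call; the shape of the pipeline stays the same.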


## Practical Use Cases

<table data-header-hidden><thead><tr><th valign="top"></th><th valign="top"></th></tr></thead><tbody><tr><td valign="top">Use Case</td><td valign="top">Description</td></tr><tr><td valign="top">Customer Support Bot</td><td valign="top">Accurate answers from product docs, FAQs, troubleshooting guides</td></tr><tr><td valign="top">Internal Knowledge Portal</td><td valign="top">AI search &#x26; Q&#x26;A for employees accessing company policies and procedures</td></tr><tr><td valign="top">Content Generation</td><td valign="top">Generate reports, articles, emails based on background materials</td></tr><tr><td valign="top">Research &#x26; Analysis</td><td valign="top">Retrieve and summarize from research repositories, market reports</td></tr></tbody></table>


<figure><img src="/files/nBVTGNLMeOaUxoBrF7tn" alt=""><figcaption></figcaption></figure>

## Creating a Knowledge Base

ClickAI provides 3 ways to create a Knowledge Base:

### Method 1: Quick Create (Recommended for beginners)

The fastest way — import data, define processing rules, and ClickAI handles the rest.

Steps:

1. Go to Knowledge in the sidebar and click "+ Create Knowledge".
2. Name and describe your Knowledge Base.
3. Upload documents. Supported formats:

<table data-header-hidden><thead><tr><th valign="top"></th><th valign="top"></th></tr></thead><tbody><tr><td valign="top">Format</td><td valign="top">Type</td></tr><tr><td valign="top">.pdf</td><td valign="top">PDF documents</td></tr><tr><td valign="top">.docx .doc</td><td valign="top">Microsoft Word</td></tr><tr><td valign="top">.txt</td><td valign="top">Plain text</td></tr><tr><td valign="top">.md</td><td valign="top">Markdown</td></tr><tr><td valign="top">.csv</td><td valign="top">Data tables</td></tr><tr><td valign="top">.html</td><td valign="top">Web pages</td></tr><tr><td valign="top">.xlsx</td><td valign="top">Excel spreadsheets</td></tr></tbody></table>


4. Configure chunking, which controls how documents are split:

<table data-header-hidden><thead><tr><th valign="top"></th><th valign="top"></th><th valign="top"></th></tr></thead><tbody><tr><td valign="top">Method</td><td valign="top">Description</td><td valign="top">When to Use</td></tr><tr><td valign="top">Automatic</td><td valign="top">ClickAI chooses the split points automatically</td><td valign="top">Most cases</td></tr><tr><td valign="top">Custom</td><td valign="top">You set the separator and chunk size</td><td valign="top">Documents with a special structure</td></tr><tr><td valign="top">Parent-Child</td><td valign="top">Large parent chunks that contain smaller child chunks</td><td valign="top">When answers need broader context</td></tr></tbody></table>


5. Select an Embedding Model, the model that converts text into vectors.
6. Choose an Index Method:

   - High Quality: uses the Embedding Model (more accurate, consumes tokens)
   - Economical: uses keyword indexing (lower cost, less accurate)

7. Click "Save & Process" and wait for indexing to complete.
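
To make the chunking options in the steps above more concrete, here is a minimal sketch of Custom and Parent-Child splitting. The function names, default separators, and size limits are assumptions chosen for illustration; ClickAI's actual splitter is not documented here and may behave differently.

```python
def split_custom(text: str, separator: str = "\n\n", max_chars: int = 200) -> list[str]:
    """Custom chunking: split on a separator, then cap each piece at max_chars."""
    chunks = []
    for piece in text.split(separator):
        piece = piece.strip()
        # Empty pieces produce an empty range and are skipped.
        for start in range(0, len(piece), max_chars):
            chunks.append(piece[start:start + max_chars])
    return chunks

def split_parent_child(text: str, parent_sep: str = "\n\n", child_sep: str = ". ") -> list[dict]:
    """Parent-Child chunking: each large parent chunk keeps its smaller children."""
    result = []
    for parent in split_custom(text, parent_sep, max_chars=10_000):
        children = [c.strip() for c in parent.split(child_sep) if c.strip()]
        result.append({"parent": parent, "children": children})
    return result

doc = "Alpha one. Alpha two.\n\nBeta one. Beta two."
print(split_custom(doc))
print(split_parent_child(doc)[0])
```

The idea behind the parent-child layout is that the small child chunks are what get matched against a query, while the larger parent supplies the surrounding context passed to the LLM.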

### Method 2: Knowledge Pipeline (Advanced)

Orchestrate more complex data processing workflows with custom steps and plugins.

- Suitable for complex data pre-processing needs
- Supports ETL (Extract-Transform-Load) plugins
- Workflows are reusable

### Method 3: External Knowledge Base

Connect directly to external knowledge bases via APIs.

- Leverage existing data without migration
- Supports vector databases such as Pinecone, Weaviate, and Qdrant
- Real-time synchronization

⚠️ IMPORTANT: Knowledge Base quality directly determines AI response quality. Invest time in preparing clean, well-structured documents.


## Management & Optimization

### Content Management

<table data-header-hidden><thead><tr><th valign="top"></th><th valign="top"></th></tr></thead><tbody><tr><td valign="top">Action</td><td valign="top">Description</td></tr><tr><td valign="top">View documents</td><td valign="top">See list of uploaded documents</td></tr><tr><td valign="top">Add documents</td><td valign="top">Upload new documents</td></tr><tr><td valign="top">Edit chunks</td><td valign="top">Edit chunk content</td></tr><tr><td valign="top">Enable/Disable</td><td valign="top">Toggle document or chunk</td></tr><tr><td valign="top">Delete</td><td valign="top">Remove unneeded documents</td></tr><tr><td valign="top">Re-index</td><td valign="top">Re-index when settings change</td></tr></tbody></table>


### Test & Validate Retrieval

Test retrieval quality before integrating into applications:

1. Open the Knowledge Base and click "Test Retrieval".
2. Enter a test query that simulates a user question.
3. Click "Run Test".
4. Review the results:

   - Matched chunks: the retrieved document segments
   - Relevance score: a rating from 0 to 1
   - Source document: the document each chunk came from

💡 TIP: Test with diverse questions to ensure the Knowledge Base covers all topics. If relevance scores are low (below 0.5), consider improving document content.
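
When testing many questions, it can help to record the scores and flag the weak ones in bulk. The sketch below assumes you have copied relevance scores by hand from "Test Retrieval" runs into a dictionary; the sample queries and scores are invented, and `find_weak_queries` is a hypothetical helper, not a ClickAI API.

```python
LOW_SCORE = 0.5  # threshold suggested in the tip above

def find_weak_queries(results: dict[str, list[float]], threshold: float = LOW_SCORE) -> list[str]:
    """Return queries whose best relevance score falls below the threshold."""
    weak = []
    for query, scores in results.items():
        best = max(scores, default=0.0)  # no matches counts as a score of 0
        if best < threshold:
            weak.append(query)
    return weak

# Hypothetical scores, one list of matched-chunk scores per test query.
sample = {
    "How do I reset my password?": [0.82, 0.64],
    "What is the refund window?": [0.41, 0.33],
    "Do you ship overseas?": [],
}

for query in find_weak_queries(sample):
    print("Needs better coverage:", query)
```

Queries that end up in this list point to topics where the documents are missing, thin, or worded very differently from how users ask.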

### Metadata Enhancement

Add metadata to documents for improved retrieval accuracy:

    # Example Metadata
    document: "Warranty Policy 2024.pdf"
    metadata:
      category: "policy"
      department: "customer-service"
      effective_date: "2024-01-01"
      language: "en"

Metadata enables filter-based search, e.g., "Only search within category=policy documents."
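
A filter-based search of this kind can be pictured as a simple pre-filter applied before retrieval. The sketch below is an assumption about the general idea, not ClickAI's filter syntax; `filter_by_metadata` and the second sample document are hypothetical.

```python
documents = [
    {
        "name": "Warranty Policy 2024.pdf",
        "metadata": {"category": "policy", "department": "customer-service", "language": "en"},
    },
    {
        # Hypothetical second document, added only to show the filter excluding something.
        "name": "Onboarding Guide.docx",
        "metadata": {"category": "guide", "department": "hr", "language": "en"},
    },
]

def filter_by_metadata(docs: list[dict], **conditions: str) -> list[dict]:
    """Keep only documents whose metadata matches every given key/value pair."""
    return [
        doc for doc in docs
        if all(doc["metadata"].get(key) == value for key, value in conditions.items())
    ]

policy_docs = filter_by_metadata(documents, category="policy")
print([doc["name"] for doc in policy_docs])
```

Only the documents that pass the filter would then be searched, which narrows the candidate set before any relevance scoring happens.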

### Adjusting Settings

You can change the following settings at any time:

- Index method: switch between High Quality and Economical
- Embedding model: change to a different embedding model
- Retrieval strategy: Semantic Search, Full-text Search, or Hybrid Search


<figure><img src="/files/s8WMfcvekEo3n3EciUwN" alt=""><figcaption></figcaption></figure>

## Integrating into Applications

### In Chatbot / Agent

1. Open the app in Studio.
2. In the Context section, click "+ Add".
3. Select the Knowledge Base to connect.
4. Configure the Retrieval Settings:

<table data-header-hidden><thead><tr><th valign="top"></th><th valign="top"></th><th valign="top"></th></tr></thead><tbody><tr><td valign="top">Setting</td><td valign="top">Description</td><td valign="top">Recommended Value</td></tr><tr><td valign="top">Top K</td><td valign="top">Number of chunks returned</td><td valign="top">3-5</td></tr><tr><td valign="top">Score Threshold</td><td valign="top">Minimum score threshold</td><td valign="top">0.5-0.7</td></tr><tr><td valign="top">Retrieval Mode</td><td valign="top">Search method</td><td valign="top">Hybrid Search</td></tr></tbody></table>
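
The effect of Top K and Score Threshold can be illustrated with a short sketch. It assumes the threshold is inclusive and is applied before the Top K cut; the source does not state either detail, and `select_chunks` and the sample scores are hypothetical.

```python
def select_chunks(scored: list[tuple[str, float]], top_k: int = 3,
                  score_threshold: float = 0.5) -> list[tuple[str, float]]:
    """Drop chunks below the threshold, then keep the top_k highest-scoring ones."""
    kept = [(text, score) for text, score in scored if score >= score_threshold]
    kept.sort(key=lambda pair: pair[1], reverse=True)
    return kept[:top_k]

# Hypothetical (chunk, relevance score) pairs returned by a retrieval step.
candidates = [
    ("chunk a", 0.90),
    ("chunk b", 0.40),
    ("chunk c", 0.75),
    ("chunk d", 0.60),
    ("chunk e", 0.55),
]

print(select_chunks(candidates, top_k=3, score_threshold=0.5))
```

Raising the threshold trades recall for precision, while raising Top K sends more context to the LLM at the cost of more tokens.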


### In Workflow / Chatflow

Use the "Knowledge Retrieval" node:

1. Drag the Knowledge Retrieval node onto the canvas.
2. Select a Knowledge Base.
3. Configure the query variable (typically `{{sys.query}}`).
4. Connect the output to an LLM node as context.


<figure><img src="/files/NvUt12GtRUaUMDvnhJQ5" alt=""><figcaption></figcaption></figure>

## Retrieval Strategies

### Semantic Search

Searches based on contextual meaning using embedding vectors.

- ✅ Understands synonymous questions
- ❌ May miss exact keyword matches
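
As a rough picture of how vector-based matching works, the sketch below ranks chunks by cosine similarity, a common measure for comparing embeddings. The three-dimensional vectors are hand-made toy values, not output from a real embedding model, and the source does not say which similarity measure ClickAI uses.

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors (0.0 if either has zero length)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

# Toy "embeddings" for two chunks (illustrative values only).
chunk_vectors = {
    "Return an item for a refund": [0.9, 0.1, 0.0],
    "Reset your account password": [0.0, 0.2, 0.9],
}

def semantic_search(query_vector: list[float], vectors: dict[str, list[float]]) -> list[str]:
    """Return chunk texts ordered from most to least similar to the query."""
    return sorted(
        vectors,
        key=lambda text: cosine_similarity(query_vector, vectors[text]),
        reverse=True,
    )

# Toy vector standing in for a query such as "How do I get my money back?"
query_vector = [0.8, 0.2, 0.1]
print(semantic_search(query_vector, chunk_vectors))
```

The query shares no keywords with the refund chunk, yet its vector points in nearly the same direction, which is why semantic search can match synonymous questions.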

### Full-text Search

Searches based on exact keyword matching.

- ✅ Precise with specific terminology
- ❌ Doesn't understand context

### Hybrid Search (Recommended)

Combines Semantic Search and Full-text Search, then uses Rerank to sort the merged results.

- ✅ Leverages the advantages of both methods
- ✅ Typically the highest accuracy of the three strategies

📝 NOTE: **Rerank** is an additional processing step that re-sorts retrieval results by relevance. Enable Rerank when using Hybrid Search for the best results.
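
To show how two result lists can be merged into one ordering, the sketch below uses Reciprocal Rank Fusion (RRF), a widely used fusion technique. This is a stand-in chosen for illustration: the source does not say how ClickAI merges results, and its Rerank step may rely on a dedicated rerank model instead. The chunk IDs and the constant `k=60` are assumptions.

```python
def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Merge several ranked lists; items ranked high in any list score higher."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, chunk_id in enumerate(ranking, start=1):
            scores[chunk_id] = scores.get(chunk_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=lambda chunk_id: scores[chunk_id], reverse=True)

# Hypothetical result lists from the two search methods, best match first.
semantic_ranking = ["chunk-a", "chunk-b", "chunk-c"]
keyword_ranking = ["chunk-c", "chunk-a", "chunk-d"]

print(reciprocal_rank_fusion([semantic_ranking, keyword_ranking]))
```

Chunks that appear in both lists rise to the top, which reflects the general motivation for Hybrid Search: a result supported by both meaning and exact keywords is more likely to be relevant.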



