# Database & Knowledge

## Table of Contents

- [Introduction to RAG](#introduction-to-rag)
- [Practical Use Cases](#practical-use-cases)
- [Creating a Knowledge Base](#creating-a-knowledge-base)
- [Management & Optimization](#management--optimization)
- [Integrating into Applications](#integrating-into-applications)
- [Retrieval Strategies](#retrieval-strategies)

## Introduction to RAG

RAG (Retrieval-Augmented Generation) is the core architecture of the Knowledge system in ClickAI:

1. Retrieval: When a user asks a question, the system first retrieves the most relevant information from the incorporated knowledge.
2. Augmented: This retrieved information is then combined with the user's original query and sent to the LLM as augmented context.
3. Generation: The LLM uses this context to generate a more precise answer.

💡 TIP: RAG enables AI to answer based on your up-to-date data rather than relying solely on pre-training knowledge, significantly reducing hallucination.

## Practical Use Cases


| Use Case | Description |
| --- | --- |
| Customer Support Bot | Accurate answers from product docs, FAQs, troubleshooting guides |
| Internal Knowledge Portal | AI search & Q&A for employees accessing company policies and procedures |
| Content Generation | Generate reports, articles, emails based on background materials |
| Research & Analysis | Retrieve and summarize from research repositories, market reports |
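
For a use case such as the Customer Support Bot, the Retrieval, Augmented, Generation flow from the introduction can be pictured with a toy example. This is a minimal sketch, not ClickAI's implementation: the word-overlap scoring, the function names (`retrieve`, `augment`, `generate`), the sample knowledge, and the stubbed LLM call are all assumptions made for illustration.

```python
import re


def tokenize(text: str) -> set[str]:
    """Lowercase a string and split it into a set of word tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(query: str, chunks: list[str], top_k: int = 2) -> list[str]:
    """Retrieval: rank chunks by shared words with the query (toy scoring)."""
    query_words = tokenize(query)
    scored = [(len(query_words & tokenize(chunk)), chunk) for chunk in chunks]
    ranked = sorted(scored, key=lambda pair: pair[0], reverse=True)
    return [chunk for score, chunk in ranked[:top_k] if score > 0]


def augment(query: str, context_chunks: list[str]) -> str:
    """Augmented: combine the retrieved chunks with the original question."""
    context = "\n".join(f"- {chunk}" for chunk in context_chunks)
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"


def generate(prompt: str) -> str:
    """Generation: hypothetical stub standing in for a real LLM call."""
    return f"[LLM answer based on a prompt of {len(prompt)} characters]"


# Illustrative sample knowledge, not taken from any real Knowledge Base.
knowledge = [
    "The warranty period for all laptops is 24 months.",
    "Returns are accepted within 30 days of purchase.",
    "Our office is closed on public holidays.",
]
question = "How long is the laptop warranty?"
retrieved = retrieve(question, knowledge)
prompt = augment(question, retrieved)
answer = generate(prompt)
```

A real system would replace the word-overlap scoring with embedding or keyword indexes, but the three stages keep the same shape.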

## Creating a Knowledge Base

ClickAI provides 3 ways to create a Knowledge Base:

### Method 1: Quick Create (Recommended for beginners)

The fastest way: import data, define processing rules, and ClickAI handles the rest.

Steps:

1. Go to Knowledge from sidebar → Click "+ Create Knowledge"
2. Name and describe your Knowledge Base
3. Upload documents. Supported formats:

   | Format | Type |
   | --- | --- |
   | .pdf | PDF documents |
   | .docx .doc | Microsoft Word |
   | .txt | Plain text |
   | .md | Markdown |
   | .csv | Data tables |
   | .html | Web pages |
   | .xlsx | Excel spreadsheets |

4. Configure chunking (how to split documents):

   | Method | Description | When to Use |
   | --- | --- | --- |
   | Automatic | ClickAI optimally splits | Most cases |
   | Custom | Custom separator, chunk size | Specially structured documents |
   | Parent-Child | Large chunks containing smaller chunks | Need broader context |

5. Select Embedding Model: the model to convert text into vectors
6. Choose Index Method:
   - High Quality: Uses Embedding Model (more accurate, uses tokens)
   - Economical: Uses keyword indexing (saves costs, less accurate)
7. Click "Save & Process" → Wait for indexing to complete

### Method 2: Knowledge Pipeline (Advanced)

Orchestrate more complex data processing workflows with custom steps and plugins.

- Suitable for complex data pre-processing needs
- Supports ETL (Extract-Transform-Load) plugins
- Reusable workflows

### Method 3: External Knowledge Base

Connect directly to external knowledge bases via APIs.

- Leverage existing data without migration
- Supports vector databases: Pinecone, Weaviate, Qdrant, ...
- Real-time synchronization

⚠️ IMPORTANT: Knowledge Base quality directly determines AI response quality. Invest time in preparing clean, well-structured documents.

## Management & Optimization

### Content Management


| Action | Description |
| --- | --- |
| View documents | See list of uploaded documents |
| Add documents | Upload new documents |
| Edit chunks | Edit chunk content |
| Enable/Disable | Toggle document or chunk |
| Delete | Remove unneeded documents |
| Re-index | Re-index when settings change |
Re-index	Re-index when settings change
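
When editing chunks or re-indexing, it can help to picture what the Custom and Parent-Child chunking methods produce. The sketch below is only an illustration under stated assumptions: the function names, the default separator, the character-based size limit, and the fixed-size child slices are hypothetical and do not describe how ClickAI splits documents internally.

```python
def split_custom(text: str, separator: str = "\n\n", max_chars: int = 200) -> list[str]:
    """Custom chunking: split on a separator, then merge pieces up to max_chars."""
    chunks: list[str] = []
    current = ""
    for piece in (part.strip() for part in text.split(separator)):
        if not piece:
            continue
        candidate = f"{current}{separator}{piece}" if current else piece
        if len(candidate) <= max_chars:
            current = candidate
        else:
            if current:
                chunks.append(current)
            current = piece  # an oversized piece is kept whole in this sketch
    if current:
        chunks.append(current)
    return chunks


def split_parent_child(text: str, child_chars: int = 20) -> list[dict]:
    """Parent-Child chunking: each paragraph is a parent holding smaller slices."""
    parents = [part.strip() for part in text.split("\n\n") if part.strip()]
    return [
        {
            "parent": parent,
            "children": [
                parent[start:start + child_chars]
                for start in range(0, len(parent), child_chars)
            ],
        }
        for parent in parents
    ]


sample = (
    "Alpha section.\n\nBeta section.\n\n"
    "Gamma section is a bit longer than the others."
)
custom_chunks = split_custom(sample, max_chars=30)
parent_child_chunks = split_parent_child(sample, child_chars=20)
```

In this sketch the small children would be the units that get matched, while the parent supplies the broader context around a match; that pairing is an assumption about how such a scheme is typically used.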

### Test & Validate Retrieval

Test retrieval quality before integrating into applications:

1. Open Knowledge Base → Click "Test Retrieval"
2. Enter a test query (simulating user questions)
3. Click "Run Test"
4. Review results:
   - Matched chunks: Retrieved document segments
   - Relevance score: Relevance rating (0-1)
   - Source document: Source document reference

💡 TIP: Test with diverse questions to ensure the Knowledge Base covers all topics. If relevance scores are low (below 0.5), consider improving document content.

### Metadata Enhancement

Add metadata to documents for improved retrieval accuracy:

    # Example Metadata
    document: "Warranty Policy 2024.pdf"
    metadata:
      category: "policy"
      department: "customer-service"
      effective_date: "2024-01-01"
      language: "en"

Metadata enables filter-based search, e.g., "Only search within category=policy documents."

### Adjusting Settings

You can change the following settings at any time:

- Index method: Switch between High Quality and Economical
- Embedding model: Change to a different embedding model
- Retrieval strategy: Semantic Search, Full-text Search, or Hybrid
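
The filter-based search described under Metadata Enhancement can be pictured as a simple key/value match applied before retrieval. This is a minimal sketch under assumptions: the `Document` class, the `filter_by_metadata` helper, and the two extra sample documents are hypothetical and are not part of ClickAI's API.

```python
from dataclasses import dataclass, field


@dataclass
class Document:
    """Hypothetical stand-in for an uploaded document and its metadata."""
    name: str
    metadata: dict[str, str] = field(default_factory=dict)


def filter_by_metadata(documents: list[Document], **conditions: str) -> list[Document]:
    """Keep only documents whose metadata matches every key=value condition."""
    return [
        doc
        for doc in documents
        if all(doc.metadata.get(key) == value for key, value in conditions.items())
    ]


# The first entry mirrors the example metadata; the others are invented samples.
documents = [
    Document(
        "Warranty Policy 2024.pdf",
        {"category": "policy", "department": "customer-service", "language": "en"},
    ),
    Document("Onboarding Guide.pdf", {"category": "guide", "department": "hr"}),
    Document("Refund Policy.pdf", {"category": "policy", "department": "finance"}),
]

# "Only search within category=policy documents."
policy_docs = filter_by_metadata(documents, category="policy")
finance_policy_docs = filter_by_metadata(
    documents, category="policy", department="finance"
)
```

Narrowing the candidate set this way means the retrieval step only scores chunks from documents that already satisfy the filter.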

## Integrating into Applications

### In Chatbot / Agent

1. Open the app in Studio
2. In the Context section, click "+ Add"
3. Select the Knowledge Base to connect
4. Configure Retrieval Settings:


| Setting | Description | Recommended Value |
| --- | --- | --- |
| Top K | Number of chunks returned | 3-5 |
| Score Threshold | Minimum score threshold | 0.5-0.7 |
| Retrieval Mode | Search method | Hybrid Search |
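
To see how Top K and Score Threshold interact, the sketch below filters a list of scored chunks and keeps the best few. It is an illustration only: the `select_chunks` helper, the sample scores, and the choice to apply the threshold before the Top K cut are assumptions, not a description of ClickAI's internals.

```python
def select_chunks(
    scored_chunks: list[tuple[str, float]],
    top_k: int = 3,
    score_threshold: float = 0.5,
) -> list[tuple[str, float]]:
    """Drop chunks below the threshold, then keep the top_k highest scores."""
    kept = [(text, score) for text, score in scored_chunks if score >= score_threshold]
    kept.sort(key=lambda pair: pair[1], reverse=True)
    return kept[:top_k]


# Invented retrieval results as (chunk text, relevance score) pairs.
results = [
    ("Warranty covers 24 months.", 0.91),
    ("Office hours are 9 to 5.", 0.32),
    ("Warranty excludes water damage.", 0.74),
    ("Returns within 30 days.", 0.55),
    ("Contact support by email.", 0.51),
]
selected = select_chunks(results, top_k=3, score_threshold=0.5)
```

Raising the threshold trades recall for precision: fewer chunks reach the LLM, but the ones that do are more likely to be relevant.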

### In Workflow / Chatflow

Use the "Knowledge Retrieval" node:

1. Drag the Knowledge Retrieval node onto the canvas
2. Select a Knowledge Base
3. Configure the query variable (typically `{{sys.query}}`)
4. Connect the output to an LLM node as context
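
Conceptually, the steps above substitute the query variable and the retrieved chunks into the prompt that the LLM node receives. The sketch below mimics that substitution; the `render_template` and `format_context` helpers, the `{{context}}` placeholder, and the chunk dictionary shape are hypothetical, and only `{{sys.query}}` comes from the steps above.

```python
import re


def render_template(template: str, variables: dict[str, str]) -> str:
    """Replace {{name}} placeholders with values; unknown names stay untouched."""
    def substitute(match: re.Match) -> str:
        name = match.group(1)
        return variables.get(name, match.group(0))

    return re.sub(r"\{\{\s*([\w.]+)\s*\}\}", substitute, template)


def format_context(chunks: list[dict]) -> str:
    """Join retrieved chunks into one context string, citing each source."""
    return "\n".join(f"[{chunk['source']}] {chunk['content']}" for chunk in chunks)


# Invented output of a retrieval step, for illustration only.
retrieved = [
    {"source": "Warranty Policy 2024.pdf", "content": "Laptops are covered for 24 months."},
]
llm_prompt = render_template(
    "Context:\n{{context}}\n\nQuestion: {{sys.query}}",
    {"sys.query": "How long is the warranty?", "context": format_context(retrieved)},
)
```

Keeping the source name next to each chunk makes it easier for the LLM node to cite where an answer came from.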

## Retrieval Strategies

### Semantic Search

Searches based on contextual meaning using embedding vectors.

- ✅ Understands synonymous questions
- ❌ May miss exact keyword matches

### Full-text Search

Searches based on exact keyword matching.

- ✅ Precise with specific terminology
- ❌ Doesn't understand context

### Hybrid Search (Recommended)

Combines both Semantic + Full-text, using Rerank to sort results.

- ✅ Leverages advantages of both methods
- ✅ Highest accuracy

📝 NOTE: **Rerank** is an additional processing step that re-sorts retrieval results by relevance. Enable Rerank when using Hybrid Search for the best results.
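
One way to picture how Hybrid Search merges two result lists is reciprocal rank fusion (RRF). The sketch below uses RRF as a simple stand-in for the merge-and-sort step; it is a swapped-in technique chosen for illustration, not ClickAI's Rerank model, and the function name, chunk IDs, and constant `k = 60` are assumptions.

```python
def reciprocal_rank_fusion(
    rankings: list[list[str]], k: int = 60
) -> list[tuple[str, float]]:
    """Fuse ranked lists: each item scores the sum of 1 / (k + rank) across lists."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, chunk_id in enumerate(ranking, start=1):
            scores[chunk_id] = scores.get(chunk_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Invented rankings from the two search methods, best match first.
semantic_ranking = ["chunk-a", "chunk-b", "chunk-c"]
fulltext_ranking = ["chunk-c", "chunk-a", "chunk-d"]

fused = reciprocal_rank_fusion([semantic_ranking, fulltext_ranking])
fused_ids = [chunk_id for chunk_id, _ in fused]
```

Chunks that appear in both lists rise to the top, which reflects the idea that hybrid retrieval benefits from the strengths of both methods.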