Database & Knowledge
Build and manage your knowledge repository so AI responds accurately based on your own data.

Table of Contents
· [Introduction to RAG](#introduction-to-rag)
· [Practical Use Cases](#practical-use-cases)
· [Creating a Knowledge Base](#creating-a-knowledge-base)
· [Management & Optimization](#management--optimization)
· [Integrating into Applications](#integrating-into-applications)
· [Retrieval Strategies](#retrieval-strategies)
Introduction to RAG
RAG (Retrieval-Augmented Generation) is the core architecture of the Knowledge system in ClickAI. It works in three stages:
1. Retrieval: When a user asks a question, the system first retrieves the most relevant information from the indexed knowledge.
2. Augmentation: The retrieved information is combined with the user's original query and sent to the LLM as augmented context.
3. Generation: The LLM uses this context to generate a more precise answer.
💡 TIP: RAG lets the AI answer from your up-to-date data instead of relying solely on pre-training knowledge, which reduces hallucinations.
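The three stages can be pictured with a minimal Python sketch. This is an illustration only, not ClickAI's implementation: the sample chunks, the word-overlap scoring, and the `fake_llm` stand-in are all assumptions, and a real system would use an embedding model and a real LLM client.

```python
import re

# Toy knowledge store (made-up sample text standing in for indexed documents).
KNOWLEDGE = [
    "Refunds are processed within 14 days of the return request.",
    "The warranty covers manufacturing defects for 24 months.",
    "Support is available by email on weekdays.",
]

def tokenize(text: str) -> set[str]:
    """Lowercase the text and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def retrieve(query: str, chunks: list[str], top_k: int = 2) -> list[str]:
    """Retrieval: rank chunks by word overlap with the query (toy scoring)."""
    query_words = tokenize(query)
    ranked = sorted(chunks, key=lambda c: len(query_words & tokenize(c)), reverse=True)
    return ranked[:top_k]

def augment(query: str, context: list[str]) -> str:
    """Augmentation: combine the retrieved chunks with the original question."""
    joined = "\n".join(f"- {chunk}" for chunk in context)
    return f"Answer using only this context:\n{joined}\n\nQuestion: {query}"

def generate(prompt: str, llm) -> str:
    """Generation: hand the augmented prompt to any callable LLM client."""
    return llm(prompt)

def fake_llm(prompt: str) -> str:
    """Hypothetical stand-in for a real model call."""
    return f"(model output for a {len(prompt)}-character prompt)"

question = "How long does the warranty last?"
context = retrieve(question, KNOWLEDGE, top_k=1)
prompt = augment(question, context)
answer = generate(prompt, fake_llm)
```

The point of the sketch is the order of operations: the model never sees the whole knowledge store, only the chunks that retrieval selected for this question.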
Practical Use Cases

| Use Case | Description |
| --- | --- |
| Customer Support Bot | Accurate answers from product docs, FAQs, troubleshooting guides |
| Internal Knowledge Portal | AI search & Q&A for employees accessing company policies and procedures |
| Content Generation | Generate reports, articles, emails based on background materials |
| Research & Analysis | Retrieve and summarize from research repositories, market reports |

Creating a Knowledge Base
ClickAI provides 3 ways to create a Knowledge Base:
Method 1: Quick Create (Recommended for beginners)
The fastest way — import data, define processing rules, and ClickAI handles the rest.
Steps:
1. Go to Knowledge in the sidebar, then click "+ Create Knowledge"
2. Name and describe your Knowledge Base
3. Upload documents. Supported formats:

   | Format | Type |
   | --- | --- |
   | .pdf | PDF documents |
   | .docx .doc | Microsoft Word |
   | .txt | Plain text |
   | .md | Markdown |
   | .csv | Data tables |
   | .html | Web pages |
   | .xlsx | Excel spreadsheets |

4. Configure chunking, which controls how documents are split:

   | Method | Description | When to Use |
   | --- | --- | --- |
   | Automatic | ClickAI chooses the split points | Most cases |
   | Custom | You set the separator and chunk size | Specially structured documents |
   | Parent-Child | Large chunks containing smaller chunks | When answers need broader context |

5. Select an Embedding Model, the model that converts text into vectors
6. Choose an Index Method:
   · High Quality: Uses the Embedding Model (more accurate, consumes tokens)
   · Economical: Uses keyword indexing (lower cost, less accurate)
7. Click "Save & Process", then wait for indexing to complete
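To make the Custom and Parent-Child chunking options described above more concrete, here is a small Python sketch. It only illustrates the general idea: the function names, default separators, and size limit are assumptions, and ClickAI's actual splitting rules are not documented here.

```python
def split_custom(text: str, separator: str = "\n\n", max_chars: int = 200) -> list[str]:
    """Custom chunking: split on a separator, then cap each piece at max_chars."""
    chunks = []
    for piece in text.split(separator):
        piece = piece.strip()
        # Slice over-long pieces into fixed-size windows; empty pieces yield nothing.
        for start in range(0, len(piece), max_chars):
            chunks.append(piece[start:start + max_chars])
    return chunks

def split_parent_child(text: str, parent_sep: str = "\n\n", child_sep: str = ". ") -> list[dict]:
    """Parent-Child chunking: small children are matched, the parent supplies context."""
    result = []
    for parent in text.split(parent_sep):
        parent = parent.strip()
        if not parent:
            continue
        children = [c.strip() for c in parent.split(child_sep) if c.strip()]
        result.append({"parent": parent, "children": children})
    return result

sample = "Returns are free. Refunds take 14 days.\n\nWarranty lasts 24 months."
custom_chunks = split_custom(sample, max_chars=20)
parent_child = split_parent_child(sample)
```

In the parent-child layout, a query can match a short, precise child sentence while the full parent paragraph is what gets passed on as context.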
Method 2: Knowledge Pipeline (Advanced)
Orchestrate more complex data processing workflows with custom steps and plugins.
· Suitable for complex data pre-processing needs
· Supports ETL (Extract-Transform-Load) plugins
· Reusable workflows
Method 3: External Knowledge Base
Connect directly to external knowledge bases via APIs.
· Leverage existing data without migration
· Supports vector databases: Pinecone, Weaviate, Qdrant, ...
· Real-time synchronization
⚠️ IMPORTANT: Knowledge Base quality directly determines AI response quality. Invest time in preparing clean, well-structured documents.
Management & Optimization
Content Management

| Action | Description |
| --- | --- |
| View documents | See list of uploaded documents |
| Add documents | Upload new documents |
| Edit chunks | Edit chunk content |
| Enable/Disable | Toggle document or chunk |
| Delete | Remove unneeded documents |
| Re-index | Re-index when settings change |
Test & Validate Retrieval
Test retrieval quality before integrating into applications:
1. Open the Knowledge Base and click "Test Retrieval"
2. Enter a test query that simulates a real user question
3. Click "Run Test"
4. Review the results:
· Matched chunks: The document segments that were retrieved
· Relevance score: How closely each chunk matches the query, from 0 to 1
· Source document: The document each chunk came from
💡 TIP: Test with diverse questions to ensure the Knowledge Base covers all topics. If relevance scores are low (below 0.5), consider improving document content.
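If you note down the relevance scores from several test queries, a short script can flag the topics that need better coverage. The sketch below is an assumption-laden illustration: the queries and scores are invented, and `weak_queries` is a hypothetical helper, not a ClickAI feature.

```python
def weak_queries(test_results: dict[str, list[float]], threshold: float = 0.5) -> list[str]:
    """Return the test queries whose best relevance score is below the threshold."""
    weak = []
    for query, scores in test_results.items():
        best = max(scores, default=0.0)  # a query with no matches counts as 0
        if best < threshold:
            weak.append(query)
    return weak

# Scores copied by hand from hypothetical Test Retrieval runs.
results = {
    "How do I reset my password?": [0.82, 0.64, 0.31],
    "What is the refund window?": [0.44, 0.38],
    "Do you ship to Canada?": [],
}
needs_work = weak_queries(results)
```

Queries that end up in `needs_work` point to gaps in the documents or to content whose wording differs too much from the way users ask.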
Metadata Enhancement
Add metadata to documents for improved retrieval accuracy:
    # Example metadata
    document: "Warranty Policy 2024.pdf"
    metadata:
      category: "policy"
      department: "customer-service"
      effective_date: "2024-01-01"
      language: "en"
Metadata enables filter-based search, e.g., "Only search within category=policy documents."
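Filter-based search can be sketched as a simple pre-filter that runs before any ranking. The example below is a minimal illustration under assumptions: the `filter_by_metadata` helper and the second document ("Setup Guide.pdf") are hypothetical, and ClickAI's own filter syntax may differ.

```python
def filter_by_metadata(documents: list[dict], **conditions: str) -> list[dict]:
    """Keep only documents whose metadata matches every key=value condition."""
    return [
        doc for doc in documents
        if all(doc.get("metadata", {}).get(key) == value for key, value in conditions.items())
    ]

documents = [
    {
        "document": "Warranty Policy 2024.pdf",
        "metadata": {"category": "policy", "department": "customer-service", "language": "en"},
    },
    {
        "document": "Setup Guide.pdf",  # hypothetical second document
        "metadata": {"category": "manual", "department": "customer-service", "language": "en"},
    },
]

# "Only search within category=policy documents."
policy_docs = filter_by_metadata(documents, category="policy")
```

Narrowing the candidate set first means the retrieval step only scores chunks from documents that can actually contain the answer.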
Adjusting Settings
You can change the following settings at any time:
· Index method — Switch between High Quality and Economical
· Embedding model — Change to a different embedding model
· Retrieval strategy — Semantic Search, Full-text Search, or Hybrid

Integrating into Applications
In Chatbot / Agent
1. Open the app in Studio
2. In the Context section, click "+ Add"
3. Select the Knowledge Base to connect
4. Configure Retrieval Settings:

| Setting | Description | Recommended Value |
| --- | --- | --- |
| Top K | Number of chunks returned | 3-5 |
| Score Threshold | Minimum relevance score a chunk needs to be returned | 0.5-0.7 |
| Retrieval Mode | Search method | Hybrid Search |
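The interaction between Top K and Score Threshold can be illustrated with a few lines of Python. This is a sketch under an assumption: it applies the threshold first and then keeps the best `top_k` chunks, which is a common ordering but is not confirmed as ClickAI's behavior. The function name and sample scores are made up.

```python
def select_chunks(
    scored: list[tuple[str, float]],
    top_k: int = 3,
    score_threshold: float = 0.5,
) -> list[tuple[str, float]]:
    """Drop chunks below the threshold, then keep the top_k highest-scoring ones."""
    kept = [item for item in scored if item[1] >= score_threshold]
    kept.sort(key=lambda item: item[1], reverse=True)
    return kept[:top_k]

# (chunk id, relevance score) pairs, invented for illustration.
scored_chunks = [("a", 0.9), ("b", 0.4), ("c", 0.7), ("d", 0.55), ("e", 0.6)]
selected = select_chunks(scored_chunks, top_k=3, score_threshold=0.5)
```

A higher threshold trades recall for precision, and a larger Top K gives the LLM more context at the cost of a longer prompt.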
In Workflow / Chatflow
Use the "Knowledge Retrieval" node:
1. Drag the Knowledge Retrieval node onto the canvas
2. Select a Knowledge Base
3. Configure the query variable (typically {{sys.query}})
4. Connect the output to an LLM node as context
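Conceptually, the node wiring above amounts to filling a prompt template with the user's query and the retrieved chunks. The Python sketch below mimics that data flow outside of ClickAI. Only `{{sys.query}}` comes from this guide. The `{{context}}` placeholder, the template text, and the `render` and `run_flow` helpers are hypothetical.

```python
import re

def render(template: str, variables: dict[str, str]) -> str:
    """Replace {{name}} placeholders with values; unknown names are left untouched."""
    def substitute(match: re.Match) -> str:
        name = match.group(1)
        return variables.get(name, match.group(0))
    return re.sub(r"\{\{\s*([\w.]+)\s*\}\}", substitute, template)

# Hypothetical LLM node prompt; {{context}} is an assumed placeholder name.
LLM_PROMPT = "Context:\n{{context}}\n\nQuestion: {{sys.query}}"

def run_flow(user_query: str, retrieve) -> str:
    """Mimic the flow: Knowledge Retrieval output becomes the LLM node's context."""
    chunks = retrieve(user_query)  # stands in for the Knowledge Retrieval node
    return render(LLM_PROMPT, {"sys.query": user_query, "context": "\n".join(chunks)})

llm_input = run_flow("What is the warranty period?", lambda q: ["Chunk one.", "Chunk two."])
```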

Retrieval Strategies
Semantic Search
Searches based on contextual meaning using embedding vectors.
· ✅ Understands synonymous questions
· ❌ May miss exact keyword matches
Full-text Search
Searches based on exact keyword matching.
· ✅ Precise with specific terminology
· ❌ Doesn't understand context
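The trade-off between the two strategies above shows up clearly in a toy comparison. In this sketch the three-dimensional "embeddings" are hand-made numbers chosen for illustration. A real embedding model produces much longer vectors, and the scoring functions here are simplified assumptions.

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Semantic similarity: cosine of the angle between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def keyword_score(query: str, text: str) -> float:
    """Full-text similarity: fraction of query words that appear verbatim in the text."""
    query_words = set(query.lower().split())
    text_words = set(text.lower().split())
    return len(query_words & text_words) / len(query_words) if query_words else 0.0

# Hand-made vectors: the first two texts are placed close together on purpose.
EMBEDDINGS = {
    "get my money back": [0.9, 0.1, 0.0],
    "refund policy for returned items": [0.8, 0.2, 0.1],
    "shipping times by region": [0.0, 0.2, 0.9],
}

query = "get my money back"
refund = "refund policy for returned items"
shipping = "shipping times by region"

semantic_refund = cosine(EMBEDDINGS[query], EMBEDDINGS[refund])
semantic_shipping = cosine(EMBEDDINGS[query], EMBEDDINGS[shipping])
keyword_refund = keyword_score(query, refund)
```

The query and the refund text share no words, so keyword matching scores them at zero, while their vectors are nearly parallel. The reverse failure also exists: an exact product code may match perfectly by keyword yet sit far from the query in embedding space.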
Hybrid Search (Recommended)
Combines Semantic Search and Full-text Search, then uses Rerank to sort the merged results.
· ✅ Leverages advantages of both methods
· ✅ Generally the most accurate of the three strategies
📝 NOTE: **Rerank** is an additional processing step that re-sorts retrieval results by relevance. Enable Rerank when using Hybrid Search for the best results.
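To show how two result lists can be merged into one ordering, the sketch below uses reciprocal rank fusion (RRF), a widely used fusion technique. This is a swapped-in illustration, not ClickAI's Rerank: this guide does not describe how Rerank works internally, and it may rely on a dedicated rerank model instead. The chunk ids and the constant `k = 60` are assumptions.

```python
def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Merge ranked lists: each item earns 1 / (k + rank) for every list it appears in."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, item in enumerate(ranking, start=1):
            scores[item] = scores.get(item, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=lambda item: scores[item], reverse=True)

# Hypothetical result lists from the two search methods, best match first.
semantic_results = ["chunk-a", "chunk-b", "chunk-c"]
full_text_results = ["chunk-c", "chunk-a", "chunk-d"]

merged = reciprocal_rank_fusion([semantic_results, full_text_results])
```

Chunks that both methods rank highly rise to the top of the merged list, while chunks found by only one method are kept but placed lower.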
📖 Previous: [Monitor](./03-monitor-en.md) · Next: [Workspace]