Database & Knowledge

Build and manage your knowledge repository so that the AI responds accurately based on your own data.

Table of Contents

· [Introduction to RAG](#introduction-to-rag)

· [Practical Use Cases](#practical-use-cases)

· [Creating a Knowledge Base](#creating-a-knowledge-base)

· [Management & Optimization](#management--optimization)

· [Integrating into Applications](#integrating-into-applications)

· [Retrieval Strategies](#retrieval-strategies)

Introduction to RAG

RAG (Retrieval-Augmented Generation) is the core architecture of the Knowledge system in ClickAI:

1. Retrieval: When a user asks a question, the system first retrieves the most relevant information from the indexed knowledge.

2. Augmentation: The retrieved information is combined with the user's original query and sent to the LLM as additional context.

3. Generation: The LLM uses this context to generate a more precise answer.
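The three steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the in-memory `KNOWLEDGE` list, the word-overlap scoring, and the `generate` placeholder are hypothetical stand-ins, not how ClickAI implements retrieval or calls an LLM.

```python
import re
from collections import Counter

# Hypothetical in-memory "knowledge base": a list of text chunks.
KNOWLEDGE = [
    "The warranty covers manufacturing defects for 24 months.",
    "Refunds are processed within 5 business days.",
    "Support is available by email from Monday to Friday.",
]

def tokenize(text: str) -> Counter:
    """Lowercase the text and count its words."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def retrieve(query: str, chunks: list[str], top_k: int = 1) -> list[str]:
    """Step 1 (Retrieval): rank chunks by word overlap with the query.

    Word overlap is a toy stand-in for a real embedding or keyword index.
    """
    q = tokenize(query)
    scored = [(sum((q & tokenize(c)).values()), c) for c in chunks]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [c for score, c in scored[:top_k] if score > 0]

def augment(query: str, context: list[str]) -> str:
    """Step 2 (Augmentation): combine retrieved chunks with the query."""
    joined = "\n".join(f"- {c}" for c in context)
    return f"Context:\n{joined}\n\nQuestion: {query}"

def generate(prompt: str) -> str:
    """Step 3 (Generation): placeholder where an LLM call would go."""
    return f"[LLM answer based on a prompt of {len(prompt)} characters]"

query = "How long is the warranty?"
prompt = augment(query, retrieve(query, KNOWLEDGE))
print(prompt)
print(generate(prompt))
```

The point of the sketch is the data flow: the LLM never sees the whole knowledge base, only the query plus the few chunks that retrieval selected.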

💡 TIP: RAG enables AI to answer based on your up-to-date data rather than relying solely on pre-training knowledge, significantly reducing hallucination.

Practical Use Cases

| Use Case | Description |
| --- | --- |
| Customer Support Bot | Accurate answers from product docs, FAQs, and troubleshooting guides |
| Internal Knowledge Portal | AI search and Q&A for employees looking up company policies and procedures |
| Content Generation | Generate reports, articles, and emails from background materials |
| Research & Analysis | Retrieve and summarize content from research repositories and market reports |

Creating a Knowledge Base

ClickAI provides 3 ways to create a Knowledge Base:

Method 1 is the fastest: import your data, define processing rules, and ClickAI handles the rest.

Steps:

1. Go to Knowledge in the sidebar and click "+ Create Knowledge".

2. Name and describe your Knowledge Base.

3. Upload documents. Supported formats:

| Format | Type |
| --- | --- |
| .pdf | PDF documents |
| .docx, .doc | Microsoft Word |
| .txt | Plain text |
| .md | Markdown |
| .csv | Data tables |
| .html | Web pages |
| .xlsx | Excel spreadsheets |

4. Configure chunking, which controls how documents are split:

| Method | Description | When to Use |
| --- | --- | --- |
| Automatic | ClickAI chooses how to split the document | Most cases |
| Custom | You set the separator and chunk size | Specially structured documents |
| Parent-Child | Large chunks that contain smaller chunks | When answers need broader context |

5. Select an Embedding Model, the model that converts text into vectors.

6. Choose an Index Method:

· High Quality: Uses the Embedding Model (more accurate, consumes tokens)

· Economical: Uses keyword indexing (lower cost, less accurate)

7. Click "Save & Process" and wait for indexing to complete.
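To make the Custom and Parent-Child chunking options more concrete, here is a minimal Python sketch. The function names, the default sizes, and the choice of paragraph and line separators are assumptions made for illustration; the source does not describe ClickAI's actual splitting logic.

```python
def split_custom(text: str, separator: str = "\n\n", chunk_size: int = 200) -> list[str]:
    """Split on a separator, then pack pieces into chunks of at most chunk_size characters."""
    chunks: list[str] = []
    current = ""
    for piece in (p.strip() for p in text.split(separator)):
        if not piece:
            continue
        candidate = f"{current}{separator}{piece}" if current else piece
        if len(candidate) <= chunk_size:
            current = candidate
            continue
        if current:
            chunks.append(current)
        # A single piece longer than chunk_size is hard-cut into fixed-size slices.
        while len(piece) > chunk_size:
            chunks.append(piece[:chunk_size])
            piece = piece[chunk_size:]
        current = piece
    if current:
        chunks.append(current)
    return chunks

def split_parent_child(text: str, parent_size: int = 400, child_size: int = 100) -> list[dict]:
    """Build large parent chunks (by paragraph), each holding smaller child chunks (by line)."""
    parents = split_custom(text, "\n\n", parent_size)
    return [{"parent": p, "children": split_custom(p, "\n", child_size)} for p in parents]

# Hypothetical two-paragraph document.
doc = "Warranty\nCovers defects for 24 months.\n\nRefunds\nProcessed within 5 business days."
result = split_parent_child(doc, parent_size=60, child_size=35)
for entry in result:
    print(repr(entry["parent"]), "->", entry["children"])
```

In a parent-child setup, the small child chunks are typically what gets matched against the query, while the larger parent is what gets handed to the LLM, which is why this method suits questions that need broader context.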

Method 2: Knowledge Pipeline (Advanced)

Orchestrate more complex data processing workflows with custom steps and plugins.

· Suitable for complex data pre-processing needs

· Supports ETL (Extract-Transform-Load) plugins

· Reusable workflows

Method 3: External Knowledge Base

Connect directly to external knowledge bases via APIs.

· Leverage existing data without migration

· Supports vector databases: Pinecone, Weaviate, Qdrant, ...

· Real-time synchronization

⚠️ IMPORTANT: Knowledge Base quality directly determines AI response quality. Invest time in preparing clean, well-structured documents.

Management & Optimization

Content Management

| Action | Description |
| --- | --- |
| View documents | See the list of uploaded documents |
| Add documents | Upload new documents |
| Edit chunks | Edit the content of individual chunks |
| Enable/Disable | Toggle a document or chunk on or off |
| Delete | Remove documents you no longer need |
| Re-index | Rebuild the index after settings change |

Test & Validate Retrieval

Test retrieval quality before integrating into applications:

1. Open the Knowledge Base and click "Test Retrieval".

2. Enter a test query that simulates a user question.

3. Click "Run Test".

4. Review the results:

· Matched chunks: the retrieved document segments

· Relevance score: a relevance rating from 0 to 1

· Source document: the document each chunk came from

💡 TIP: Test with diverse questions to ensure the Knowledge Base covers all topics. If relevance scores are low (below 0.5), consider improving document content.
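If you record the scores from several test queries, a short script can flag the topics that need better documents. The sketch below assumes you have copied the relevance scores by hand into a dictionary; the queries, scores, and function name are hypothetical, and only the 0.5 cutoff comes from the tip above.

```python
LOW_SCORE = 0.5  # cutoff suggested in the tip above

# Hypothetical test results: query -> relevance scores (0-1) of the matched chunks.
results = {
    "How long is the warranty?": [0.82, 0.64, 0.31],
    "Do you ship overseas?": [0.41, 0.22],
    "What is the office dress code?": [],
}

def find_weak_queries(results: dict[str, list[float]], threshold: float = LOW_SCORE):
    """Return (query, best_score) pairs whose best match scores below the threshold."""
    weak = []
    for query, scores in results.items():
        best = max(scores, default=0.0)  # no matches at all counts as 0.0
        if best < threshold:
            weak.append((query, best))
    return weak

for query, best in find_weak_queries(results):
    print(f"Needs better coverage: {query!r} (best score {best:.2f})")
```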

Metadata Enhancement

Add metadata to documents for improved retrieval accuracy:

    # Example metadata
    document: "Warranty Policy 2024.pdf"
    metadata:
      category: "policy"
      department: "customer-service"
      effective_date: "2024-01-01"
      language: "en"

Metadata enables filter-based search, e.g., "Only search within category=policy documents."
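Conceptually, a metadata filter narrows the candidate documents before any search runs. The following Python sketch shows the idea with plain dictionaries; the second document and the `filter_by_metadata` helper are hypothetical, and ClickAI's own filter syntax is not shown in the source.

```python
# Documents with metadata. The first mirrors the example above; the second is hypothetical.
documents = [
    {
        "name": "Warranty Policy 2024.pdf",
        "metadata": {"category": "policy", "department": "customer-service", "language": "en"},
    },
    {
        "name": "Onboarding Guide.docx",
        "metadata": {"category": "guide", "department": "hr", "language": "en"},
    },
]

def filter_by_metadata(docs: list[dict], **conditions: str) -> list[dict]:
    """Keep only documents whose metadata matches every key=value condition."""
    return [
        d for d in docs
        if all(d["metadata"].get(key) == value for key, value in conditions.items())
    ]

# "Only search within category=policy documents."
policy_docs = filter_by_metadata(documents, category="policy")
print([d["name"] for d in policy_docs])
```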

Adjusting Settings

You can change the following settings at any time:

· Index method: switch between High Quality and Economical

· Embedding model: change to a different embedding model

· Retrieval strategy: Semantic Search, Full-text Search, or Hybrid Search

Integrating into Applications

In Chatbot / Agent

1. Open the app in Studio.

2. In the Context section, click "+ Add".

3. Select the Knowledge Base to connect.

4. Configure Retrieval Settings:

| Setting | Description | Recommended Value |
| --- | --- | --- |
| Top K | Number of chunks returned | 3-5 |
| Score Threshold | Minimum relevance score a chunk must reach | 0.5-0.7 |
| Retrieval Mode | Search method | Hybrid Search |
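The interaction between Top K and Score Threshold is easy to see in code. This is a sketch of the general idea, assuming the threshold is applied before the top-K cut; the function name and sample scores are hypothetical, and the source does not specify the order ClickAI uses.

```python
def select_chunks(scored_chunks, top_k=3, score_threshold=0.5):
    """Drop chunks below the threshold, then return the top_k highest-scoring ones."""
    kept = [(chunk, score) for chunk, score in scored_chunks if score >= score_threshold]
    kept.sort(key=lambda pair: pair[1], reverse=True)
    return kept[:top_k]

# Hypothetical retrieval results: (chunk id, relevance score).
candidates = [("A", 0.91), ("B", 0.48), ("C", 0.73), ("D", 0.66), ("E", 0.55)]

selected = select_chunks(candidates, top_k=3, score_threshold=0.5)
print(selected)
```

A higher threshold can return fewer than Top K chunks (or none), so raising it trades recall for precision, while a larger Top K sends more context to the LLM at a higher token cost.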

In Workflow / Chatflow

Use the "Knowledge Retrieval" node:

1. Drag the Knowledge Retrieval node onto the canvas.

2. Select a Knowledge Base.

3. Configure the query variable (typically {{sys.query}}).

4. Connect the output to an LLM node as context.

Retrieval Strategies

Semantic Search

Searches based on contextual meaning using embedding vectors.

· ✅ Understands questions phrased with synonyms

· ❌ May miss exact keyword matches

Full-text Search

Searches based on exact keyword matching.

· ✅ Precise with specific terminology

· ❌ Doesn't understand context

Hybrid Search

Combines Semantic Search and Full-text Search, then uses Rerank to sort the merged results.

· ✅ Leverages the advantages of both methods

· ✅ Highest accuracy

📝 NOTE: **Rerank** is an additional processing step that re-sorts retrieval results by relevance. Enable Rerank when using Hybrid Search for the best results.
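To illustrate how two result lists can be merged into one ranking, the sketch below uses Reciprocal Rank Fusion (RRF). This is a swapped-in technique chosen because it is simple and widely used; the source does not say how ClickAI merges results or which rerank model it applies, and the chunk ids are hypothetical.

```python
def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Merge several ranked lists: each item earns 1 / (k + rank) per list it appears in."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, chunk_id in enumerate(ranking, start=1):
            scores[chunk_id] = scores.get(chunk_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first.
    return sorted(scores, key=lambda chunk_id: scores[chunk_id], reverse=True)

# Hypothetical ranked chunk ids from each search method, best match first.
semantic_results = ["c2", "c1", "c4"]
full_text_results = ["c1", "c3", "c2"]

fused = reciprocal_rank_fusion([semantic_results, full_text_results])
print(fused)
```

Chunks that appear in both lists rise to the top, which is the intuition behind Hybrid Search: a result supported by both meaning and exact keywords is more likely to be relevant than one found by a single method.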

📖 Previous: [Monitor](./03-monitor-en.md) · Next: [Workspace]
