# n8n-nodes-rag
Advanced RAG (Retrieval-Augmented Generation) knowledge base nodes for n8n.
## Features
This package provides two powerful nodes for building RAG applications:
### 🗄️ RAG Knowledge Base Node
- **Text Processing**: Clean and preprocess text data with customizable options
- **Intelligent Chunking**: Multiple strategies (fixed size, sentence-based, paragraph-based)
- **Vector Embeddings**: Support for OpenAI, Hugging Face, and custom embedding providers
- **Flexible Storage**: Plugin interface for various vector databases (Qdrant, Milvus)
- **Operations**: Store, delete, and count documents
### 🔍 RAG Retrieval Node
- **Multiple Search Types**: Vector search, full-text search, and hybrid search
- **Configurable Results**: Customizable limits, score thresholds, and filtering
- **Metadata Support**: Include or exclude metadata in search results
- **AI-Ready Output**: Formatted results perfect for AI agent workflows
## Installation

```bash
npm install n8n-nodes-rag
```
## Supported Vector Databases
- **Qdrant**: Cloud and self-hosted vector database
- **Milvus**: Open-source vector database
- **Extensible**: Easy to add new vector store adapters
## Supported Embedding Providers
- **OpenAI**: `text-embedding-ada-002` and other models
- **Hugging Face**: A wide range of open-source models
- **Custom API**: Bring your own embedding service (a sketch of one possible endpoint follows this list)
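For the custom provider, the node calls out to an embedding service you host. The request/response shape below is purely illustrative (an assumption, not a contract the node enforces), but it shows the kind of minimal endpoint you could stand up:

```typescript
// Hypothetical embedding endpoint for the "Custom API" provider.
// The { texts } -> { embeddings } shape is an illustrative assumption,
// not the exact contract the node expects.
import http from "node:http";

http
  .createServer(async (req, res) => {
    let body = "";
    for await (const chunk of req) body += chunk;
    const { texts } = JSON.parse(body) as { texts: string[] };

    // Swap in a real model call here; dummy 4-dimensional vectors for the sketch.
    const embeddings = texts.map(() => [0.1, 0.2, 0.3, 0.4]);

    res.setHeader("Content-Type", "application/json");
    res.end(JSON.stringify({ embeddings }));
  })
  .listen(8080);
```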
## Quick Start
### 1. Set Up a Vector Database
Start with Qdrant (easiest option):
```bash
docker run -p 6333:6333 qdrant/qdrant
```
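The nodes take connection details from their configuration, but if your setup expects the collection to exist up front (see Troubleshooting below), you can create one through Qdrant's REST API. The collection name here is an arbitrary example, and 1536 dimensions matches OpenAI's `text-embedding-ada-002`:

```typescript
// Pre-create a Qdrant collection (optional, depending on your setup).
// "knowledge_base" is an arbitrary example name; 1536 dims matches
// OpenAI's text-embedding-ada-002 embeddings.
const res = await fetch("http://localhost:6333/collections/knowledge_base", {
  method: "PUT",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({ vectors: { size: 1536, distance: "Cosine" } }),
});
console.log(await res.json()); // expect { "result": true, "status": "ok", ... }
```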
### 2. Create a Knowledge Base
- Add RAG Knowledge Base node to your workflow
- Connect your text data source
- Configure chunking strategy and embedding provider
- Set vector database connection details
- Execute to process and store your documents
### 3. Retrieve Information
- Add RAG Retrieval node to your workflow
- Configure the same vector database settings
- Set your search query and parameters
- Choose search type (vector, full-text, or hybrid)
- Execute to get relevant results
## Configuration Examples

### Basic Text Processing

```json
{
  "operation": "store",
  "chunkingStrategy": "sentence",
  "chunkSize": 1000,
  "overlap": 200,
  "generateEmbeddings": true,
  "embeddingProvider": "openai"
}
```
### Vector Search

```json
{
  "searchType": "vector",
  "limit": 10,
  "threshold": 0.7,
  "includeMetadata": true
}
```
### Hybrid Search

```json
{
  "searchType": "hybrid",
  "limit": 10,
  "alpha": 0.5,
  "threshold": 0.6
}
```
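In the hybrid example, `alpha` controls the blend between the two result sets. Assuming the common convention, the combined score is roughly `alpha * vectorScore + (1 - alpha) * fulltextScore`, so `0.5` weights vector and full-text matches evenly, while values near `1.0` approach pure vector search.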
## Use Cases

### 📚 Document Q&A Systems
Build intelligent document search and question-answering systems.
### 🤖 AI Agent Knowledge Base
Provide contextual information to AI agents and chatbots.
### 🔍 Semantic Search
Create powerful semantic search experiences for your applications.
### 📊 Content Analytics
Analyze and categorize large collections of text documents.
## Architecture

### Text Processing Pipeline
1. **Input Validation**: Ensure text data is properly formatted
2. **Text Cleaning**: Remove extra whitespace, normalize line breaks
3. **Chunking**: Split text using configurable strategies
4. **Embedding Generation**: Create vector representations
5. **Storage**: Save to the vector database with metadata
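As a mental model, the storage pipeline amounts to a short function chain. All helper names below are hypothetical stand-ins, not the package's actual internals:

```typescript
// Illustrative storage pipeline; all helper names are hypothetical.
declare function chunkText(text: string, size: number, overlap: number): string[];
declare function embed(texts: string[]): Promise<number[][]>;
declare const vectorStore: {
  upsert(items: { text: string; vector: number[]; metadata: object }[]): Promise<void>;
};

async function storeDocument(raw: string, metadata: object): Promise<void> {
  // Clean: normalize line breaks and collapse extra whitespace
  const cleaned = raw.replace(/\r\n/g, "\n").replace(/[ \t]+/g, " ").trim();

  // Chunk: e.g. fixed-size pieces of 1000 chars with 200-char overlap
  const chunks = chunkText(cleaned, 1000, 200);

  // Embed: one vector per chunk
  const vectors = await embed(chunks);

  // Store: persist text, vector, and metadata together
  await vectorStore.upsert(
    chunks.map((text, i) => ({ text, vector: vectors[i], metadata }))
  );
}
```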
### Search Pipeline
1. **Query Processing**: Generate embeddings for the search query
2. **Vector Search**: Find semantically similar content
3. **Full-text Search**: Keyword-based matching
4. **Hybrid Search**: Combine vector and full-text results
5. **Result Ranking**: Score and filter results
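For the hybrid case, one common fusion strategy (a sketch under that assumption, not necessarily how this package ranks internally) is a weighted linear combination of the two score lists:

```typescript
// One common way to fuse vector and full-text hits: weighted linear
// combination by document ID. A sketch, not the package's internal ranking.
interface Hit {
  id: string;
  score: number; // assumed normalized to [0, 1]
}

function hybridMerge(vectorHits: Hit[], textHits: Hit[], alpha = 0.5): Hit[] {
  const combined = new Map<string, number>();
  for (const h of vectorHits) {
    combined.set(h.id, (combined.get(h.id) ?? 0) + alpha * h.score);
  }
  for (const h of textHits) {
    combined.set(h.id, (combined.get(h.id) ?? 0) + (1 - alpha) * h.score);
  }
  return [...combined.entries()]
    .map(([id, score]) => ({ id, score }))
    .sort((a, b) => b.score - a.score);
}
```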
## Advanced Features

### Custom Metadata Filtering

Filter search results based on document metadata:

```json
{
  "source": "documentation",
  "category": "technical",
  "date": { "$gte": "2024-01-01" }
}
```
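Pass this object as a JSON string via the retrieval node's `metadataFilter` parameter (see the API reference below). The `$gte` comparison follows MongoDB-style operator syntax; support for specific operators may vary by vector store.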
### Chunking Strategies

- **Fixed Size**: Split by character count with overlap (sketched after this list)
- **Sentence**: Respect sentence boundaries
- **Paragraph**: Maintain paragraph structure
- **Semantic**: AI-powered semantic chunking (future)
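A rough sketch of the fixed-size strategy (illustrative, not the node's exact implementation):

```typescript
// Sketch of fixed-size chunking with overlap; not the node's exact code.
function chunkFixed(text: string, chunkSize = 1000, overlap = 200): string[] {
  if (overlap >= chunkSize) throw new Error("overlap must be < chunkSize");
  const chunks: string[] = [];
  for (let start = 0; start < text.length; start += chunkSize - overlap) {
    chunks.push(text.slice(start, start + chunkSize));
    if (start + chunkSize >= text.length) break; // last chunk reached
  }
  return chunks;
}

// With the defaults, a 2,000-character document yields chunks covering
// [0, 1000), [800, 1800), and [1600, 2000).
console.log(chunkFixed("x".repeat(2000)).map((c) => c.length)); // [1000, 1000, 400]
```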
### Vector Store Adapters

Easily extend support for additional vector databases by implementing the `VectorStoreAdapter` interface.
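The authoritative interface lives in the package source; a plausible shape, inferred from the operations this README describes, might look like:

```typescript
// Plausible shape of a vector store adapter, inferred from the operations
// described in this README (store/delete/count plus search); consult the
// package source for the authoritative interface.
interface VectorRecord {
  id: string;
  text: string;
  vector: number[];
  metadata: Record<string, unknown>;
}

interface SearchOptions {
  limit: number;
  threshold?: number;
  filter?: Record<string, unknown>;
}

interface VectorStoreAdapter {
  upsert(records: VectorRecord[]): Promise<void>;
  delete(ids: string[]): Promise<void>;
  count(): Promise<number>;
  search(vector: number[], options: SearchOptions): Promise<VectorRecord[]>;
}
```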
## API Reference

### RAG Knowledge Base Node Parameters

| Parameter | Type | Description |
|---|---|---|
| `operation` | string | Operation to perform (`store`/`delete`/`count`) |
| `inputField` | string | Field containing text data |
| `chunkingStrategy` | string | How to split text (`fixed`/`sentence`/`paragraph`) |
| `chunkSize` | number | Maximum chunk size in characters |
| `overlap` | number | Character overlap between chunks |
| `generateEmbeddings` | boolean | Whether to create vector embeddings |
| `embeddingProvider` | string | Embedding service (`openai`/`huggingface`/`custom`) |
### RAG Retrieval Node Parameters

| Parameter | Type | Description |
|---|---|---|
| `query` | string | Search query text |
| `searchType` | string | Search method (`vector`/`fulltext`/`hybrid`) |
| `limit` | number | Maximum results to return |
| `threshold` | number | Minimum similarity score |
| `includeMetadata` | boolean | Include document metadata |
| `metadataFilter` | string | JSON filter for metadata |
## Troubleshooting

### Common Issues

**Empty results from vector search**
- Check that embeddings were generated during storage
- Verify embedding provider settings match between store and search
- Adjust similarity threshold (try lower values like 0.5)
**API rate limits**
- Use batch processing for large documents
- Implement delays between API calls (a minimal throttling sketch follows this list)
- Consider using local embedding models
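A minimal throttling sketch, assuming a hypothetical `embed` call standing in for your provider's API:

```typescript
// Minimal throttling sketch: embed in small batches with a pause between
// calls. embed() is a hypothetical stand-in for your provider's API call.
declare function embed(texts: string[]): Promise<number[][]>;

const sleep = (ms: number) => new Promise((resolve) => setTimeout(resolve, ms));

async function embedWithDelays(texts: string[], batchSize = 100, delayMs = 1000) {
  const out: number[][] = [];
  for (let i = 0; i < texts.length; i += batchSize) {
    out.push(...(await embed(texts.slice(i, i + batchSize))));
    if (i + batchSize < texts.length) await sleep(delayMs); // pause between batches
  }
  return out;
}
```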
**Vector database connection errors**
- Verify endpoint URL and API key
- Check network connectivity
- Ensure collection/index exists
## Contributing
We welcome contributions! Please see our Contributing Guide for details.
## License
MIT License – see LICENSE for details.
## Support
Built with ❤️ for the n8n community