RAGFlow
RAGFlow is an open-source, AI-native engine for building Retrieval-Augmented Generation (RAG) pipelines that ground large language models (LLMs) in accurate, verifiable knowledge. It supports multimodal data (text, PDFs, images, and more), hybrid search, configurable chunking strategies, and in-text citation generation to reduce hallucinations and boost LLM reliability.
Multimodal Support
Index and retrieve content from PDFs, images, web pages, presentations, and more.
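As a minimal sketch of feeding a document into the engine over HTTP, the snippet below uploads a PDF as multipart form data with the Python requests library. The base URL, route, field names, and auth header are assumptions for illustration, not RAGFlow's documented API; consult the project's API reference for the real routes.

```python
import requests

BASE_URL = "http://localhost:9380"   # assumed local deployment
API_KEY = "your-api-key"             # placeholder credential

def upload_document(dataset_id: str, path: str) -> dict:
    """POST a file to a hypothetical ingestion endpoint as multipart form data."""
    with open(path, "rb") as f:
        resp = requests.post(
            f"{BASE_URL}/api/datasets/{dataset_id}/documents",   # assumed route
            headers={"Authorization": f"Bearer {API_KEY}"},
            files={"file": (path, f, "application/pdf")},
            timeout=60,
        )
    resp.raise_for_status()
    return resp.json()

print(upload_document("my-dataset", "quarterly-report.pdf"))
```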
Hybrid Search Pipeline
Combine dense (vector) and sparse (keyword) retrieval with a reranking stage for higher-quality results.
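The sketch below shows one common way to fuse dense and sparse scores: a weighted sum followed by a reranking pass over the top candidates. It is a generic illustration under an assumed weight and a stand-in reranker, not RAGFlow's internal implementation.

```python
from typing import Callable

def hybrid_rank(
    dense_scores: dict[str, float],      # doc id -> vector similarity
    sparse_scores: dict[str, float],     # doc id -> keyword (e.g. BM25) score
    rerank: Callable[[list[str]], list[str]],
    alpha: float = 0.7,                  # assumed weight on the dense score
    top_k: int = 5,
) -> list[str]:
    """Fuse per-document scores, keep the best candidates, then rerank them."""
    doc_ids = set(dense_scores) | set(sparse_scores)
    fused = {
        doc: alpha * dense_scores.get(doc, 0.0)
        + (1 - alpha) * sparse_scores.get(doc, 0.0)
        for doc in doc_ids
    }
    candidates = sorted(fused, key=fused.get, reverse=True)[:top_k]
    return rerank(candidates)

# Identity reranker as a stand-in for a cross-encoder or LLM reranking stage.
print(hybrid_rank({"a": 0.9, "b": 0.4}, {"b": 0.8, "c": 0.6}, rerank=lambda docs: docs))
```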
Custom Chunking & Templates
Apply chunking strategies (template-based or text-based) that preserve semantic context across chunk boundaries.
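As a rough illustration of text-based chunking, the function below splits a document on paragraph breaks, packs paragraphs into chunks of bounded size, and carries a short overlap between consecutive chunks so context is not lost at the boundaries. The size and overlap values are assumptions; RAGFlow's template-based chunkers are more elaborate than this.

```python
def chunk_text(text: str, max_chars: int = 800, overlap: int = 100) -> list[str]:
    """Pack paragraphs into chunks of roughly max_chars characters with overlap."""
    paragraphs = [p.strip() for p in text.split("\n\n") if p.strip()]
    chunks: list[str] = []
    current = ""
    for para in paragraphs:
        if current and len(current) + len(para) + 2 > max_chars:
            chunks.append(current)
            # Seed the next chunk with the tail of the previous one for continuity.
            current = current[-overlap:]
        current = f"{current}\n\n{para}".strip() if current else para
    if current:
        chunks.append(current)
    return chunks  # paragraphs longer than max_chars are kept whole in this sketch
```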
Citations & Traceability
Automatically inject grounded, in-text citations to support transparency and reduce hallucinations.
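The sketch below illustrates the general idea of citation injection: each retrieved chunk gets a numeric marker, the marker is appended to the answer sentences it appears to support, and a source list is emitted alongside the answer. The word-overlap heuristic and threshold are assumptions for demonstration, not RAGFlow's citation logic.

```python
def cite(answer: str, chunks: list[dict]) -> str:
    """Append [n] markers to sentences that share enough words with a retrieved chunk."""
    sentences = [s.strip() for s in answer.split(". ") if s.strip()]
    cited = []
    for sent in sentences:
        words = set(sent.lower().split())
        refs = [
            str(i + 1)
            for i, chunk in enumerate(chunks)
            if len(words & set(chunk["text"].lower().split())) >= 4   # assumed threshold
        ]
        cited.append(sent + (f" [{', '.join(refs)}]" if refs else ""))
    sources = "\n".join(f"[{i + 1}] {c['source']}" for i, c in enumerate(chunks))
    return ". ".join(cited) + "\n\nSources:\n" + sources

print(cite(
    "Churn fell by 12% in Q3. The team credits the new onboarding flow",
    [{"text": "In Q3 churn fell by 12% overall", "source": "q3-report.pdf, p. 4"}],
))
```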
Flexible Retrieval Architecture
Works with OpenAI-compatible LLMs, local models, and external vector databases.
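Because the generation step only needs an OpenAI-compatible endpoint, a locally hosted model can usually be swapped in by changing the client's base URL. The sketch below uses the official openai Python package; the URL, key, and model name are assumptions for a typical local server (e.g. Ollama or vLLM), not values prescribed by RAGFlow.

```python
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:11434/v1",   # assumed local OpenAI-compatible server
    api_key="not-needed-locally",           # many local servers ignore the key
)

response = client.chat.completions.create(
    model="llama3",   # whichever model the local server exposes
    messages=[
        {"role": "system", "content": "Answer only from the provided context."},
        {"role": "user", "content": "Context:\n<retrieved chunks>\n\nQuestion: <user question>"},
    ],
)
print(response.choices[0].message.content)
```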
Built-in Orchestration & APIs
REST API support for custom RAG pipelines with full control over indexing, retrieval, and ranking steps.
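As an example of driving a single pipeline step over HTTP, the sketch below posts a query to a retrieval endpoint and prints the scored chunks it returns. The route, parameter names, and response shape are assumptions for illustration; the actual endpoints are documented in RAGFlow's API reference.

```python
import requests

BASE_URL = "http://localhost:9380"   # assumed local deployment
API_KEY = "your-api-key"             # placeholder credential

def retrieve(query: str, dataset_id: str, top_k: int = 5) -> list[dict]:
    """POST a query to a hypothetical retrieval endpoint and return scored chunks."""
    resp = requests.post(
        f"{BASE_URL}/api/retrieval",   # assumed route
        headers={"Authorization": f"Bearer {API_KEY}"},
        json={"question": query, "dataset_ids": [dataset_id], "top_k": top_k},
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json().get("chunks", [])

for hit in retrieve("What does the Q3 report say about churn?", "my-dataset"):
    print(hit)
```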
Lightweight & Fast
Designed for performance and modularity, deployable with minimal resources.