paganini2008/fastrag

🧠 RAG Platform

Production-grade Multi-tenant Retrieval-Augmented Generation Platform

Python Django React TypeScript Tailwind CSS Qdrant Tests License

A full-stack, multi-tenant RAG platform that turns documents, URLs, and FAQs into a searchable knowledge base with AI-powered question answering. Built with Django + React, powered by OpenAI embeddings and Claude / GPT-4o LLMs.

Features · Architecture · Tech Stack · Quick Start · API · Testing


✨ Features

Knowledge Management

  • πŸ“ Upload PDF, DOCX, XLSX, PPTX, TXT, HTML, Markdown
  • 🌐 Import web pages (static + Playwright dynamic rendering)
  • ❓ FAQ management with bulk import
  • πŸ”„ Automatic async ingestion pipeline
  • ♻️ Per-KB reindex with embedding model switching + live progress

AI-Powered Retrieval

  • πŸ” Vector similarity search (cosine, top-K)
  • πŸ€– Full RAG answers via Claude or GPT-4o
  • πŸ“Š Score-based relevance ranking
  • πŸ“ Prompt builder with token estimation

Multi-tenant & Secure

  • 🏢 Full tenant isolation at DB + Qdrant level
  • 🔑 JWT authentication + API Key for agents
  • 👥 Role-based access (owner / admin / member)
  • 📋 Audit logs for every search & answer

Developer Experience

  • 📖 Auto-generated Swagger / OpenAPI 3.0 docs
  • ⚡ Celery async processing with Redis
  • 🧪 27 pytest tests, all externals mocked
  • 🐳 Docker-ready, uv package manager

πŸ— Architecture

System Overview

graph TB
    subgraph Client["🖥️ Browser (React 19 + Vite)"]
        UI[React UI<br/>Tailwind CSS + Ant Design]
        RTK[RTK Query<br/>State & Cache]
        AUTH[JWT Auth<br/>Redux Slice]
    end

    subgraph Gateway["🌐 API Gateway"]
        VITE_PROXY[Vite Dev Proxy<br/>:5173 → :8000]
        NGINX[Nginx<br/>Production Reverse Proxy]
    end

    subgraph Backend["⚙️ Django 6 + DRF"]
        direction TB
        URLS[URL Router<br/>/api/v1/]
        MW[TenantMiddleware<br/>JWT → tenant_id]

        subgraph Apps["Django Apps"]
            AUTH_APP[accounts<br/>JWT + API Key Auth]
            KB[knowledge_bases<br/>CRUD]
            DOCS[documents<br/>Upload · URL · Chunks]
            FAQ_APP[faq<br/>Q&A Management]
            RETRIEVAL[retrieval<br/>Search · Prompt · Answer]
            AUDIT[audit<br/>Retrieval & Query Logs]
        end

        subgraph Pipeline["Ingestion Pipeline"]
            direction LR
            PARSE[parsers<br/>PDF·DOCX·XLSX·HTML]
            CHUNK[chunking<br/>LlamaIndex Splitter]
            EMBED_SVC[embeddings<br/>OpenAI API]
            VSTORE[vector_store<br/>Qdrant Client]
        end
    end

    subgraph Workers["🔄 Celery Workers"]
        INGEST[ingest_document<br/>parse→chunk→embed→index]
        INGEST_URL[ingest_url<br/>fetch→parse→chunk→embed]
        EMBED_FAQ[embed_faq_item<br/>question+answer→embed]
    end

    subgraph Storage["💾 Data Layer"]
        PG[(PostgreSQL<br/>schema: rag)]
        QDRANT[(Qdrant<br/>document_chunks)]
        MINIO[(MinIO<br/>raw files)]
        REDIS[(Redis<br/>Celery broker)]
    end

    subgraph LLM["🤖 AI Services"]
        OPENAI[OpenAI<br/>text-embedding-3-small<br/>GPT-4o / GPT-4o-mini]
        ANTHROPIC[Anthropic<br/>Claude Sonnet 4.6<br/>Claude Haiku 4.5]
    end

    UI --> RTK --> VITE_PROXY --> URLS
    URLS --> MW --> Apps
    AUTH_APP --> PG
    KB --> PG
    DOCS --> PG
    DOCS --> MINIO
    DOCS --> REDIS
    FAQ_APP --> PG
    FAQ_APP --> REDIS
    RETRIEVAL --> QDRANT
    RETRIEVAL --> LLM
    AUDIT --> PG

    REDIS --> Workers
    Workers --> PARSE --> CHUNK --> EMBED_SVC --> VSTORE
    EMBED_SVC --> OPENAI
    VSTORE --> QDRANT
    Workers --> MINIO

RAG Ingestion Pipeline

flowchart LR
    A([📄 File Upload\nor URL]) --> B

    subgraph B["β‘  Parse"]
        B1[PDF → pypdf\nDOCX → python-docx\nXLSX → openpyxl\nHTML → BeautifulSoup4]
    end

    B --> C

    subgraph C["β‘‘ Chunk"]
        C1[LlamaIndex\nSentenceSplitter\nchunk_size=512 tokens\noverlap=64 tokens]
    end

    C --> D

    subgraph D["β‘’ Embed"]
        D1[OpenAI\ntext-embedding-3-small\nor 3-large per KB]
    end

    D --> E

    subgraph E["β‘£ Index"]
        E1[Qdrant upsert\nper-KB collection\nor shared collection]
    end

    E --> F([✅ indexed\nDocument.status])

    style A fill:#4f46e5,color:#fff
    style F fill:#10b981,color:#fff
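Stage ② above uses LlamaIndex's SentenceSplitter with chunk_size=512 and overlap=64 tokens. A simplified word-level sketch of the same sliding-window idea (not the actual splitter, which is sentence- and token-aware):

```python
def split_with_overlap(words: list[str], chunk_size: int = 512, overlap: int = 64) -> list[list[str]]:
    """Sliding-window split: each chunk shares `overlap` tokens with the previous one."""
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(words[start:start + chunk_size])
        if start + chunk_size >= len(words):
            break  # last window already reaches the end of the text
    return chunks
```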

RAG Query Flow

sequenceDiagram
    actor User
    participant FE as React Frontend
    participant API as Django API
    participant QD as Qdrant
    participant LLM as Claude / GPT-4o
    participant DB as PostgreSQL

    User->>FE: Ask a question
    FE->>API: POST /api/v1/rag/answer/\n{query, knowledge_base_id, llm_model}

    API->>API: Embed query\n(OpenAI text-embedding-3-small)
    API->>QD: query_points(vector, filter={tenant_id, kb_id}, top_k=5)
    QD-->>API: Top-K chunks with scores

    API->>API: Build prompt\n(system + context + question)
    API->>LLM: Chat completion request
    LLM-->>API: Generated answer

    API->>DB: Save QueryLog\n(query, answer, tokens, latency_ms)
    API-->>FE: {answer, sources, usage, latency_ms}
    FE-->>User: Display answer + cited sources
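The "Build prompt" step in the sequence assembles system instructions, retrieved context, and the question. A rough sketch (the real builder lives in apps/retrieval; names here are illustrative, and the token estimate is a crude ~4-characters-per-token heuristic, not a real tokenizer):

```python
def build_rag_prompt(question: str, chunks: list[dict],
                     system: str = "Answer using only the provided context.") -> str:
    """Assemble system instructions, numbered retrieved context, and the user question."""
    context = "\n\n".join(
        f"[{i + 1}] ({c['source']})\n{c['text']}" for i, c in enumerate(chunks)
    )
    return f"{system}\n\nContext:\n{context}\n\nQuestion: {question}\nAnswer:"

def estimate_tokens(text: str) -> int:
    """Crude estimate: roughly 4 characters per token for English text."""
    return max(1, len(text) // 4)
```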

Multi-tenant Data Isolation

erDiagram
    TENANT {
        uuid id PK
        string name
        string slug
        string plan
    }
    USER {
        uuid id PK
        uuid tenant_id FK
        string email
        string role
    }
    KNOWLEDGE_BASE {
        uuid id PK
        uuid tenant_id FK
        string name
        int chunk_size
        int retrieval_top_k
    }
    DOCUMENT {
        uuid id PK
        uuid tenant_id FK
        uuid knowledge_base_id FK
        string status
        string file_path
    }
    DOCUMENT_CHUNK {
        uuid id PK
        uuid tenant_id FK
        uuid document_id FK
        text text
        int chunk_index
        bool is_embedded
    }
    FAQ_ITEM {
        uuid id PK
        uuid tenant_id FK
        uuid knowledge_base_id FK
        text question
        text answer
        bool is_embedded
    }

    TENANT ||--o{ USER : has
    TENANT ||--o{ KNOWLEDGE_BASE : owns
    KNOWLEDGE_BASE ||--o{ DOCUMENT : contains
    KNOWLEDGE_BASE ||--o{ FAQ_ITEM : contains
    DOCUMENT ||--o{ DOCUMENT_CHUNK : split_into

🛠 Tech Stack

Backend

| Layer | Technology | Purpose |
|---|---|---|
| Runtime | Python 3.13 + uv | Fast dependency management |
| Framework | Django 6 + DRF 3.16 | Web framework + REST APIs |
| Auth | simplejwt + API Key | JWT tokens + agent access |
| Task Queue | Celery 5 + Redis | Async ingestion pipeline |
| Vector DB | Qdrant 1.17 | Cosine similarity search |
| Object Storage | MinIO | Raw file storage (S3-compatible) |
| Database | PostgreSQL (schema: rag) | Structured data |
| Chunking | LlamaIndex SentenceSplitter / SemanticSplitter | Token-aware text splitting |
| Embedding | OpenAI text-embedding-3-small / 3-large | Per-KB configurable, 1536 / 3072-dim |
| LLM | Claude Sonnet 4.6 / GPT-4o | Answer generation |
| API Docs | drf-spectacular | OpenAPI 3.0 / Swagger |
| Document Parsing | pypdf · python-docx · openpyxl · python-pptx · BeautifulSoup4 | Multi-format support |

Frontend

| Technology | Purpose |
|---|---|
| React 19 + TypeScript 5.9 | UI framework |
| Vite 7 | Build tool + dev proxy |
| Redux Toolkit + RTK Query | State management + data fetching |
| React Router 7 | Client-side routing |
| Ant Design 6 | UI components (tables, modals, forms) |
| Tailwind CSS 4 | Dark theme + layout + utilities |

🚀 Quick Start

Prerequisites

# Start infrastructure services
docker run -d -p 6333:6333 qdrant/qdrant
docker run -d -p 19000:9000 -e MINIO_ROOT_USER=admin -e MINIO_ROOT_PASSWORD=admin123 \
  minio/minio server /data --console-address ":9001"
redis-server --requirepass yourpassword

Backend

cd backend

# Install dependencies (uv auto-creates virtualenv)
uv sync

# Configure environment
cp .env.example .env
# Edit .env: set DB credentials, OPENAI_KEY, ANTHROPIC_API_KEY, REDIS_URL

# Database setup
uv run python src/manage.py migrate
uv run python src/manage.py init_qdrant   # Create Qdrant collection
uv run python src/manage.py createsuperuser

# Run server
uv run python src/manage.py runserver     # http://localhost:8000

# Run Celery worker (separate terminal)
cd src && uv run celery -A config.celery worker --loglevel=info

Frontend

cd frontend
npm install
npm run dev     # http://localhost:5173

Dev proxy: All /api/* requests from :5173 are automatically forwarded to :8000 by Vite — no CORS config needed.


πŸ“ Project Structure

rag/
├── backend/
│   ├── .env                      # All configuration
│   ├── pyproject.toml            # uv dependencies
│   └── src/
│       ├── config/
│       │   ├── settings/         # base / development / production
│       │   ├── api_router.py     # /api/v1/ route registration
│       │   └── celery.py         # Celery app
│       ├── apps/
│       │   ├── common/           # Base models, MinIO client, pagination
│       │   ├── tenants/          # Tenant model + TenantMiddleware
│       │   ├── accounts/         # User auth (JWT + API Key)
│       │   ├── knowledge_bases/  # KB CRUD
│       │   ├── documents/        # Upload, URL import, chunk preview
│       │   ├── faq/              # FAQ management + bulk import
│       │   ├── ingestion/        # Celery task orchestration
│       │   ├── parsers/          # PDF/DOCX/XLSX/PPTX/HTML parsers
│       │   ├── chunking/         # LlamaIndex SentenceSplitter / SemanticSplitter
│       │   ├── embeddings/       # OpenAI embedding service
│       │   ├── vector_store/     # Qdrant client wrapper
│       │   ├── retrieval/        # Search + prompt builder + RAG answer
│       │   └── audit/            # Search & query logs
│       └── tests/                # 27 pytest tests
│
├── frontend/
│   ├── .env                      # VITE_API_URL, VITE_APP_NAME
│   └── src/
│       ├── components/Layout/    # Collapsible sidebar + sticky header
│       ├── pages/                # Login, Dashboard, KB, Docs, FAQ,
│       │                         # Retrieval, Jobs, Logs
│       ├── store/
│       │   ├── api/              # RTK Query endpoints (4 APIs)
│       │   └── slices/authSlice  # JWT token management
│       └── index.css             # Tailwind v4 + Ant Design dark theme
│
└── docs/
    └── README.md                 # Full technical documentation (CN)

🔌 API Overview

All endpoints are prefixed with /api/v1/. Interactive docs at /api/schema/swagger-ui/.

Authentication

POST /auth/login/          # Returns access + refresh JWT
POST /auth/token/refresh/  # Refresh access token
GET  /auth/me/             # Current user info

Tenant Settings

GET   /tenants/settings/  # Get tenant-level defaults (embedding_model, llm_model)
PATCH /tenants/settings/  # Update tenant-level defaults

Knowledge Bases

GET    /knowledge-bases/            # List (paginated)
POST   /knowledge-bases/            # Create
PATCH  /knowledge-bases/{id}/       # Update settings
DELETE /knowledge-bases/{id}/       # Delete
POST   /knowledge-bases/{id}/rebuild/  # Async reindex with new embedding model

Documents

POST /knowledge-bases/{kbId}/documents/upload/       # Upload file (multipart)
POST /knowledge-bases/{kbId}/documents/import-url/   # Import URL
GET  /knowledge-bases/{kbId}/documents/{id}/chunks/  # Preview chunks
POST /knowledge-bases/{kbId}/documents/{id}/reindex/ # Re-trigger pipeline

FAQ

GET    /knowledge-bases/{kbId}/faq/               # List FAQ items
POST   /knowledge-bases/{kbId}/faq/               # Create (auto-embeds)
POST   /knowledge-bases/{kbId}/faq/bulk-import/   # Bulk import
PATCH  /knowledge-bases/{kbId}/faq/{id}/          # Update
DELETE /knowledge-bases/{kbId}/faq/{id}/          # Delete

Retrieval & RAG

POST /retrieval/search/  # Vector search — returns top-K chunks with scores
POST /rag/prompt/        # Build RAG prompt (no LLM call)
POST /rag/answer/        # Full RAG: retrieve + LLM β†’ answer + sources

Search request:

{
  "query": "What is RAG?",
  "knowledge_base_id": "uuid",
  "top_k": 5,
  "score_threshold": 0.0
}

Answer request:

{
  "query": "What is RAG?",
  "knowledge_base_id": "uuid",
  "top_k": 5,
  "llm_model": "claude-sonnet-4-6"
}

Answer response:

{
  "answer": "RAG (Retrieval-Augmented Generation) is...",
  "sources": [
    { "source": "intro.pdf · page 3", "score": 0.921 }
  ],
  "usage": { "prompt_tokens": 1240, "completion_tokens": 187, "total_tokens": 1427 },
  "latency_ms": 1267
}
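For illustration, a small client-side helper that assembles the request for POST /api/v1/rag/answer/ with a JWT bearer token (the helper name and defaults are assumptions, not part of the codebase):

```python
def answer_request(base_url: str, token: str, query: str, kb_id: str,
                   llm_model: str = "claude-sonnet-4-6", top_k: int = 5):
    """Return (url, headers, json_body) for the /rag/answer/ endpoint."""
    url = f"{base_url}/api/v1/rag/answer/"
    headers = {
        "Authorization": f"Bearer {token}",  # JWT access token from /auth/login/
        "Content-Type": "application/json",
    }
    body = {
        "query": query,
        "knowledge_base_id": kb_id,
        "top_k": top_k,
        "llm_model": llm_model,
    }
    return url, headers, body
```

With the `requests` library the call would then be `requests.post(url, headers=headers, json=body).json()`.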

⚙️ Configuration

Backend .env

# Django
DJANGO_SETTINGS_MODULE=config.settings.development
SECRET_KEY=your-secret-key

# PostgreSQL
DB_HOST=localhost
DB_PORT=5432
DB_DATABASE=demo
DB_USER=postgres
DB_PASSWORD=yourpassword
DB_SCHEMA=rag

# MinIO
MINIO_URL=localhost:19000
MINIO_USER=admin
MINIO_PASSWORD=admin123
MINIO_BUCKET=rag-documents

# Qdrant
VDB_HOST=localhost
VDB_PORT=6333
QDRANT_COLLECTION=document_chunks

# Redis / Celery
REDIS_URL=redis://:yourpassword@localhost:6379/0

# AI Keys
OPENAI_KEY=sk-proj-...
ANTHROPIC_API_KEY=sk-ant-...
DEFAULT_LLM_MODEL=claude-sonnet-4-6

Frontend .env

VITE_API_URL=http://localhost:8000
VITE_APP_NAME=RAG Platform

🧪 Testing

cd backend

# Run all 27 tests
uv run pytest

# With reuse-db (faster on re-runs)
uv run pytest --reuse-db

# Specific file
uv run pytest src/tests/test_retrieval.py -v

# With coverage
uv run pytest --cov=apps --cov-report=html

Test strategy:

  • External services (MinIO, Qdrant, LLMs, background tasks) are all mocked — tests run without any infrastructure
  • Real PostgreSQL is used with a test_ prefixed database
  • conftest.py auto-creates the rag schema if missing
✓ test_auth.py            — login, token refresh, protected endpoints
✓ test_knowledge_bases.py — CRUD, tenant isolation
✓ test_documents.py       — upload, URL import, chunk preview
✓ test_faq.py             — create, list, delete, bulk import
✓ test_retrieval.py       — vector search, RAG answer, error handling

27 passed in 4.5s
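The mocking strategy can be sketched with unittest.mock: patch the external embedding call so the code under test never reaches OpenAI (the function names below are hypothetical stand-ins for the real service in apps/embeddings):

```python
import sys
from unittest.mock import patch

def embed_texts(texts: list[str]) -> list[list[float]]:
    """Stand-in for the OpenAI-backed embedding service; would hit the network."""
    raise RuntimeError("would call the OpenAI API")

def count_embeddings(texts: list[str]) -> int:
    """Code under test: depends on the external embedding call."""
    return len(embed_texts(texts))

# Patch the external dependency so the test runs without any infrastructure.
with patch.object(sys.modules[__name__], "embed_texts",
                  return_value=[[0.0] * 1536, [0.0] * 1536]):
    n = count_embeddings(["hello", "world"])
```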

🐳 Docker

# Build images
docker build -t rag-backend  ./backend
docker build -t rag-frontend ./frontend

# Run backend (point to your infra)
docker run -d --env-file backend/.env -p 8000:8000 rag-backend

# Run frontend
docker run -d -p 80:80 rag-frontend

📖 Pages

| Page | Route | Description |
|---|---|---|
| Login | /login | Email + password, JWT stored in localStorage |
| Dashboard | /dashboard | Stats overview, KB summary, quick actions |
| Knowledge Bases | /knowledge-bases | Create/edit KBs, configure chunk size & top-K, reindex with model switching |
| Documents | /knowledge-bases/:id/documents | Upload files, import URLs, preview chunks |
| FAQ | /knowledge-bases/:id/faq | Manage Q&A pairs, view embedding status |
| Retrieval Test | /retrieval | Test vector search and RAG answers interactively |
| Jobs | /jobs | Monitor ingestion pipeline progress per document |
| Logs | /logs | Retrieval logs and RAG query logs with latency |

📄 License

MIT License — see LICENSE for details.


Built with Django · React · Qdrant · OpenAI · Anthropic · Tailwind CSS
