PRODUCTION AI SYSTEMS · FIELD DOCUMENTATION
Indian Legal AI Platform

Production-Grade RAG System — Architecture Deep Dive

LangGraph · Qdrant · FastAPI · Presidio · LLMOps · GDPR
Ambuj Kumar Tripathi
GenAI Solution Architect · RAG Systems Specialist
31,528
Chunks Indexed
6
Indian Legal Acts
768
Vector Dimensions
₹0
Monthly Cost

Indian Legal System is Inaccessible

5,000+
Legal Acts

India has thousands of central and state acts. The average citizen cannot readily access or understand them, and lawyers charge per query.

2023
IPC Replaced by BNS

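A major criminal code overhaul: the Bharatiya Nyaya Sanhita (BNS) replaces the Indian Penal Code (IPC). Outdated answers are actively dangerous, and a statically fine-tuned model goes stale as soon as the law changes.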

No
Simple Answer

Legal queries need source citations, exact section numbers, and context — not hallucinated summaries that could cause harm.

Solution → A production RAG system that retrieves exact legal provisions with source citations — and re-indexes in minutes when laws change.

Production RAG · Zero Cost · Live Right Now

31,528
Chunks Indexed
Sub-5s
Response Latency
₹0
Monthly Infrastructure
LangGraph 6-Node State Machine
CLASSIFY → PII MASK → RETRIEVE → GENERATE → LOG
Parent-Child Chunking
400-char search · 2000-char LLM context · single Qdrant lookup
SHA-256 Sync Engine
Zero orphaned vectors · idempotent · hash-based change detection
GDPR Compliant
Presidio PII masking · MongoDB 30-day TTL · Circuit Breaker

LangGraph 6-Node State Machine

USER (input) → CLASSIFY (greeting / rag / abuse) → RETRIEVE (PII mask, then Qdrant) → GENERATE (Qwen 3 235B) → POST PROCESS (MongoDB + Langfuse)
GREETING PATH
CLASSIFY → GREET → END
150 tokens · saves cost
ABUSIVE PATH
CLASSIFY → REJECT → END
NOT saved to MongoDB
RAG PATH
Full pipeline · Confidence gate
<40% → fallback, no LLM call
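
The three paths and the confidence gate can be sketched in plain Python. This is a library-free illustration, not the LangGraph graph itself: the keyword lists, the `run_pipeline` name, the `persist` flag, and the fallback wording are all hypothetical, and treating greetings as persisted is an assumption the slide does not confirm.

```python
import re
from typing import Callable, Dict, List, Tuple

CONFIDENCE_THRESHOLD = 0.40  # from the slide: below 40% means fallback, no LLM call

GREETINGS = {"hi", "hello", "hey", "namaste"}  # hypothetical keyword list
ABUSIVE = {"idiot", "stupid"}                  # hypothetical keyword list
FALLBACK = "No reliable source found for this query. Please rephrase."

Retriever = Callable[[str], List[Tuple[str, float]]]  # returns (text, score) pairs
Generator = Callable[[str, List[str]], str]           # stands in for the LLM call


def classify(query: str) -> str:
    """Toy classifier: returns 'greeting', 'abuse', or 'rag'."""
    words = set(re.findall(r"[a-z]+", query.lower()))
    if words & ABUSIVE:
        return "abuse"
    if words and words <= GREETINGS:
        return "greeting"
    return "rag"


def run_pipeline(query: str, retrieve: Retriever, generate: Generator) -> Dict[str, object]:
    """Route a query down one of the three paths shown on the slide."""
    route = classify(query)
    if route == "greeting":
        # Short canned reply; persisting it is an assumption.
        return {"route": "greeting", "answer": "Hello! Ask about an Indian legal act.", "persist": True}
    if route == "abuse":
        # Rejected queries are not saved (slide: NOT saved to MongoDB).
        return {"route": "abuse", "answer": "This query cannot be processed.", "persist": False}

    hits = retrieve(query)
    top_score = max((score for _, score in hits), default=0.0)
    if top_score < CONFIDENCE_THRESHOLD:
        # Confidence gate: skip the generator entirely.
        return {"route": "fallback", "answer": FALLBACK, "persist": True}

    contexts = [text for text, score in hits if score >= CONFIDENCE_THRESHOLD]
    return {"route": "rag", "answer": generate(query, contexts), "persist": True}
```

Passing the retriever and generator in as callables keeps the gate testable without a vector store or an LLM key.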

Parent-Child Chunking Strategy

❌ Simple Chunking Problem
Small chunks (400 chars): Precise search but poor LLM context
Large chunks (2000 chars): Good context but imprecise retrieval
✓ Two-Level Solution
Search: Child chunks (400 chars, 50 overlap) in Qdrant
Retrieve: Parent text (2000 chars) sent to LLM
📄 Constitution.pdf, Parent Chunk #001 (2000 chars): children #1 to #5 (400 chars each)
📄 Constitution.pdf, Parent Chunk #002 (2000 chars): children #6 to #10 (400 chars each)
Query hits child → retrieves parent → LLM gets rich context
Result: 28,352 child vectors · 3,176 parent chunks · precision + context
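
A minimal sketch of the two-level split, using the sizes from the slide (2000-char parents, 400-char children, 50-char overlap). The function and class names are hypothetical, and splitting on raw character offsets is an assumption; the production splitter may respect sentence or section boundaries.

```python
from dataclasses import dataclass
from typing import List, Tuple

PARENT_SIZE = 2000   # characters sent to the LLM
CHILD_SIZE = 400     # characters embedded and searched
CHILD_OVERLAP = 50   # characters shared between neighbouring children


@dataclass
class Child:
    text: str
    parent_id: int  # index of the parent chunk this child was cut from


def split(text: str, size: int, overlap: int = 0) -> List[str]:
    """Cut text into windows of `size` characters that overlap by `overlap`."""
    if overlap >= size:
        raise ValueError("overlap must be smaller than size")
    chunks, start = [], 0
    while start < len(text):
        chunks.append(text[start:start + size])
        if start + size >= len(text):
            break  # the window already reached the end of the text
        start += size - overlap
    return chunks


def build_index(document: str) -> Tuple[List[str], List[Child]]:
    """Return (parents, children); each child remembers its parent."""
    parents = split(document, PARENT_SIZE)
    children = [
        Child(text=piece, parent_id=pid)
        for pid, parent in enumerate(parents)
        for piece in split(parent, CHILD_SIZE, CHILD_OVERLAP)
    ]
    return parents, children


def context_for(hit: Child, parents: List[str]) -> str:
    """A search hit on a child resolves to the full parent text for the LLM."""
    return parents[hit.parent_id]
```

In a vector store the parent text (or its ID) would typically travel in the child's payload, which is what makes a single lookup sufficient.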

SHA-256 Sync Engine — Zero Orphaned Vectors

NEW FILE
No registry entry found
→ Index to Qdrant
→ Create registry row
→ Upsert vectors
HASH MISMATCH
File content changed
→ Delete ALL old vectors
→ Re-index fresh
→ Update registry hash
FILE REMOVED
In registry, not in storage
→ Delete vectors
→ Mark status = deleted
→ Audit trail preserved
HASH MATCH
Content unchanged
SKIP — zero API calls
→ Zero embedding cost
→ Zero quota consumed
Result: Multiple re-indexing cycles — zero orphaned vectors. Registry always in sync with Qdrant. Idempotent — safe to run twice.
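
The four cases above can be sketched as a planner that compares SHA-256 hashes in storage against a registry. The registry shape (`hash`, `status`), the action names, and both function names are hypothetical; the actual vector deletes and upserts are left as comments, and treating a re-added file whose registry row is marked deleted as new is an assumption.

```python
import hashlib
from typing import Dict, List, Tuple

Registry = Dict[str, Dict[str, str]]  # filename -> {"hash": ..., "status": ...}


def sha256_hex(content: bytes) -> str:
    return hashlib.sha256(content).hexdigest()


def plan_sync(storage: Dict[str, bytes], registry: Registry) -> List[Tuple[str, str]]:
    """Decide one action per file: index, reindex, skip, or delete."""
    actions = []
    for name in sorted(storage):
        entry = registry.get(name)
        if entry is None or entry["status"] == "deleted":
            actions.append(("index", name))      # NEW FILE
        elif entry["hash"] != sha256_hex(storage[name]):
            actions.append(("reindex", name))    # HASH MISMATCH
        else:
            actions.append(("skip", name))       # HASH MATCH: no embedding calls
    for name in sorted(registry):
        if name not in storage and registry[name]["status"] != "deleted":
            actions.append(("delete", name))     # FILE REMOVED
    return actions


def apply_plan(actions: List[Tuple[str, str]], storage: Dict[str, bytes], registry: Registry) -> None:
    """Update the registry; vector-store calls are indicated by comments only."""
    for action, name in actions:
        if action == "reindex":
            pass  # delete ALL old vectors for this file before re-embedding
        if action in ("index", "reindex"):
            # embed and upsert vectors here, then record the new hash
            registry[name] = {"hash": sha256_hex(storage[name]), "status": "indexed"}
        elif action == "delete":
            # delete vectors here; keep the row as an audit trail
            registry[name]["status"] = "deleted"
```

Running `plan_sync` again right after `apply_plan` yields only `skip` actions, which is the idempotence property the slide describes.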

GDPR Compliant · Zero PII Leakage · Multi-Layer Defense

01
PII Masking — Microsoft Presidio + spaCy
Aadhaar · Phone (\+91[6-9]\d{9}) · Email · Name are detected and masked BEFORE the query reaches the Qdrant API. Same semantic result, zero PII exposure.
02
Google OAuth 2.0 + JWT
Scope: openid email profile only — no Gmail access. JWT: HS256, 7-day expiry. Admin check via env variable. Frontend useCallback prevents auth re-render loops.
03
Rate Limiting + Circuit Breaker
SlowAPI: 5 req/min per IP. CircuitBreaker: 10 failures → OPEN 120s → HALF-OPEN → retry. Prevents cascading OpenRouter failures.
04
GDPR — MongoDB 30-Day TTL
Article 5(1)(e): Storage limitation. expireAfterSeconds=2592000 (30 days). Documents auto-delete via the TTL index, no cron job. Abusive queries are NOT saved to MongoDB, so logs stay clean.
05
Vector Isolation — Qdrant Payload Filters
3-field filter: source_file + is_temporary + uploaded_by=email. Enforced at DB level. User A cannot access User B's documents even with the same filename.
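
The pattern-based part of layer 01 can be illustrated with the standard library alone. This sketch is a stand-in, not Presidio: it uses plain regular expressions, the placeholder tokens and the `mask_pii` name are hypothetical, the Aadhaar pattern is a simplified 12-digit match, and name detection (which relies on spaCy NER in the real stack) is not covered.

```python
import re

# Order matters: phone is masked before Aadhaar so that "+91" plus ten
# digits is not mistaken for a 12-digit Aadhaar number.
PATTERNS = [
    ("<EMAIL>", re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")),
    ("<PHONE>", re.compile(r"(?<!\d)(?:\+91)?[6-9]\d{9}(?!\d)")),
    ("<AADHAAR>", re.compile(r"(?<!\d)\d{4}[\s-]?\d{4}[\s-]?\d{4}(?!\d)")),
]


def mask_pii(text: str) -> str:
    """Replace each detected entity with a placeholder before retrieval."""
    for placeholder, pattern in PATTERNS:
        text = pattern.sub(placeholder, text)
    return text
```

The masked string, not the original, is what would be embedded and sent to the retriever.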
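
The circuit breaker in layer 03 (10 failures → OPEN 120s → HALF-OPEN → retry) can be sketched as a small state machine. The class and exception names are hypothetical, as is the injectable `clock` parameter; re-opening immediately on a failed HALF-OPEN trial is an assumption about the retry behaviour.

```python
import time
from typing import Any, Callable, Optional


class CircuitOpenError(RuntimeError):
    """Raised when a call is refused because the circuit is OPEN."""


class CircuitBreaker:
    def __init__(self, failure_threshold: int = 10, recovery_seconds: float = 120.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.failure_threshold = failure_threshold
        self.recovery_seconds = recovery_seconds
        self.clock = clock
        self.failures = 0
        self.opened_at: Optional[float] = None  # None means CLOSED

    @property
    def state(self) -> str:
        if self.opened_at is None:
            return "CLOSED"
        if self.clock() - self.opened_at >= self.recovery_seconds:
            return "HALF-OPEN"
        return "OPEN"

    def call(self, fn: Callable[..., Any], *args: Any, **kwargs: Any) -> Any:
        state = self.state
        if state == "OPEN":
            raise CircuitOpenError("upstream calls are paused")
        try:
            result = fn(*args, **kwargs)
        except Exception:
            self.failures += 1
            # A failed trial call, or reaching the threshold, (re)opens the circuit.
            if state == "HALF-OPEN" or self.failures >= self.failure_threshold:
                self.opened_at = self.clock()
            raise
        # Any success closes the circuit and resets the failure count.
        self.failures = 0
        self.opened_at = None
        return result
```

Wrapping the LLM request in `breaker.call(...)` lets the API return a fast error while the upstream provider recovers, instead of queueing requests that are likely to time out.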

Real Numbers from Live Production

31,528
Total Chunks
28,352
Child Vectors in Qdrant
3,176
Parent Chunks for LLM
768
Jina Dimensions
Sub-5s
Response Latency
₹0
Monthly Cost

100% Free Tier · Production Deployed

ORCHESTRATION
LangGraph StateGraph
LangChain
Langfuse Observability
VECTOR DB
Qdrant Cloud (1GB free)
Jina AI Embeddings
Parent-Child Chunking
BACKEND
FastAPI + Uvicorn
Docker on Render
React + Vite (Vercel)
SECURITY
Microsoft Presidio PII
Google OAuth 2.0
SlowAPI + Circuit Breaker
DATABASES
MongoDB Atlas
Supabase Postgres
Upstash Redis
LLM
Qwen 3 235B (OpenRouter)
temperature=0.3 (lower randomness, to limit hallucination)
Confidence gate <40%