Indian Legal AI — Ambuj Kumar Tripathi

PRODUCTION AI SYSTEMS · FIELD DOCUMENTATION

Indian Legal AI Platform

Production-Grade RAG System — Architecture Deep Dive

LangGraph Qdrant FastAPI Presidio LLMOps GDPR

Ambuj Kumar Tripathi

GenAI Solution Architect · RAG Systems Specialist

31,528

Chunks Indexed

6

Indian Legal Acts

768

Vector Dimensions

₹0

Monthly Cost

The Problem

Indian Legal System is Inaccessible

5,000+

Legal Acts

India has thousands of central and state acts. The average citizen cannot easily access or understand them, and lawyers charge per query.

2023

IPC Replaced by BNS

Major criminal code overhaul. Old information is actively dangerous. Static fine-tuned models go stale immediately.

No

Simple Answer

Legal queries need source citations, exact section numbers, and context — not hallucinated summaries that could cause harm.

Solution → A production RAG system that retrieves exact legal provisions with source citations — and re-indexes in minutes when laws change.

Solution Overview

Production RAG · Zero Cost · Live Right Now

31,528

Chunks Indexed

Sub-5s

Response Latency

₹0

Monthly Infrastructure

LangGraph 6-Node State Machine

CLASSIFY → PII MASK → RETRIEVE → GENERATE → LOG

Parent-Child Chunking

400-char search · 2000-char LLM context · single Qdrant lookup

SHA-256 Sync Engine

Zero orphaned vectors · idempotent · hash-based change detection

GDPR Compliant

Presidio PII masking · MongoDB 30-day TTL · Circuit Breaker

Core Architecture

LangGraph 6-Node State Machine

USER

input

CLASSIFY

greeting/rag/abuse

RETRIEVE

PII → Qdrant

GENERATE

Qwen 3 235B

POST PROCESS

MongoDB+Langfuse

GREETING PATH

CLASSIFY → GREET → END
150 tokens · saves cost

ABUSIVE PATH

CLASSIFY → REJECT → END
NOT saved to MongoDB

RAG PATH

Full pipeline · Confidence gate
<40% → fallback, no LLM call
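
The three paths can be sketched in plain Python. This is a stand-in for the LangGraph conditional edge, not the project's code: the keyword sets, the function names, the reply strings, and the reading of the 40% gate as "best retrieval score below 0.40" are all assumptions made for illustration.

```python
import re
from typing import Any, Callable, Dict, Tuple

GREETINGS = {"hi", "hello", "namaste"}   # hypothetical keyword list
ABUSIVE = {"idiot", "stupid"}            # hypothetical keyword list
CONFIDENCE_GATE = 0.40                   # assumed to apply to the best retrieval score


def classify(query: str) -> str:
    """Return 'abuse', 'greeting' or 'rag' using simple keyword matching."""
    words = set(re.findall(r"[a-z]+", query.lower()))
    if words & ABUSIVE:
        return "abuse"
    if words & GREETINGS:
        return "greeting"
    return "rag"


def run(
    query: str,
    retrieve: Callable[[str], Tuple[float, str]],
    generate: Callable[[str, str], str],
) -> Dict[str, Any]:
    """Route one query down the greeting, abusive or RAG path."""
    route = classify(query)
    state: Dict[str, Any] = {"query": query, "route": route, "save_to_db": True}
    if route == "greeting":
        # Short canned reply, no retrieval and no large-model call.
        state["answer"] = "Hello! Ask me about an Indian legal provision."
        return state
    if route == "abuse":
        # Rejected queries are not persisted.
        state["answer"] = "Please rephrase your question respectfully."
        state["save_to_db"] = False
        return state
    confidence, context = retrieve(query)
    state["confidence"] = confidence
    if confidence < CONFIDENCE_GATE:
        # Below the gate: fallback message, generate() is never called.
        state["answer"] = "No sufficiently relevant provision was found."
        return state
    state["answer"] = generate(query, context)
    return state
```

In a LangGraph build, `classify` would typically be the function passed to a conditional edge, and each branch would be its own node.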

Core Innovation

Parent-Child Chunking Strategy

❌ Simple Chunking Problem

Small chunks (400 chars): Precise search but poor LLM context
Large chunks (2000 chars): Good context but imprecise retrieval

✓ Two-Level Solution

Search: Child chunks (400 chars, 50 overlap) in Qdrant
Retrieve: Parent text (2000 chars) sent to LLM

📄 Constitution.pdf · Parent Chunk #001 (2000 chars): child #1 to child #5 (400c each)

📄 Constitution.pdf · Parent Chunk #002 (2000 chars): child #6 to child #10 (400c each)

Query hits child → retrieves parent → LLM gets rich context
Result: 28,352 child vectors · 3,176 parent chunks · precision + context
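
A minimal sketch of the two-level split, using the sizes quoted above (2000-char parents, 400-char children, 50-char overlap). The function names, the payload keys, and the choice to copy the parent text into each child's payload (one way to get the parent back in a single lookup) are assumptions, not the project's confirmed design.

```python
from typing import Dict, List

PARENT_SIZE = 2000
CHILD_SIZE = 400
CHILD_OVERLAP = 50


def split(text: str, size: int, overlap: int = 0) -> List[str]:
    """Fixed-size character windows; consecutive windows share `overlap` chars."""
    chunks: List[str] = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + size])
        if start + size >= len(text):
            break  # the window already reached the end of the text
        start += size - overlap
    return chunks


def build_index(text: str, source_file: str) -> List[Dict[str, str]]:
    """Return child records; only child['text'] would be embedded and searched."""
    children: List[Dict[str, str]] = []
    for p_idx, parent in enumerate(split(text, PARENT_SIZE)):
        parent_id = f"{source_file}#parent-{p_idx:03d}"
        for child in split(parent, CHILD_SIZE, CHILD_OVERLAP):
            children.append({
                "text": child,            # small chunk used for vector search
                "parent_id": parent_id,
                "parent_text": parent,    # large chunk handed to the LLM
                "source_file": source_file,
            })
    return children


def expand_hits(hits: List[Dict[str, str]]) -> List[str]:
    """Map child hits to their parent texts, dropping duplicate parents."""
    seen = set()
    contexts: List[str] = []
    for hit in hits:
        if hit["parent_id"] not in seen:
            seen.add(hit["parent_id"])
            contexts.append(hit["parent_text"])
    return contexts
```

Two child hits from the same parent yield one context block, so the LLM prompt does not repeat the same 2000 characters.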

Unique Engineering

SHA-256 Sync Engine — Zero Orphaned Vectors

NEW FILE

No registry entry found
→ Index to Qdrant
→ Create registry row
→ Upsert vectors

HASH MISMATCH

File content changed
→ Delete ALL old vectors
→ Re-index fresh
→ Update registry hash

FILE REMOVED

In registry, not in storage
→ Delete vectors
→ Mark status = deleted
→ Audit trail preserved

HASH MATCH

Content unchanged
→ SKIP — zero API calls
→ Zero embedding cost
→ Zero quota consumed

Result: Multiple re-indexing cycles — zero orphaned vectors. Registry always in sync with Qdrant. Idempotent — safe to run twice.
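
The four cases above can be sketched as a pure planning function over file hashes. The registry layout (`hash` and `status` keys), the action names, and the function names are hypothetical. The real engine would also issue the Qdrant deletes and upserts that this sketch only names.

```python
import hashlib
from typing import Dict


def sha256_hex(content: bytes) -> str:
    return hashlib.sha256(content).hexdigest()


def plan_sync(storage: Dict[str, bytes], registry: Dict[str, Dict[str, str]]) -> Dict[str, str]:
    """Decide one action per file: index, reindex, skip or delete."""
    actions: Dict[str, str] = {}
    for name, content in storage.items():
        row = registry.get(name)
        if row is None or row["status"] == "deleted":
            actions[name] = "index"      # NEW FILE
        elif row["hash"] != sha256_hex(content):
            actions[name] = "reindex"    # HASH MISMATCH: drop old vectors first
        else:
            actions[name] = "skip"       # HASH MATCH: no embedding calls
    for name, row in registry.items():
        if name not in storage and row["status"] == "active":
            actions[name] = "delete"     # FILE REMOVED
    return actions


def apply_plan(
    storage: Dict[str, bytes],
    registry: Dict[str, Dict[str, str]],
    actions: Dict[str, str],
) -> Dict[str, Dict[str, str]]:
    """Return the registry as it should look after the actions have run."""
    updated = {name: dict(row) for name, row in registry.items()}
    for name, action in actions.items():
        if action in ("index", "reindex"):
            updated[name] = {"hash": sha256_hex(storage[name]), "status": "active"}
        elif action == "delete":
            updated[name]["status"] = "deleted"  # row kept as audit trail
    return updated
```

Running `plan_sync` a second time on the updated registry returns only `skip` actions, which is the idempotence property described above.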

Enterprise Security

GDPR Compliant · Zero PII Leakage · Multi-Layer Defense

01

PII Masking — Microsoft Presidio + spaCy

Aadhaar · Phone (\+91[6-9]\d{9}) · Email · Name detected and masked BEFORE the query reaches the Qdrant API. The legal content of the query is unchanged, so retrieval is unaffected while no PII is exposed.
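
A regex-only sketch of the masking step. It is a simplified stand-in for Presidio, not its API: the Aadhaar and email patterns and the placeholder tokens are assumptions, and name detection is left out because it needs an NER model such as the spaCy pipeline Presidio uses. The phone pattern follows the one quoted above, with the leading plus escaped so it is matched literally.

```python
import re
from typing import List, Pattern, Tuple

# Order matters: phones are masked before Aadhaar so that the 12 digits of
# "+91XXXXXXXXXX" are not mistaken for an Aadhaar number.
PII_PATTERNS: List[Tuple[str, Pattern[str]]] = [
    ("<EMAIL>", re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")),
    ("<PHONE>", re.compile(r"\+91[6-9]\d{9}")),
    # Assumed Aadhaar shape: 12 digits, first digit 2-9, optional 4-4-4 grouping.
    ("<AADHAAR>", re.compile(r"\b[2-9]\d{3}[ -]?\d{4}[ -]?\d{4}\b")),
]


def mask_pii(text: str) -> str:
    """Replace each detected PII span with a fixed placeholder token."""
    for placeholder, pattern in PII_PATTERNS:
        text = pattern.sub(placeholder, text)
    return text
```

The masked string, not the raw query, is what would be embedded and sent to the vector store.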

02

Google OAuth 2.0 + JWT

Scope: openid email profile only — no Gmail access. JWT: HS256, 7-day expiry. Admin check via env variable. Frontend useCallback prevents auth re-render loops.
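
To make the token settings concrete, here is a standard-library sketch of HS256 signing with a 7-day expiry. It illustrates the mechanism only; a deployed service would normally rely on a maintained JWT library. The claim names chosen (`sub`, `iat`, `exp`) and the function names are assumptions.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Any, Dict, Optional

SEVEN_DAYS = 7 * 24 * 60 * 60  # 604800 seconds


def _b64url(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def _sign(signing_input: str, secret: str) -> str:
    digest = hmac.new(secret.encode(), signing_input.encode(), hashlib.sha256).digest()
    return _b64url(digest)


def issue_token(email: str, secret: str, now: Optional[int] = None) -> str:
    """Build header.payload.signature with exp set 7 days after iat."""
    now = int(time.time()) if now is None else now
    header = {"alg": "HS256", "typ": "JWT"}
    payload = {"sub": email, "iat": now, "exp": now + SEVEN_DAYS}
    signing_input = ".".join(
        _b64url(json.dumps(part, separators=(",", ":")).encode())
        for part in (header, payload)
    )
    return f"{signing_input}.{_sign(signing_input, secret)}"


def verify_token(token: str, secret: str, now: Optional[int] = None) -> Optional[Dict[str, Any]]:
    """Return the payload if the signature is valid and the token is unexpired."""
    now = int(time.time()) if now is None else now
    try:
        head, body, signature = token.split(".")
    except ValueError:
        return None  # not three dot-separated parts
    expected = _sign(f"{head}.{body}", secret)
    if not hmac.compare_digest(expected.encode(), signature.encode()):
        return None
    payload = json.loads(_b64url_decode(body))
    return payload if payload["exp"] > now else None
```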

03

Rate Limiting + Circuit Breaker

SlowAPI: 5 req/min per IP. CircuitBreaker: 10 failures → OPEN 120s → HALF-OPEN → retry. Prevents cascading OpenRouter failures.
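
The breaker cycle (10 failures → OPEN for 120s → HALF-OPEN → retry) can be sketched as a small class. The class and attribute names are hypothetical, and the injectable `clock` exists only so the behaviour can be checked without waiting two minutes. Per-IP rate limiting is a separate layer and is not shown.

```python
import time
from typing import Any, Callable, Optional


class CircuitOpenError(RuntimeError):
    """Raised when a call is rejected because the circuit is open."""


class CircuitBreaker:
    def __init__(
        self,
        failure_threshold: int = 10,
        recovery_seconds: float = 120.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.failure_threshold = failure_threshold
        self.recovery_seconds = recovery_seconds
        self.clock = clock
        self.failures = 0
        self.opened_at: Optional[float] = None

    @property
    def state(self) -> str:
        if self.opened_at is None:
            return "closed"
        if self.clock() - self.opened_at >= self.recovery_seconds:
            return "half-open"  # cool-down elapsed, one trial call allowed
        return "open"

    def call(self, fn: Callable[..., Any], *args: Any, **kwargs: Any) -> Any:
        if self.state == "open":
            raise CircuitOpenError("upstream calls are paused")
        try:
            result = fn(*args, **kwargs)
        except Exception:
            self.failures += 1
            # A failed trial call, or reaching the threshold, (re)opens the circuit.
            if self.state == "half-open" or self.failures >= self.failure_threshold:
                self.opened_at = self.clock()
            raise
        # Any success closes the circuit and clears the failure count.
        self.failures = 0
        self.opened_at = None
        return result
```

Wrapping the LLM request in `breaker.call(...)` means that once the provider is failing, further requests fail fast instead of waiting on timeouts.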

04

GDPR — MongoDB 30-Day TTL

Article 5(1)(e): Storage limitation. expireAfterSeconds=2592000 (30 days). Auto-delete via TTL index, no cron job. Abusive queries are NOT saved to MongoDB, so the store is not polluted.
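
The TTL value is simply 30 days expressed in seconds. The sketch below shows the arithmetic and the shape of a record that such an index would expire. The field name `created_at` and the helper names are assumptions. With pymongo, the index would be created with something like `collection.create_index([("created_at", 1)], expireAfterSeconds=TTL_SECONDS)`.

```python
from datetime import datetime, timedelta, timezone
from typing import Any, Dict, Optional

TTL_SECONDS = int(timedelta(days=30).total_seconds())  # 30 * 24 * 60 * 60 = 2592000


def ttl_index_spec(field: str = "created_at") -> Dict[str, Any]:
    """Arguments for a single-field TTL index (field name is hypothetical)."""
    return {"keys": [(field, 1)], "expireAfterSeconds": TTL_SECONDS}


def new_chat_record(user_email: str, masked_query: str,
                    now: Optional[datetime] = None) -> Dict[str, Any]:
    """The TTL field must hold a date value; a UTC datetime is used here."""
    return {
        "user_email": user_email,
        "query": masked_query,
        "created_at": now or datetime.now(timezone.utc),
    }


def is_expired(record: Dict[str, Any], now: datetime) -> bool:
    """Mirror of the server-side rule: expired once created_at + TTL has passed."""
    return now - record["created_at"] >= timedelta(seconds=TTL_SECONDS)
```

MongoDB removes expired documents with a background task that runs periodically, so a record can remain for a short time after its expiry instant.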

05

Vector Isolation — Qdrant Payload Filters

3-field filter: source_file + is_temporary + uploaded_by=email. Enforced at DB level. User A cannot access User B's documents even with same filename.
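
A sketch of the 3-field filter as the JSON body Qdrant accepts in a search request (`must` conditions with `key` and `match.value`). The helper names are hypothetical, and `matches` is a local simulation of the same rule, included only to show the isolation property.

```python
from typing import Any, Dict


def build_user_filter(source_file: str, uploaded_by: str,
                      is_temporary: bool = True) -> Dict[str, Any]:
    """All three conditions must hold for a point to be returned."""
    return {
        "must": [
            {"key": "source_file", "match": {"value": source_file}},
            {"key": "is_temporary", "match": {"value": is_temporary}},
            {"key": "uploaded_by", "match": {"value": uploaded_by}},
        ]
    }


def matches(payload: Dict[str, Any], query_filter: Dict[str, Any]) -> bool:
    """Local check equivalent to the 'must' clause, for illustration only."""
    return all(
        payload.get(cond["key"]) == cond["match"]["value"]
        for cond in query_filter["must"]
    )
```

Because `uploaded_by` is part of the filter, two users who upload files with the same name still see only their own vectors.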

Production Metrics

Real Numbers from Live Production

31,528

Total Chunks

28,352

Child Vectors in Qdrant

3,176

Parent Chunks for LLM

768

Jina Dimensions

Sub-5s

Response Latency

₹0

Monthly Cost

Technology Stack

100% Free Tier · Production Deployed

ORCHESTRATION

LangGraph StateGraph

LangChain

Langfuse Observability

VECTOR DB

Qdrant Cloud (1GB free)

Jina AI Embeddings

Parent-Child Chunking

BACKEND

FastAPI + Uvicorn

Docker on Render

React + Vite (Vercel)

SECURITY

Microsoft Presidio PII

Google OAuth 2.0

SlowAPI + Circuit Breaker

DATABASES

MongoDB Atlas

Supabase Postgres

Upstash Redis

LLM

Qwen 3 235B (OpenRouter)

temp=0.3 to limit hallucination

Confidence gate <40%