Enterprise RAG System for HR Knowledge Management

February 18, 2026 · 1 min · 104 words · Pushp Kharat

PKBoost AI Labs | Dec 2025 – Present

Lead systems engineer for an end-to-end RAG architecture, built for performance and scale.

Performance at Scale

Throughput: 100,000+ queries/second.
Latency: <5ms vector search, <300ms end-to-end response time.
Improvement: 10–100x faster than PostgreSQL pgvector baselines.
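
For context on what the vector search figure measures, the following is a minimal sketch of exact top-k retrieval by cosine similarity, the full-scan computation that an HNSW index approximates without visiting every vector. It is not the project's code: the function names `cosine` and `top_k` and the plain `Vec<f32>` embeddings are assumptions made for illustration.

```rust
/// Cosine similarity between two equal-length vectors.
/// Returns 0.0 if either vector has zero norm.
fn cosine(a: &[f32], b: &[f32]) -> f32 {
    assert_eq!(a.len(), b.len(), "vectors must have the same dimension");
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        0.0
    } else {
        dot / (norm_a * norm_b)
    }
}

/// Exact top-k search: score every chunk embedding against the query,
/// sort by descending similarity, and keep the best `k`.
/// Returns (chunk index, similarity) pairs.
fn top_k(query: &[f32], chunks: &[Vec<f32>], k: usize) -> Vec<(usize, f32)> {
    let mut scored: Vec<(usize, f32)> = chunks
        .iter()
        .enumerate()
        .map(|(i, v)| (i, cosine(query, v)))
        .collect();
    scored.sort_by(|a, b| b.1.total_cmp(&a.1));
    scored.truncate(k);
    scored
}
```

The scan costs one similarity computation per stored chunk on every query. A graph index such as HNSW trades exactness for visiting only a small fraction of the vectors.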

Technical Architecture

Stack: Rust (Axum), Tokio, USearch (in-memory HNSW), FastEmbed-rs, PostgreSQL, React.
Security: JWT auth, Argon2 hashing, rate limiting, SQL injection protection.
Ingestion: Multi-format pipeline (PDF, Excel, Word) with OCR and semantic chunking.
Deployment: Single binary (<50MB) handling 1,000+ concurrent connections.
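
As an illustration of the rate limiting item, here is a minimal token-bucket sketch using only the Rust standard library. The source does not say which algorithm or crate the project uses, so the token-bucket choice, the `TokenBucket` type, and its parameters are all assumptions. The current time is passed in explicitly so the behavior is easy to test.

```rust
use std::time::Instant;

/// Hypothetical per-client token bucket (illustrative, not the project's code).
/// Each request consumes one token; tokens refill at a fixed rate up to `capacity`.
struct TokenBucket {
    capacity: f64,
    tokens: f64,
    refill_per_sec: f64,
    last: Instant,
}

impl TokenBucket {
    /// Create a bucket that starts full.
    fn new(capacity: u32, refill_per_sec: f64, now: Instant) -> Self {
        TokenBucket {
            capacity: capacity as f64,
            tokens: capacity as f64,
            refill_per_sec,
            last: now,
        }
    }

    /// Returns true if the request is allowed at time `now`.
    fn try_acquire(&mut self, now: Instant) -> bool {
        // Refill only when time has moved forward.
        if now > self.last {
            let elapsed = now.duration_since(self.last).as_secs_f64();
            self.tokens = (self.tokens + elapsed * self.refill_per_sec).min(self.capacity);
            self.last = now;
        }
        if self.tokens >= 1.0 {
            self.tokens -= 1.0;
            true
        } else {
            false
        }
    }
}
```

In a server, one bucket would typically be kept per client key (for example per user or IP address) behind a mutex or a concurrent map, and a rejected request would map to an HTTP 429 response.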

Real-World Deployment

Deployed at a Fortune 500 company (under NDA), supporting 1,000+ employees with <5ms semantic search across 10,000+ document chunks.