AI / GenAI Engineer

Jayant Khandebharad

I build production-grade GenAI systems end-to-end — agentic platforms, RAG pipelines, voice AI, and LLMOps on Azure.

~/jayant — zsh

$ whoami
→ AI Engineer · end-to-end GenAI systems

$ stack --top
→ python · fastapi · langgraph
azure · rag · litellm · pgvector

$ focus
→ reliability · guardrails · cost

3+ yrs building production GenAI
10K+ documents indexed (GraphRAG)
+35% retrieval precision
400+ calls/mo handled by voice AI
−22% cloud cost
99.9% uptime

// selected work

Things I've built

Agentic AI · Multi-tenant Platform

Agentic AI Platform for Legal Workflows

A multi-tenant platform where every legal matter has an AI assistant — a supervisor agent delegates to specialist sub-agents and offloads long-running automation to a worker tier.

Supervisor + sub-agents
Strict per-tenant isolation
Queue-backed workers

Python · FastAPI · Multi-agent orchestration · Azure Service Bus · PostgreSQL + pgvector · +5 more

Built at CloudLex

RAG · Retrieval

GraphRAG over Legal Matters

A graph-aware retrieval pipeline that indexes 10K+ legal documents and answers case questions in roughly two seconds.

10K+ docs indexed
+35% retrieval precision
−40% latency

Python · LiteLLM · Cosmos DB · Embeddings & vector search · FastAPI · +1 more

Built at CloudLex

Voice AI · Event-driven Backend

AI Voice CallBot (PSTN)

Autonomous phone agents that place and answer real PSTN calls, deliver case updates, and hold a natural back-and-forth conversation.

400+ calls/mo autonomous
−60% response time
Zero dead-air

Azure Communication Services · Azure Functions (Python) · Azure OpenAI (GPT-4o) · Azure Speech (STT/TTS, SSML) · Redis · +2 more

Built at CloudLex

Full-stack · RAG

Chronological Medical Summary Generator

Type a case ID and get a dated, row-per-encounter clinical timeline extracted from scanned medical records — exportable as a branded PDF or Word document.

End-to-end RAG → timeline
User-defined columns
Branded PDF / DOCX

FastAPI · LangChain (StructuredOutputParser) · Azure OpenAI (GPT-4o) · Azure AI Search · React · +3 more

Built at CloudLex

Conversational AI · Full-stack

Embeddable AI Intake Assistant

An embeddable chat widget that drops onto any law-firm website with one script tag — qualifying new leads and answering existing clients’ case questions.

One-script-tag embed
Multi-tenant
No secret in the browser

TypeScript · Azure Bot Framework v4 · Direct Line · Microsoft WebChat · restify (Node 20) · +4 more

Built at CloudLex

Platform · LLMOps · DevOps

LLMOps & Deployment Platform

The delivery backbone for the GenAI products — multi-stage Azure pipelines, the right compute for each workload, fast rollbacks, and cost/latency observability.

−22% cloud cost
Zero-downtime rollouts
99.9% uptime

Azure DevOps · Azure Container Apps · Container App Jobs · Azure Functions · Docker · +3 more

Built at CloudLex

Foundations · PyTorch

Transformer LM from Scratch

A modern decoder-only Transformer built from first principles in PyTorch — so the production GenAI stack is never a black box.

BPE · RoPE · SwiGLU · RMSNorm
Pre-norm decoder
CS336 test contracts

Python · PyTorch · NumPy

Personal · self-study


// stack

What I work with

Languages

Python
TypeScript
SQL
C++

GenAI & Agentic

RAG & GraphRAG
Multi-agent orchestration
LangChain
LangGraph
LiteLLM
Prompt engineering
Evals & guardrails
Embeddings & vector search

LLM Providers

Azure OpenAI (GPT-4o)
Anthropic Claude
Google Gemini
Groq
Hugging Face

Voice & Document AI

Azure Communication Services
Azure Speech (STT/TTS, SSML)
Azure AI Search
OCR

Backend & Data

FastAPI
Azure Functions
Async workers & queues
Service Bus
PostgreSQL + pgvector
Cosmos DB
MySQL
Redis
REST APIs

Cloud, DevOps & LLMOps

Azure Container Apps
Container App Jobs
Docker
Azure DevOps CI/CD
Bicep
Application Insights / KQL
Cost & observability

Foundations

PyTorch
Transformers from scratch
Data Structures & Algorithms
System design

// experience

Where I've worked

Software Engineer II, Generative AI · CloudLex
Jan 2023 – Present
- Built a GraphRAG pipeline (LiteLLM + Cosmos DB on Azure) indexing 10K+ legal documents — improving retrieval precision ~35% and cutting latency ~40%.
- Automated model-deployment pipelines via Azure DevOps CI/CD, enabling zero-downtime rollouts across multi-tenant environments.
- Developed multiple LLM chatbots (helpdesk, multi-tenant intake, PIP/MVA) with FastAPI and LangChain, automating client intake and cutting manual workload ~50%.
- Created a natural-language voice/IVR system with Azure Speech and PSTN integration that handles 400+ monthly calls autonomously, reducing response time ~60%.
- Designed multi-agent LLM systems for summarization, document review, and task generation using state-machine orchestration and Redis caching.
- Optimized token usage, API throughput, and inference speed, cutting cloud costs ~22% while maintaining 99.9% uptime across the GenAI products.
Software Developer Intern · Eaton
Jun 2022 – Jul 2022
- Built and optimized Angular UI modules for customer management, improving render performance, and standardized REST API integration.


// contact

Let's build something

I'm open to AI / GenAI engineering roles where I can own systems end-to-end. The fastest way to reach me is email.

jntkhandebharad@gmail.com
