About me
AI Engineer with 3+ years building production Generative-AI systems end-to-end at CloudLex — agentic platforms, RAG, voice AI, and LLMOps on Azure. I design Python backends for LLM orchestration (async workers, queues, and distributed RAG pipelines) and own delivery across the full life cycle: design, development, deployment, and monitoring — with a focus on reliability, guardrails, and cost.
I'm most useful when I can take a vague problem all the way to a deployed, monitored service, and I care as much about guardrails, cost, and reliability as about the model itself.
End-to-end, for real
Design
System architecture, data models, and the agent / RAG design that decides whether a feature is reliable or flaky.
Build
Python backends for LLM orchestration — async workers, queues, APIs — plus the frontends when needed.
Ship
Docker, Azure Container Apps / Functions, CI/CD and IaC, so it deploys many times a day without downtime.
Operate
Observability, guardrails, evals, and cost/latency tuning — keeping the system honest in production.
Skills
Languages
- Python
- TypeScript
- SQL
- C++
GenAI & Agentic
- RAG & GraphRAG
- Multi-agent orchestration
- LangChain
- LangGraph
- LiteLLM
- Prompt engineering
- Evals & guardrails
- Embeddings & vector search
LLM Providers
- Azure OpenAI (GPT-4o)
- Anthropic Claude
- Google Gemini
- Groq
- Hugging Face
Voice & Document AI
- Azure Communication Services
- Azure Speech (STT/TTS, SSML)
- Azure AI Search
- OCR
Backend & Data
- FastAPI
- Azure Functions
- Async workers & queues
- Service Bus
- PostgreSQL + pgvector
- Cosmos DB
- MySQL
- Redis
- REST APIs
Cloud, DevOps & LLMOps
- Azure Container Apps
- Container App Jobs
- Docker
- Azure DevOps CI/CD
- Bicep
- Application Insights / KQL
- Cost & observability
Foundations
- PyTorch
- Transformers from scratch
- Data Structures & Algorithms
- System design
Experience
Software Engineer II, Generative AI · CloudLex
Jan 2023 – Present
- Built a GraphRAG pipeline (LiteLLM + Cosmos DB on Azure) indexing 10K+ legal documents, improving retrieval precision ~35% and cutting latency ~40%.
- Automated model-deployment pipelines via Azure DevOps CI/CD, enabling zero-downtime rollouts across multi-tenant environments.
- Developed multiple LLM chatbots (helpdesk, multi-tenant intake, PIP/MVA) with FastAPI and LangChain, automating client intake and cutting manual workload ~50%.
- Created a natural-language voice/IVR system with Azure Speech and PSTN integration that handles 400+ monthly calls autonomously, reducing response time ~60%.
- Designed multi-agent LLM systems for summarization, document review, and task generation using state-machine orchestration and Redis caching.
- Optimized token usage, API throughput, and inference speed — cutting cloud costs ~22% — and maintained 99.9% uptime across the GenAI products.
Software Developer Intern · Eaton
Jun 2022 – Jul 2022
- Built and optimized Angular UI modules for customer management and standardized REST API integration, improving render performance.
Education & achievements
Education
- B.E. Computer Engineering · Jul 2019 – Jul 2023 · Pune Institute of Computer Technology (PICT) · CGPA 9.08 / 10
- Higher Secondary (XII) · Jun 2017 – May 2019 · Deulgaon Raja Junior College · 82.77%
Achievements
- Smart India Hackathon 2022 — National Finalist
- Solved 620+ LeetCode problems
- CodeChef rating 1801
- MHT-CET — 99.71 percentile (Top 1%)