LLMOps & Deployment Platform

The delivery backbone for the GenAI products — multi-stage Azure pipelines, the right compute for each workload, fast rollbacks, and cost/latency observability.

  • ~22% lower cloud cost
  • Zero-downtime rollouts
  • 99.9% uptime
Azure DevOps · Azure Container Apps · Container App Jobs · Azure Functions · Docker · Bicep · Application Insights / KQL · LiteLLM

Built at CloudLex

The problem

Several GenAI services needed to ship to multiple environments many times a day without downtime, and LLM spend needed to be visible and controllable.

What I built

  • Parameterized multi-stage pipelines build and deploy each component independently — HTTP APIs to Container Apps, batch/indexing jobs to Container App Jobs, lightweight triggers to Functions.
  • Every build is immutably tagged, so a rollback is a one-command revision switch.
  • Token usage and latency are tracked per feature via the model gateway and Application Insights/KQL, so LLM spend is visible per feature and can be controlled.
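
As a rough illustration of the per-feature tracking described above, the sketch below groups LLM call records by feature and reports call count, token totals, estimated cost, and p95 latency. It is a minimal in-memory stand-in, not the platform's actual implementation: the `LlmCall` record, the feature labels, and the per-token prices are all hypothetical, and in the real system this aggregation would be done over gateway telemetry rather than a Python list.

```python
from collections import defaultdict
from dataclasses import dataclass
from math import ceil


@dataclass(frozen=True)
class LlmCall:
    """One LLM request as it might be logged (hypothetical record shape)."""
    feature: str            # product feature that issued the call
    prompt_tokens: int
    completion_tokens: int
    latency_ms: float


# Hypothetical prices in USD per 1,000 tokens; real prices depend on the model.
PROMPT_PRICE_PER_1K = 0.0005
COMPLETION_PRICE_PER_1K = 0.0015


def summarize_by_feature(calls):
    """Group calls by feature and report calls, total tokens, cost, and p95 latency."""
    grouped = defaultdict(list)
    for call in calls:
        grouped[call.feature].append(call)

    summary = {}
    for feature, items in grouped.items():
        prompt = sum(c.prompt_tokens for c in items)
        completion = sum(c.completion_tokens for c in items)
        latencies = sorted(c.latency_ms for c in items)
        # Nearest-rank p95: smallest sample with at least 95% of samples at or below it.
        p95 = latencies[ceil(0.95 * len(latencies)) - 1]
        cost = (prompt / 1000 * PROMPT_PRICE_PER_1K
                + completion / 1000 * COMPLETION_PRICE_PER_1K)
        summary[feature] = {
            "calls": len(items),
            "total_tokens": prompt + completion,
            "cost_usd": round(cost, 6),
            "p95_latency_ms": p95,
        }
    return summary


calls = [
    LlmCall("summarize", 1000, 200, 850.0),
    LlmCall("summarize", 2000, 400, 1200.0),
    LlmCall("search", 500, 100, 300.0),
]
report = summarize_by_feature(calls)
```

Keying every record by feature is what makes it possible to attribute spend and slow responses to a specific part of the product instead of to the model as a whole.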

How it fits together

Git push → Multi-stage Azure DevOps pipeline → Build + immutable image tag → Container Apps / Jobs / Functions → New revision (instant rollback) → App Insights + KQL observability

Key technical decisions

Right compute primitive per workload

HTTP APIs run on Container Apps (autoscaling on concurrency), batch jobs on Container App Jobs (scaling to zero between runs), and short triggers on Functions, which keeps both cost and cold-start latency down.
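
The routing rule above can be written down as a small decision function. This is only a sketch of the idea: the `Workload` fields and the example component names are hypothetical, and a real pipeline would more likely encode the target per component as a pipeline parameter.

```python
from dataclasses import dataclass
from enum import Enum


class Target(Enum):
    CONTAINER_APP = "Container Apps"
    CONTAINER_APP_JOB = "Container App Jobs"
    FUNCTION = "Functions"


@dataclass(frozen=True)
class Workload:
    """Traits of a deployable component (hypothetical fields for illustration)."""
    name: str
    serves_http: bool         # long-lived HTTP API
    runs_to_completion: bool  # batch or indexing work that exits when done


def choose_target(workload: Workload) -> Target:
    """Map a workload to the compute primitive described in the text."""
    if workload.serves_http:
        return Target.CONTAINER_APP
    if workload.runs_to_completion:
        return Target.CONTAINER_APP_JOB
    # Anything else is treated as a short, event-driven trigger.
    return Target.FUNCTION


api = Workload("chat-api", serves_http=True, runs_to_completion=False)
indexer = Workload("doc-indexer", serves_http=False, runs_to_completion=True)
trigger = Workload("blob-trigger", serves_http=False, runs_to_completion=False)
targets = {w.name: choose_target(w).value for w in (api, indexer, trigger)}
```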

Immutable tags + revisions

Deploying by immutable build ID means every release is a new revision, so a rollback is a switch back to a known previous revision, which makes it instant and safe.
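
To make the relationship between immutable tags and rollback concrete, here is a minimal in-memory model. It does not call Azure: the `RevisionedApp` class and the build IDs are hypothetical, and the only thing it demonstrates is that when a build ID can never be reused, rollback reduces to pointing traffic at an earlier entry in the revision history.

```python
class RevisionedApp:
    """Toy model of an app whose releases are immutable, ordered revisions."""

    def __init__(self):
        self._revisions = []  # build IDs in deploy order, never mutated in place
        self._active = None   # build ID currently receiving traffic

    @property
    def active(self):
        return self._active

    @property
    def revisions(self):
        return tuple(self._revisions)

    def deploy(self, build_id):
        """Create a new revision; reusing a build ID is rejected (immutability)."""
        if build_id in self._revisions:
            raise ValueError(f"build {build_id!r} already deployed; tags are immutable")
        self._revisions.append(build_id)
        self._active = build_id

    def rollback(self):
        """Switch traffic to the revision deployed just before the active one."""
        if self._active is None:
            raise RuntimeError("nothing has been deployed")
        index = self._revisions.index(self._active)
        if index == 0:
            raise RuntimeError("no earlier revision to roll back to")
        self._active = self._revisions[index - 1]
        return self._active


app = RevisionedApp()
app.deploy("build-101")
app.deploy("build-102")
rolled_back_to = app.rollback()
```

Note that the rollback leaves the newer revision in the history, so the switch can be reversed just as quickly once a fix is ready.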

Outcomes

  • ~22% lower cloud cost
  • Zero-downtime, component-by-component rollouts
  • 99.9% uptime maintained

Proprietary — source not public.

Want to talk through any of this?

jntkhandebharad@gmail.com