Platform · LLMOps · DevOps

LLMOps & Deployment Platform

The delivery backbone for the GenAI products: multi-stage Azure DevOps pipelines, the right compute for each workload, fast rollbacks, and cost and latency observability.

~22% lower cloud cost
Zero-downtime rollouts
99.9% uptime

Azure DevOps · Azure Container Apps · Container App Jobs · Azure Functions · Docker · Bicep · Application Insights / KQL · LiteLLM

Built at CloudLex

// problem

The problem

Several GenAI services needed to ship to multiple environments many times a day without downtime, and LLM spend needed to be visible and controllable.

// approach

What I built

Parameterized multi-stage pipelines build and deploy each component independently — HTTP APIs to Container Apps, batch/indexing jobs to Container App Jobs, lightweight triggers to Functions.
Every build is immutably tagged, so a rollback is a one-command revision switch.
Token usage and latency are tracked per feature via the model gateway and Application Insights/KQL, so LLM spend stays visible per feature and can be controlled.
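To make the per-feature tracking concrete, here is a minimal sketch of the aggregation in Python. It is illustrative only: on the platform this kind of roll-up runs over gateway and Application Insights telemetry, and the record fields (`feature`, `prompt_tokens`, `completion_tokens`, `latency_ms`), the `summarize_by_feature` helper, and the prices are all assumptions made for the example.

```python
import math
from collections import defaultdict
from dataclasses import dataclass

# Hypothetical USD prices per 1,000 tokens; real prices depend on the model.
PRICE_PER_1K_USD = {"prompt": 0.0025, "completion": 0.01}


@dataclass(frozen=True)
class LlmCall:
    """One LLM request as it might be logged by a gateway (assumed shape)."""
    feature: str
    prompt_tokens: int
    completion_tokens: int
    latency_ms: float


def p95(values):
    """Nearest-rank 95th percentile of a non-empty collection."""
    ordered = sorted(values)
    rank = math.ceil(0.95 * len(ordered))
    return ordered[rank - 1]


def summarize_by_feature(calls):
    """Group calls by feature and compute token totals, cost, and p95 latency."""
    grouped = defaultdict(list)
    for call in calls:
        grouped[call.feature].append(call)

    summary = {}
    for feature, group in grouped.items():
        prompt = sum(c.prompt_tokens for c in group)
        completion = sum(c.completion_tokens for c in group)
        cost = (
            prompt * PRICE_PER_1K_USD["prompt"]
            + completion * PRICE_PER_1K_USD["completion"]
        ) / 1000
        summary[feature] = {
            "calls": len(group),
            "prompt_tokens": prompt,
            "completion_tokens": completion,
            "cost_usd": cost,
            "p95_latency_ms": p95(c.latency_ms for c in group),
        }
    return summary
```

Keying every number by feature is what lets a cost or latency regression be traced to the part of the product that caused it, instead of showing up as one undifferentiated LLM bill.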

// architecture

How it fits together

Git push → Multi-stage Azure DevOps pipeline → Build + immutable image tag → Container Apps / Jobs / Functions → New revision (instant rollback) → App Insights + KQL observability

// decisions

Key technical decisions

Right compute primitive per workload

HTTP APIs on Container Apps (autoscale on concurrency), batch jobs on Container App Jobs (scale to zero between runs), short triggers on Functions — optimizing both cost and cold-start.
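The mapping above can be read as a small lookup table that a parameterized pipeline consults per component. The sketch below shows that idea in Python; the enum names, the `compute_for` helper, and the component names are hypothetical and only illustrate the decision, not the actual pipeline definition.

```python
from enum import Enum


class Workload(Enum):
    HTTP_API = "http_api"
    BATCH_JOB = "batch_job"
    TRIGGER = "trigger"


class Compute(Enum):
    CONTAINER_APP = "Azure Container Apps"
    CONTAINER_APP_JOB = "Container App Jobs"
    FUNCTION = "Azure Functions"


# One compute primitive per workload type, as described in the text.
TARGETS = {
    Workload.HTTP_API: Compute.CONTAINER_APP,       # autoscale on concurrency
    Workload.BATCH_JOB: Compute.CONTAINER_APP_JOB,  # scale to zero between runs
    Workload.TRIGGER: Compute.FUNCTION,             # short, lightweight triggers
}


def compute_for(workload: Workload) -> Compute:
    """Return the deployment target for a workload type."""
    return TARGETS[workload]


# Hypothetical components, each deployed independently to its own target.
COMPONENTS = {
    "chat-api": Workload.HTTP_API,
    "document-indexer": Workload.BATCH_JOB,
    "upload-trigger": Workload.TRIGGER,
}

deployment_plan = {name: compute_for(kind).value for name, kind in COMPONENTS.items()}
```

Keeping the choice in one table means a new component only has to declare its workload type to get the matching target.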

Immutable tags + revisions

Deploying by immutable build ID means every release is a new revision, so a rollback only switches traffic back to a revision that has already been built and run; nothing is rebuilt.
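The reasoning behind immutable tags can be shown with a small in-memory model. This is a sketch under assumptions, not the platform's code: the `Service` class, its methods, and the build IDs are hypothetical, and no Azure API is called.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Service:
    """Toy model of a service whose releases are immutable revisions."""
    name: str
    revisions: List[str] = field(default_factory=list)  # build IDs in deploy order
    active: Optional[str] = None

    def deploy(self, build_id: str) -> None:
        # Immutable tags: a build ID is never reused or overwritten.
        if build_id in self.revisions:
            raise ValueError(f"build {build_id!r} was already deployed")
        self.revisions.append(build_id)
        self.active = build_id

    def rollback(self, build_id: Optional[str] = None) -> str:
        """Point traffic at an existing revision (default: the one before active)."""
        if build_id is None:
            if self.active is None:
                raise ValueError("nothing has been deployed yet")
            index = self.revisions.index(self.active)
            if index == 0:
                raise ValueError("no earlier revision to roll back to")
            build_id = self.revisions[index - 1]
        elif build_id not in self.revisions:
            raise ValueError(f"unknown revision {build_id!r}")
        # Only the active pointer changes; the revision history is untouched.
        self.active = build_id
        return build_id
```

Because a rollback never rebuilds anything, it returns exactly the artifact that was running before, which is why it can be treated as both fast and safe.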

// outcomes

Outcomes

~22% lower cloud cost
Zero-downtime, component-by-component rollouts
99.9% uptime maintained

Proprietary — source not public.

Want to talk through any of this?

jntkhandebharad@gmail.com