Job description
THE ROLE
We're looking for a Senior MLOps Engineer who can set the standard for how we build, ship, and operate ML and AI systems at scale. You sit at the intersection of ML infrastructure and SRE - you'll own the path from model and pipeline to reliable production service, and you'll bring DevOps rigor to systems that have historically been under-engineered.
This is not a ticket-processing role, and it's not a research role. You'll tackle hard problems - model serving reliability, inference cost and latency, reproducible pipelines, agentic workload operations - and have the scope to solve them properly. Senior engineers here identify problems before being asked to, and raise the ceiling on what the platform can do.
WORK ON
Build and operate model and inference serving infrastructure - managing latency, throughput, autoscaling, and reliability for real-time and batch inference across multiple tenants.
Own the ML deployment lifecycle - model registry, versioning, promotion workflows, rollout strategies (canary, shadow, A/B), and safe rollback.
Operate agentic and LLM workloads in production - managing inference providers and gateways, quota and throttling behavior (TPS/TUPS limits), guardrails, prompt/version management, and graceful degradation under load.
Build reproducible, automated ML pipelines - training, evaluation, and deployment pipelines as code, with lineage and reproducibility built in.
Extend infrastructure-as-code to ML systems - Terraform patterns and multi-account design that bring ML infrastructure under the same standards as the rest of the platform.
Operate GitOps for ML workloads - ArgoCD configuration and promotion workflows across environments and tenants.
Run ML and AI workloads on multi-tenant Kubernetes (AWS EKS) - managing GPU/accelerator scheduling, workload placement, tenant isolation, and cost-aware capacity.
Own ML reliability and observability - SLOs for inference services, model and data drift detection, performance regression monitoring,