MLOps Engineer
Yotta Data Services Private Limited
Mumbai, Maharashtra, India
Full-Time
Posted 4 months ago • Apply by May 4, 2026
Job Description
About the Role:
We're looking for a strategic Senior MLOps Engineer to lead the end-to-end design, implementation, and scaling of our AI infrastructure. You'll partner with researchers, product teams, and DevOps to turn prototypes into production services that meet strict SLAs for latency, reliability, and cost efficiency.
Responsibilities:
• Core MLOps Pipelines: Design and implement scalable ML pipelines (training, evaluation, deployment) for LLMs, CV, and multimodal models.
• Model Serving & CI/CD: Lead efforts in model serving, versioning, automated CI/CD, and real-time monitoring of AI workflows.
• Inference-as-a-Service: Build and optimize GPU-backed serving infrastructure targeting p99 latency < 100 ms, 99.9% uptime, and > 80% GPU utilization.
• Governance & Drift Detection: Drive initiatives on model governance, automated drift detection (≤ 10% false positives), and data-management best practices.
• Vector Search & Agent Orchestration: Integrate vector databases (Qdrant, Pinecone) for low-latency semantic retrieval, and build agentic workflows using LangChain or similar frameworks.
• Enterprise Multi-Tenancy: Architect RBAC-driven, isolated ML services to securely serve 100–500+ organizations.
• Observability & Logging: Design Prometheus/Grafana dashboards, ELK/Fluentd logging pipelines, and alerting for all ML workloads.
• CI/CD for Inference APIs: Maintain CI/CD pipelines for Python (FastAPI) and TypeScript (NestJS) inference services.
• Metrics & Cost Optimization: Define and track SLAs/SLOs, optimize cloud spend by ≥ 20% year-over-year, and ensure GPU clusters operate at > 80% utilization.
• Cross-Functional Leadership: Partner with AI researchers, product managers, and legal to align MLOps standards with compliance and roadmap goals.
• Mentorship & Community: Mentor junior engineers, run quarterly brown-bags, own onboarding docs (upskill 5+ engineers/quarter), and publish ≥ 1 open-source contribution or talk annually.
Requirements:
• 9–15 years in software engineering, including ≥ 4 years in MLOps or ML infrastructure
• Strong expertise in cloud platforms (AWS/GCP/Azure), Kubernetes, Docker, Terraform, Helm, Kubeflow, and MLflow
• Experience with inference frameworks (Triton, TensorFlow Serving, BentoML, TorchServe)
• Familiarity with distributed training, workload schedulers, and GPU-cluster orchestration
• Proficiency in Python, TypeScript, and infrastructure-as-code (Terraform, Helm, etc.)
• Proven track record building reliable, scalable ML systems in production
Plus These Critical Skills:
• Vector DB integration (Qdrant, Pinecone)
• Agent orchestration (LangChain, LlamaIndex)
• Multi-tenant security and RBAC
• Observability stacks (Prometheus/Grafana, ELK)
• CI/CD for FastAPI/NestJS services
Minimum Qualification:
Bachelor's or Master's Degree in Computer Science, Engineering, or a related field.
Preferred:
• Prior experience at AI-focused startups or enterprises scaling ML for 100–500 orgs
• Understanding of low-latency streaming inference or agent-based LLM systems
• Excellent written and verbal communication, and a proven ability to drive consensus across functions