MLOps Engineer - Azure/Kubernetes

The IT Firm

Bengaluru | Full-Time | 4–8 years
Posted 2 days ago | Apply by June 11, 2026

Job Description

Location: Bangalore (Work from Office / Hybrid)

Experience: 5 to 8 Years

Employment Type: Full-Time

About The Role

We are looking for a highly skilled Senior Ops/MLOps Engineer to drive the deployment, scalability, and operational excellence of GenAI, LLM, and Machine Learning workloads. This role requires deep expertise in the Azure cloud ecosystem, Kubernetes platforms, and modern LLMOps practices.

You will play a critical role in building reliable, scalable, and production-grade AI platforms, enabling seamless deployment of ML models, Large Language Models (LLMs), and GenAI-based applications.

Key Responsibilities

CI/CD & Automation

  • Design, implement, and maintain CI/CD/CT pipelines for ML models, LLMs, and GenAI workloads
  • Automate the end-to-end model lifecycle, including build, test, deployment, and monitoring
  • Implement GitOps-based deployment strategies using modern tooling

Model Deployment & Platform Engineering

  • Deploy and operationalize:
      i. Machine Learning models and custom LLMs
      ii. AI agents and GenAI applications
  • Work with platforms such as Azure Databricks, MLflow, AKS (Azure Kubernetes Service), and ARO (Azure Red Hat OpenShift)
  • Enable scalable and highly available model serving infrastructure

GenAI & LLM Ecosystem Integration

  • Integrate and manage GenAI services, including:
      i. Azure OpenAI / OpenAI APIs
      ii. Hugging Face models
      iii. Retrieval-Augmented Generation (RAG) pipelines
  • Work with vector databases such as FAISS, Pinecone, and Chroma
  • Support development and deployment of both custom-built and pre-trained AI models/agents

Databricks & ML Platform Management

  • Manage and optimize:
      i. Databricks Workspaces
      ii. Clusters and compute resources
      iii. MLflow Model Registry
      iv. Job orchestration pipelines
  • Ensure efficient resource utilization and tune performance

Kubernetes & Cloud Infrastructure

  • Own end-to-end lifecycle management of AKS / ARO clusters
  • Handle:
      i. Cluster provisioning and scaling
      ii. Networking and security configurations
      iii. Helm-based deployments
      iv. GitOps workflows
  • Ensure platform reliability and fault tolerance

Observability & Reliability Engineering

  • Implement robust monitoring and observability for AI/ML systems, covering:
      i. Model performance and latency
      ii. Data drift and model drift
      iii. System reliability and uptime
  • Establish alerting and incident response mechanisms

Security, Governance & Cost Optimization

  • Enforce cloud security best practices, IAM, and compliance policies
  • Implement governance frameworks for ML and AI workloads
  • Optimize infrastructure and cloud cost usage

Required Skills & Qualifications

  • Strong hands-on experience with:
      i. Microsoft Azure (mandatory)
      ii. Kubernetes (AKS/ARO)
      iii. Azure Databricks & MLflow
  • Experience with:
      i. LLMOps / MLOps practices
      ii. RAG pipelines and vector databases (FAISS, Pinecone, Chroma, etc.)
  • Proficiency in:
      i. Python and automation scripting
      ii. CI/CD tools (GitHub Actions preferred)
  • Solid understanding of:
      i. AI/ML system lifecycle and production deployments
      ii. Distributed systems and cloud-native architecture

Good To Have Skills

  • Experience with GenAI frameworks (LangChain, LlamaIndex, etc.)
  • Exposure to Helm and GitOps tools (ArgoCD / Flux)
  • Familiarity with containerization (Docker)
  • Knowledge of model evaluation, prompt engineering, and fine-tuning

What We're Looking For

  • Strong problem-solving and analytical mindset
  • Ability to work in a fast-paced, innovation-driven environment
  • Experience working with cross-functional teams (Data Science, Engineering, DevOps)
  • Ownership mindset with a focus on scalability and reliability

Why Join Us?

  • Work on cutting-edge GenAI & LLM technologies
  • Opportunity to build scalable AI platforms from the ground up
  • Collaborative and growth-focused environment

(ref:hirist.tech)