Mobius - DevOps Engineer - GPU

Mobius by Gaian

Hyderabad, Telangana, India; Full-Time; On-site
Posted 3 months ago; apply by May 4, 2026

Job Description

About the Role

We are seeking an experienced DevOps Engineer to join our infrastructure team, with a strong focus on managing and optimizing GPU-based compute environments for machine learning and deep learning workloads. In this role, you will be responsible for the end-to-end infrastructure lifecycle, from provisioning with Terraform/Ansible to deploying ML models using modern frameworks like Hugging Face and Ollama.

Key Responsibilities

- Manage infrastructure using Terraform and Ansible
- Deploy and monitor Kubernetes clusters with GPU support (including NVIDIA drivers and H100 SXM integration)
- Implement and manage inference frameworks such as Ollama and Hugging Face
- Support containerization (Docker), logging (EFK), and monitoring (Prometheus/Grafana)
- Handle GPU resource scheduling, isolation, and scaling for ML/DL workloads
- Collaborate closely with developers, data scientists, and ML engineers to streamline deployments and improve performance

Required Skill Set

- 5-8 years of hands-on experience in DevOps and infrastructure automation
- Proven experience in managing GPU-based compute environments
- Strong understanding of Docker, Kubernetes, and Linux internals
- Familiarity with GPU server hardware and instance types
- Proficiency in scripting with Python and Bash
- Good understanding of ML model deployment, inference workflows, and resource utilization/metering

Nice To Have

- Experience with AI/ML pipelines
- Knowledge of cloud-native technologies (AWS/GCP/Azure) supporting GPU workloads
- Exposure to model performance benchmarking and A/B testing

(ref:hirist.tech)