Senior DevOps Engineer

Ahmedabad, Ahmedabad, India

3 weeks ago

Applicants: 0

Salary Not Disclosed

2 days left to apply

Job Description

Job Title: DevOps Engineer
Location: Ahmedabad
Department: Engineering & Infrastructure
Reports To: CTO
________________________________________
About Omnidya Tech LLP

Omnidya is building India's first advanced AI-powered dashcam ecosystem for fleet management, safety analytics, and smart transportation. Our platform fuses edge AI processing (ADAS, DMS, ANPR, telematics) with secure cloud connectivity (AWS IoT, S3, MQTT, and real-time streaming). We are seeking a DevOps Engineer to scale our infrastructure, automate build and deployment pipelines, and manage GPU-based AI compute clusters both on-premise and in the cloud.
________________________________________
Role Overview

As a DevOps Engineer, you will play a crucial role in automating deployments, managing distributed edge-cloud systems, and maintaining our GPU training and inference environments. You'll work closely with the AI, firmware, and backend teams to ensure smooth CI/CD workflows, optimal GPU utilization, and high system reliability.
________________________________________
Key Responsibilities

CI/CD & Automation
• Design, build, and maintain CI/CD pipelines using GitLab CI, Jenkins, or GitHub Actions for backend, AI, and firmware builds.
• Automate testing and deployment for Yocto-based embedded systems.
• Create Docker containers and deployment scripts for AI inference and cloud microservices.

Cloud & Infrastructure Management
• Manage and scale AWS infrastructure (IoT Core, EC2, ECR, CloudWatch, Lambda, Route 53).
• Set up and maintain Terraform or CloudFormation for Infrastructure as Code (IaC).
• Implement robust monitoring, alerting, and log aggregation using Prometheus, Grafana, ELK, or CloudWatch.

GPU Rack & Compute Cluster Management
• Manage on-premise GPU servers / AI training racks (Ubuntu-based, multi-GPU systems).
• Configure, optimize, and monitor GPU utilization for PyTorch / TensorFlow workloads.
• Handle CUDA driver updates, containerized training environments, and model deployment pipelines.
• Automate job scheduling using Slurm, Docker Swarm, or Kubernetes for GPU workloads.
• Monitor performance metrics (GPU load, memory, thermals, power usage) to ensure stable training and inference operations.

Device Integration & Fleet Management
• Streamline OTA (Over-The-Air) update pipelines for connected edge devices.
• Manage provisioning, authentication, and status monitoring of thousands of IoT devices.
• Ensure robust MQTT, REST API, and video data sync between dashcams and the cloud.

Security & Compliance
• Implement AWS IAM policies, TLS/SSL certificates, and secure OTA mechanisms.
• Collaborate on device and cloud-level security hardening for regulatory compliance (BIS, ICAT).

Documentation & Collaboration
• Document automation flows, deployment topologies, and infrastructure standards.
• Collaborate with AI, embedded, and backend teams to align deployment processes across systems.
________________________________________
Required Skills & Experience

Experience
• 3-7 years of experience in DevOps, Cloud Infrastructure, or Site Reliability Engineering.

Technical Skills
• Linux system administration (Ubuntu, Yocto, Debian)
• Containerization: Docker, Podman, Kubernetes (preferably K3s / MicroK8s)
• CI/CD Tools: GitLab CI, Jenkins, GitHub Actions
• Cloud Platforms: AWS (EC2, IoT Core, S3, Lambda, CloudWatch)
• IaC: Terraform, CloudFormation
• Monitoring: Prometheus, Grafana, ELK Stack
• Networking: VPN, DNS, load balancing, NAT, SSL certificates
• GPU Systems:
  o Hands-on with NVIDIA GPU drivers, CUDA, cuDNN, TensorRT
  o Experience with GPU workload management, thermal/power profiling, and optimization
  o Familiarity with multi-GPU training, inference scaling, and model deployment

Bonus Skills
• Experience with embedded Linux (Yocto, NXP)
• Understanding of RTMP/FLV streaming pipelines or GStreamer
• Familiarity with Python microservices (FastAPI / Flask)
• Knowledge of AI/ML model lifecycle management (training → quantization → edge inference)
________________________________________
Soft Skills
• Strong analytical and problem-solving mindset.
• Excellent communication and cross-functional collaboration.
• Passion for automation, reliability, and scalability.
• Ability to work independently in a fast-paced startup environment.
________________________________________
What We Offer
• Competitive salary and performance-based bonuses.
• Opportunity to work on cutting-edge edge-AI + GPU infrastructure projects.
• Exposure to AWS, IoT, AI training clusters, and fleet-scale deployment systems.
• Hybrid work setup and rapid growth opportunities in a high-impact product team.

Additional Information

Company Name
Omnidya India
Industry
N/A
Department
N/A
Role Category
N/A
Job Role
Mid-Senior level
Education
No Restriction
Job Types
On-site
Gender
No Restriction
Notice Period
Less Than 30 Days
Year of Experience
1 - Any Yrs
Job Posted On
3 weeks ago
Application Ends
2 days left to apply
