DevOps/ SRE Lead - Bash/ Python/ Go, Docker, Kubernetes, Helm, Cloud, Prometheus/ Grafana

Bengaluru, Karnataka, India

1 month ago

Applicants: 0

Salary Not Disclosed

1 month left to apply

Job Description

Role Description

Job Summary

We are seeking an experienced Site Reliability Engineer (SRE) with advanced DevOps expertise to help build, scale, and maintain our infrastructure and services. You will play a critical role in ensuring the high availability, performance, scalability, and security of our production systems, while enabling continuous deployment and rapid delivery of features to our customers.

Key Responsibilities

Design, build, and maintain reliable, scalable, and secure cloud-based infrastructure (AWS, Azure, or GCP).
Develop and improve observability using monitoring, alerting, logging, and tracing tools (e.g., Prometheus, Grafana, ELK, Datadog).
Automate repetitive tasks and infrastructure using Infrastructure-as-Code (Terraform, CloudFormation, Pulumi).
Create and maintain CI/CD pipelines (e.g., GitHub Actions, GitLab CI, Jenkins, ArgoCD) to support fast and safe delivery.
Lead incident response, root cause analysis, and postmortems to ensure high uptime and rapid recovery.
Optimize system performance, reliability, and cost-effectiveness through proactive monitoring and tuning.
Collaborate with software engineering teams to define SLAs/SLOs and improve service reliability.
Implement and maintain security best practices across environments (e.g., secrets management, IAM, firewalls).
Maintain disaster recovery plans, backups, and high-availability strategies.

Qualifications

Required

7+ years of experience as an SRE, DevOps Engineer, or in a similar role.
Proficiency in scripting and automation (Bash, Python, Go, etc.).
Strong experience with containerization and orchestration (Docker, Kubernetes, Helm).
Solid understanding of Linux systems administration and networking fundamentals.
Experience with cloud platforms (AWS, Azure, or GCP).
Experience with IaC tools such as Terraform or CloudFormation.
Familiarity with GitOps and modern deployment practices.
Hands-on experience with observability tools (e.g., Prometheus, Grafana, Datadog).
Strong troubleshooting and incident response skills.

Preferred

Experience in a high-traffic, microservices-based architecture.
Exposure to service meshes (Istio, Linkerd).
Certifications (AWS Certified DevOps Engineer, CKA, etc.).
Experience with security automation and compliance (e.g., SOC 2, ISO 27001).

Soft Skills

Strong communication and collaboration abilities.
Ability to thrive in a fast-paced, agile environment.
Analytical mindset and proactive approach to problem-solving.
A passion for automation, performance, and system design.

Skills

DevOps/ SRE, Bash/ Python/ Go, Docker, Kubernetes, Helm, Cloud, Prometheus/ Grafana

Additional Information

Company Name
UST
Industry
N/A
Department
N/A
Role Category
SRE (Site Reliability Engineer)
Job Role
Mid-Senior level
Education
No Restriction
Job Types
On-site
Gender
No Restriction
Notice Period
Less Than 30 Days
Year of Experience
1 - Any Yrs
Job Posted On
1 month ago
Application Ends
1 month left to apply

Similar Jobs

Azure Cloud
IPolarity
3 weeks ago

Senior DevOps Engineer
CleverTap
3 weeks ago

Senior Site Reliability Engineer
Avigna AB
3 weeks ago

Network Engineer - WAN
Miratech
1 month ago

DevOps Engineer
Christy Media Solutions
1 month ago

L2 .Net Support Engineer
Accolite
1 month ago

System Engineer I - TRCS Configuration
Tesco Bengaluru
3 weeks ago

Azure Data developer - Senior Associate
PwC Acceleration Center India
1 month ago

Senior Software Engineer (.NET Fullstack)
Zenoti
1 month ago

Site Reliability Engineer
iSoftStone
1 month ago