Site Reliability Engineer

PwC India

Bengaluru | Full-Time | 4–8 years
Posted 2 days ago | Apply by June 11, 2026

Job Description

Opportunity

We are looking for SREs who want to define what reliability means for the next generation of industrial software. The role involves defining SLIs/SLOs, building observability platforms, and establishing incident management processes.


Responsibilities

  • Define and implement SLI/SLO frameworks for complex engineering systems across manufacturing and industrial clients
  • Design and deploy observability platforms using Prometheus, Grafana, and Datadog
  • Establish incident management processes and lead blameless post-mortems
  • Implement chaos engineering practices to proactively identify system weaknesses
  • Drive toil elimination through automation and platform improvements
  • Build reliability engineering capabilities within the practice and client organisations


Essential Skills

  • SLI/SLO definition and implementation at enterprise scale
  • Observability: Prometheus, Grafana, Datadog, New Relic
  • Incident management and post-mortem facilitation
  • Chaos engineering: Gremlin, Chaos Monkey, Litmus
  • Python testing for reliability validation and automated runbooks
  • Automation and scripting: Python, Go, Bash
  • Cloud platforms: AWS, Azure, GCP


Experience

5–10 years in SRE or Production Engineering roles with experience in enterprise or industrial environments
