Site Reliability Engineer
HireAlpha
Bengaluru, Karnataka, India
Full-Time
On-site
Posted 3 months ago
Apply by May 4, 2026
Job Description
Experience: 8+ Years
Responsibilities:
Ensure high availability, performance, and scalability of mission-critical systems and services.
Lead the design and implementation of resilient and fault-tolerant infrastructure.
Drive incident response, root cause analysis, and postmortem culture. Mentor others in incident practices.
Write and maintain operational documentation, runbooks, and architecture diagrams.
Drive and promote protocols on production readiness and operational excellence.
Own and evolve infrastructure automation using Terraform or similar tools to minimize human intervention.
Help automate infrastructure provisioning and other engineering processes by building automations on top of an engineering platform written in GitHub Actions.
Build internal platforms, tools, and frameworks to improve developer productivity and service reliability.
Work closely with software engineers, platform teams, and product managers to align on company goals.
Coach and upskill other engineering team members.
Skills and Qualifications:
8–12+ years in SRE, DevOps, or related infrastructure-focused roles.
Understand large-scale complex systems from a reliability perspective.
Design, implement and maintain processes and tools.
Passion for producing clean, standards-compliant, secure code.
A developer mindset, applied to infrastructure.
Strong experience with Linux/Unix systems.
Deep experience with Kubernetes.
Deep experience with tools like Terraform, Ansible, Helm.
Strong scripting skills for automating tasks in Python, Bash, or another scripting language.
Experience with at least one relational and one non-relational database (e.g., PostgreSQL, MySQL, MongoDB, Redis, Elasticsearch).
Ability to identify time-consuming and error-prone manual tasks and then build or leverage tooling to automate them.
Ability to identify root causes of instability in a large-scale distributed system across stacks.
Experience leading high-severity incident responses and postmortems.
Nice to haves / Pluses:
Experience with cloud platforms such as Amazon Web Services (AWS), Google Cloud, or Microsoft Azure.
Experience supporting scalable databases such as PostgreSQL or MongoDB in production.
Understanding of cost