Site Reliability Engineer
Apexon
Job Description
About the Company: Apexon is a digital-first technology services firm specializing in accelerating business transformation and delivering human-centric digital experiences. We have been meeting customers wherever they are in the digital lifecycle and helping them outperform their competition through speed and innovation. Apexon brings together distinct core competencies – in AI, analytics, app development, cloud, commerce, CX, data, DevOps, IoT, mobile, quality engineering and UX, and our deep expertise in BFSI, healthcare, and life sciences – to help businesses capitalize on the unlimited opportunities digital offers. Our reputation is built on a comprehensive suite of engineering services, a dedication to solving clients’ toughest technology problems, and a commitment to continuous improvement. Backed by Goldman Sachs Asset Management and Everstone Capital, Apexon now has a global presence of 15 offices (and 10 delivery centers) across four continents. We enable #HumanFirstDigital
About the Role: Lead Site Reliability Engineer (SRE)
Responsibilities:
Reliability Engineering & Architecture
- Define and implement SLIs, SLOs, and error budgets with product and engineering teams
- Conduct reliability reviews for new and existing services
- Design scalable, fault-tolerant architectures in AWS and Azure environments
- Lead capacity planning, performance and cost optimisation initiatives
- Improve system resilience through automation and self-healing patterns
- Drive organisational observability maturity (metrics, logs, traces, alert quality)
Incident Management & Continuous Improvement
- Perform complex root cause analysis and drive rapid mitigation
- Participate in blameless postmortems and follow through on the resulting actions
- Improve MTTR, reduce incident frequency, and elevate production standards
- Collaborate seamlessly with engineering teams to enable timely and effective resolutions
- Handle requests and incidents, create and maintain runbooks
- Participate in a structured 24x7 on-call rotation
Automation & Platform Engineering
- Reduce operational toil through tooling and automation (Python or similar)
- Improve CI/CD reliability and deployment safety mechanisms
- Build and maintain infrastructure-as-code (Terraform or equivalent)
- Enhance Kubernetes platform reliability (EKS, AKS, or similar)
Cross-Functional Leadership
- Partner with business, engineering, security, and cloud teams to embed reliability early in the software development life cycle
- Mentor mid-level engineers and help shape SRE best practices
- Champion a culture of ownership, accountability, and continuous improvement
Qualifications:
- Minimum Bachelor's degree in Computer Science, Engineering, or a related field, or equivalent professional experience.
- Experience in site reliability engineering, software engineering, or a related field, including production on-call experience.
- Solid experience with AWS and/or Azure, including setting up, monitoring, and maintaining cloud resources (incl. Kubernetes, EKS, AKS, GKE, etc.).
- Proficiency with observability tools.
- Hands-on experience with incident management tools.
- Proficiency in scripting languages for automation purposes.
- Demonstrated proficiency in troubleshooting, especially in cloud and distributed system environments.
- Excellent communication, teamwork and documentation skills, with a proactive and self-motivated approach to improving system reliability and operational efficiencies.
- We value and encourage candidates from diverse backgrounds and experiences, believing that diverse perspectives drive innovation and success.
- Excellent spoken and written English communication.
Our Commitment to Diversity & Inclusion: Did you know that Apexon has been Certified™ by Great Place To Work®, the global authority on workplace culture, in each of the four regions in which it operates: USA (for the seventh time in 2026), India (for the tenth consecutive time in 2026), the UK (for the fourth time in 2026) and Mexico (for the second time in 2026)? Apexon is committed to being an equal opportunity employer and promoting diversity in the workplace. We take affirmative action to ensure equal employment opportunity for all qualified individuals. Apexon strictly prohibits discrimination and harassment of any kind and provides equal employment opportunities to employees and applicants without regard to gender, race, color, ethnicity or national origin, age, disability, religion, sexual orientation, gender identity or expression, veteran status, or any other applicable characteristics protected by law. You can read about our Job Applicant Privacy policy here.
Our Commitment to Environment: Actively contribute to Apexon's commitment to environmental responsibility by following sustainable practices and supporting ESG initiatives.
Our Perks and Benefits: Our benefits and rewards program has been thoughtfully designed to recognise your skills and contributions, elevate your learning/upskilling experience and provide care and support for you and your loved ones. As an Apexon Associate, you get continuous skill-based development, opportunities for career advancement, and access to comprehensive health and well-being benefits and assistance. We also offer:
- Group Health Insurance covering a family of 4
- Term Insurance and Accident Insurance
- Paid Holidays & Earned Leaves
- Paid Parental Leave
- Learning & Career Development
- Employee Wellness