Site Reliability Engineer

Fournine Cloud

Telangana, India • Full-Time • On-site

Posted 2 days ago • Apply by June 14, 2026

Job Description

Company Description

Fournine Cloud, founded in 2016, is a trusted partner specializing in cloud transformation and DevOps solutions for businesses in fintech, telecommunications, and enterprise sectors. As an Advanced Consulting Partner with AWS and GCP, we deliver expertise in cloud migration, Kubernetes orchestration, and DevOps automation. Our certified team provides end-to-end services, from consulting and architectural design to implementation and ongoing support. We are committed to helping clients achieve measurable results like cost reductions, zero-disruption migrations, and scalable cloud-native solutions. Collaborate with us to accelerate your digital transformation journey.

Site Reliability Engineer (SRE) – 4+ Years of Experience

The Site Reliability Engineer is responsible for maintaining the reliability, availability, and performance of cloud-based applications and infrastructure. The role works closely with development and platform teams to support production systems, monitor platform health, resolve incidents, and improve operational efficiency. It also contributes to DevOps practices by supporting deployments, maintaining CI/CD pipelines, and automating operational tasks to improve system stability and reduce manual effort.

Responsibilities

Monitor production systems and services to ensure high availability and performance.
Analyse alerts, metrics, and logs to detect and troubleshoot system issues.
Respond to incidents and production alerts, ensuring timely resolution and minimal service disruption.
Perform root cause analysis (RCA) and collaborate with engineering teams to implement permanent fixes.
Monitor and support Kubernetes workloads including pods, deployments, and stateful services.
Maintain monitoring dashboards and configure alerting rules to improve system observability.
Assist with application deployments and release activities through CI/CD pipelines.
Troubleshoot issues across infrastructure, application, and networking layers.
Support cloud infrastructure operations and ensure efficient resource utilization.
Contribute to automation of operational and deployment tasks.
Participate in on-call rotations and operational support activities.
Maintain runbooks, operational documentation, and troubleshooting guides.

Skills

Linux administration, Google Cloud Platform (GCP) or similar cloud platforms, Kubernetes, Docker, CI/CD pipelines, Prometheus, Grafana, Kibana, cloud monitoring and logging tools, incident management, production monitoring, log analysis, troubleshooting distributed systems, scripting with Bash or Python, networking fundamentals (DNS, TCP/IP, ports, load balancing), DevOps practices, infrastructure support, and site reliability engineering (SRE) concepts.