Site Reliability Engineer
Fournine Cloud
Job Description
Company Description
Fournine Cloud, founded in 2016, is a trusted partner specializing in cloud transformation and DevOps solutions for businesses in fintech, telecommunications, and enterprise sectors. As an Advanced Consulting Partner with AWS and GCP, we deliver expertise in cloud migration, Kubernetes orchestration, and DevOps automation. Our certified team provides end-to-end services, from consulting and architectural design to implementation and ongoing support. We are committed to helping clients achieve measurable results such as cost reductions, zero-disruption migrations, and scalable cloud-native solutions. Collaborate with us to accelerate your digital transformation journey.
Site Reliability Engineer (SRE) – 4+ Years of Experience
The Site Reliability Engineer is responsible for maintaining the reliability, availability, and performance of cloud-based applications and infrastructure. The role works closely with development and platform teams to support production systems, monitor platform health, resolve incidents, and improve operational efficiency. It also contributes to DevOps practices by supporting deployments, maintaining CI/CD pipelines, and automating operational tasks to improve system stability and reduce manual effort.
Responsibilities
- Monitor production systems and services to ensure high availability and performance.
- Analyse alerts, metrics, and logs to detect and troubleshoot system issues.
- Respond to incidents and production alerts, ensuring timely resolution and minimal service disruption.
- Perform root cause analysis (RCA) and collaborate with engineering teams to implement permanent fixes.
- Monitor and support Kubernetes workloads including pods, deployments, and stateful services.
- Maintain monitoring dashboards and configure alerting rules to improve system observability.
- Assist with application deployments and release activities through CI/CD pipelines.
- Troubleshoot issues across infrastructure, application, and networking layers.
- Support cloud infrastructure operations and ensure efficient resource utilization.
- Contribute to automation of operational and deployment tasks.
- Participate in on-call rotations and operational support activities.
- Maintain runbooks, operational documentation, and troubleshooting guides.
Skills
- Linux administration
- Google Cloud Platform (GCP) or similar cloud platforms
- Kubernetes and Docker
- CI/CD pipelines and DevOps practices
- Prometheus, Grafana, Kibana, and cloud monitoring and logging tools
- Incident management, production monitoring, and log analysis
- Troubleshooting distributed systems
- Scripting with Bash or Python
- Networking fundamentals (DNS, TCP/IP, ports, load balancing)
- Infrastructure support and system reliability engineering concepts