Site Reliability Engineer

Actively Reviewing the Applications

PT. Indosat Tbk

On-site
Posted 5 hours ago • Apply by June 11, 2026

Job Description

Role Summary

We are seeking a skilled and passionate Site Reliability Engineer (SRE) to ensure the reliability, scalability, and performance of our hybrid and cloud-native infrastructure. You will play a critical role in automating operations, improving system resilience, and supporting mission-critical services running across Kubernetes and cloud environments. This role is ideal for engineers who enjoy solving complex infrastructure challenges, building automation, and improving platform reliability at scale.

Job Description (1/2)

Reliability & System Performance

  • Maintain high availability, scalability, and performance of production systems.
  • Define and monitor SLIs, SLOs, and error budgets to ensure service reliability.
  • Perform root cause analysis, incident response, and postmortem reviews.
  • Implement reliability improvements and proactive failure prevention.

Cloud & Kubernetes Platform Management

  • Manage and optimize workloads running on Google Kubernetes Engine (GKE) and OpenShift.
  • Support multi-cluster and hybrid infrastructure environments.
  • Implement autoscaling and high-availability architecture.

CI/CD, GitOps & Release Engineering

  • Design and maintain CI/CD pipelines using GitLab CI/CD.
  • Implement GitOps deployment workflows using Argo CD.
  • Implement safe deployment strategies including:

🔹 Infrastructure as Code & Automation

  • Provision and manage infrastructure using Terraform / OpenTofu.
  • Develop and maintain Helm charts for Kubernetes deployments.
  • Automate operational tasks using Python scripting to reduce manual toil.

Job Description (2/2)

🔹 Observability, Monitoring & Distributed Tracing

  • Implement centralized logging using Grafana Loki and ELK Stack.
  • Build dashboards and alerts using Grafana and Datadog.
  • Implement distributed tracing using OpenTelemetry to improve system visibility.
  • Improve monitoring coverage and alert accuracy.

🔹 Performance & Load Testing

  • Conduct load and stress testing using tools such as k6, Locust, or JMeter.
  • Analyze performance bottlenecks and implement tuning strategies.
  • Support capacity planning and performance optimization.

🔹 Data Streaming & Integration

  • Support Change Data Capture (CDC) and real-time data streaming pipelines.
  • Work with Confluent Platform / Apache Kafka to ensure reliable event-driven data flow.

🔹 Security & Secret Management

  • Manage secrets securely using Google Cloud Secret Manager, Kubernetes Secrets, and HashiCorp Vault.
  • Implement secure CI/CD and platform access practices.

Education

Bachelor’s degree in Computer Science, Informatics, Information Systems, Electrical Engineering, Mathematics/Statistics, or a related field.

Experience

  • 0–4 years of experience in SRE, DevOps, Cloud Engineering, or Platform Engineering.
  • Hands-on experience supporting production systems and cloud infrastructure.

Technical Skills

  • Strong Linux system administration and networking fundamentals.
  • Hands-on experience with Kubernetes and containerized environments.
  • Experience designing and maintaining CI/CD pipelines.
  • Infrastructure as Code experience with Terraform and Ansible.
  • Helm chart development and Kubernetes deployment management.
  • Monitoring, logging, and observability best practices.
  • Programming/scripting skills in Bash and Python (Go is a plus).
  • Familiarity with Google Cloud Platform (GCP).
