Site Reliability Engineer
Mumbai Metropolitan Region
1 day ago
Applicants: 0
3 weeks left to apply
Job Description
Job Summary
The Site Reliability Engineer (SRE) is a core member of the IT Infrastructure team and plays a critical role in ensuring the reliability, scalability, and performance of our enterprise software systems. The ideal candidate will have deep expertise in AWS cloud services, production deployment methodologies, and infrastructure automation, and will work collaboratively across multiple technical teams to deliver robust, scalable solutions.

Key Responsibilities

Infrastructure & Cloud Operations
- Design, implement, and maintain highly available, scalable infrastructure on the AWS cloud platform
- Manage AWS services including EC2, RDS, S3, VPC, CloudFormation, Lambda, ECS/EKS, and monitoring services
- Optimize cloud resource utilization and cost management strategies
- Ensure security best practices and compliance across cloud infrastructure

Production Deployment & CI/CD
- Lead production deployment processes for enterprise software applications
- Design and implement robust CI/CD pipelines using tools such as Jenkins, GitLab CI, AWS CodePipeline, or similar platforms
- Establish deployment strategies including blue-green deployments, canary releases, and rollback procedures
- Monitor and troubleshoot production systems to ensure minimal downtime and optimal performance

Infrastructure as Code & Automation
- Develop and maintain infrastructure as code using tools like Terraform, CloudFormation, or AWS CDK
- Create automation scripts and tools to reduce manual operational overhead
- Implement configuration management using tools such as Ansible, Puppet, or Chef
- Build self-healing systems and automated monitoring solutions

Scripting & Programming
- Write efficient scripts in Python, Bash, Go, or other relevant programming languages
- Develop tools for system monitoring, alerting, and operational efficiency
- Contribute to internal tooling and automation frameworks
- Debug and optimize existing automation and deployment scripts

Networking & Security
- Configure and manage cloud networking components including VPCs, subnets, security groups, and load balancers
- Implement network security best practices and troubleshoot connectivity issues
- Manage DNS, CDN, and other network services
- Ensure proper network segmentation and access controls

Collaboration & Communication
- Work closely with DevOps, Database Administrators, System Administrators, and Software Development teams
- Participate in on-call rotation and incident response procedures
- Lead post-incident reviews and implement preventive measures
- Communicate technical concepts clearly to both technical and non-technical stakeholders

Required Skills And Experience
- Minimum 3 years of experience in Site Reliability Engineering, DevOps, or a similar role
- 5+ years preferred, with demonstrated progression in responsibility and technical expertise
- Extensive hands-on experience with AWS cloud services and SysOps operations
- Proven track record in production deployment of enterprise software systems
- Strong understanding of CI/CD concepts and implementation experience
- Proficiency in infrastructure as code tools and methodologies
- Advanced scripting abilities in Python, Bash, Go, or similar programming languages
- Solid understanding of cloud networking concepts, security groups, VPCs, and load balancing
- Experience with containerization technologies (Docker, Kubernetes)
- Knowledge of monitoring and observability tools (CloudWatch, Prometheus, Grafana, ELK stack)
- Familiarity with database administration and performance optimization
- Understanding of security best practices and compliance frameworks
- Excellent professional written and spoken English communication skills
- Strong analytical and problem-solving abilities
- Experience working in cross-functional team environments
- Ability to work independently and manage multiple priorities effectively
- Customer-focused mindset with attention to detail

Good To Have
- AWS certifications (Solutions Architect, SysOps Administrator, or DevOps Engineer)
- Experience with microservices architecture and serverless technologies
- Knowledge of disaster recovery and business continuity planning
- Background in performance tuning and capacity planning
- Experience with agile development methodologies
- Previous experience in enterprise environments with high availability requirements

Educational Qualifications
- Bachelor's degree in Computer Science, Information Technology, or a related field (or equivalent experience)

Location: Akasa Air Head Office - Mumbai

Akasa Air does not solicit or accept any form of payment from candidates or institutions during its recruitment process. Any such claims are fraudulent and should be disregarded. Individuals engaging with unauthorized entities do so at their own risk. We encourage you to report any such incidents to [email protected] for appropriate action.

Additional Information
- Company Name: Akasa Air
- Industry: N/A
- Department: N/A
- Role Category: SRE (Site Reliability Engineer)
- Job Role: Mid-Senior level
- Education: No Restriction
- Job Types: On-site
- Gender: No Restriction
- Notice Period: Less Than 30 Days
- Year of Experience: 1 - Any Yrs
- Job Posted On: 1 day ago
- Application Ends: 3 weeks left to apply