AWS Cloud Lead/Architect (OPS+ Application Support)

Hyderabad, Telangana, India

3 weeks ago

Applicants: 0

Salary Not Disclosed

5 days left to apply

Job Description

About us

ConglomerateIT is certified and is a pioneer in providing premium end-to-end Global Workforce Solutions and IT Services to diverse clients across various domains. Visit us at http://www.conglomerateit.com

Our mission is to establish global cross-culture human connections that further the careers of our employees and strengthen the businesses of our clients. We are driven to use the power of our global network to connect businesses with the right people without bias. We provide Global Workforce Solutions with affability.

About job

Job Title: AWS Cloud Lead/Architect (OPS+Application Support)
Location: Hyderabad (onsite)
Experience Level: 10+ years

About the Role

We're looking for an experienced Cloud Engineering Operations Lead to ensure our AWS platforms and customer-facing applications remain secure, stable, observable, and cost-efficient. You'll take ownership of production environments, lead incident management, optimize costs, and ensure every release, recovery, and runbook runs like clockwork.

Key Responsibilities

Oversee AWS operations across EC2, EKS, RDS, ALB/CloudFront, IAM/OIDC, and VPC/TGW/SGs, including patching and system hygiene.
Ensure application support readiness: manage runbooks, smoke testing, release validation, and rollback strategies.
Maintain complete observability through dashboards, logs, metrics, traces, synthetics, and alert health.
Manage backup and disaster recovery end-to-end: define policies, schedules, retention, cross-region copies, restore testing, and DR documentation with measurable RPO/RTO.
Lead Sev-1/2 incident bridges, maintain clear communications, and ensure post-mortems result in actionable resolutions.
Drive cost optimization with tagging, right-sizing, SP/RI coverage, and lifecycle cleanup (EBS/EIP/AMIs).
Empower the team with guardrails, golden runbooks, and automation (Terraform, Ansible, Python) to reduce repetitive tasks.

What You'll Do Day-to-Day

Triage and prioritize overnight alerts or hot issues, ensuring clarity of ownership.
Keep dashboards accurate and alerts healthy: no noise, no misses.
Review backups and restore points regularly, addressing any gaps proactively.
Support releases and ensure predictable, clean deployments.
Document learnings into runbooks to make the next fix faster and easier.
Lead or delegate break/fix issues to ensure no unresolved incidents linger.

Weekly & Monthly Rhythm

Weekly:
Run Ops Reviews covering incidents, deploys, alerts, costs, capacity, and backups.
Conduct Observability Tune-Ups: remove noise, add missing metrics, and validate synthetics.
Execute Backup/DR restore tests and maintain RPO/RTO evidence.
Review patches and changes for success, rollbacks, and learnings.

Monthly:
Report service availability, SLOs, MTTR, failure rates, backup compliance, and cost insights.
Eliminate recurring issues (e.g., noisy alerts or flaky deploys).
Refresh top-used runbooks and validate DR for a key workload (tabletop or live restore).

Success Looks Like

One clear, reliable dashboard per service with accurate alert routing and low false positives.
100% backup success or retried jobs; successful monthly restore tests; documented RPO/RTO.
MTTR trending down, with issues resolved by first responders using runbooks.
Fewer failed changes month-over-month with predictable release rollouts.
Cost optimization stable or improved against growth with ≥95% tagging compliance.

What You Bring

8-10+ years of hands-on experience in cloud and application operations, with deep AWS expertise.
Strong incident leadership and problem-solving skills with proven post-mortem execution.
Solid automation experience using Terraform, Ansible, and Python to simplify operations.
Proven success in running backups and DR drills with real restore validations.
Strong understanding of cloud networking (VPC, TGW, SGs) and AWS operational best practices.
Passion for reliability engineering, operational excellence, and mentoring teams toward self-sufficiency.

Why Join Us

You'll be the backbone of cloud stability, owning reliability, visibility, and cost efficiency across modern AWS environments. If you're passionate about automation, uptime, and smooth releases, this role will let you shape an environment where on-call is calm and operations just flow.

Additional Information

Company Name
ConglomerateIT India
Industry
N/A
Department
N/A
Role Category
N/A
Job Role
Mid-Senior level
Education
No Restriction
Job Types
On-site
Gender
No Restriction
Notice Period
Less Than 30 Days
Year of Experience
1 - Any Yrs
Job Posted On
3 weeks ago
Application Ends
5 days left to apply
