
Staff/Senior DevOps Engineer (with Database focus)

Pune, Maharashtra, India

1 day ago

Salary Not Disclosed

3 weeks left to apply

Job Description

At HG Insights, we lead the way in technology intelligence, delivering AI-driven insights through advanced data science and scalable big data architecture. We're searching for a strong DevOps engineer to support and manage complex cloud operations and database systems. With the recent acquisitions of MadKudu and TrustRadius, we've created an agentic GTM ecosystem that eliminates manual handoffs, guesses, and siloed signals, and we need a strategic seller to take it to market.

Job Overview

Seeking a Staff/Senior DevOps Engineer experienced in managing data persistence and search layers, including ClickHouse, MongoDB, PostgreSQL, and Elasticsearch. The ideal candidate will have expertise in cloud-native operations across AWS and GCP, CI/CD automation using GitHub Actions and ArgoCD, Kubernetes management (EKS, GKE), and Infrastructure-as-Code with both legacy Terraform scripts and Terraform Cloud.

Key Responsibilities

Data Persistence & Search Management

Administer, monitor, and optimize ClickHouse, MongoDB, PostgreSQL, and Elasticsearch clusters for performance, security, and high availability.
Implement scaling strategies, backup and recovery solutions, and robust security policies for all data layers.
Troubleshoot search infrastructure (Elasticsearch) issues and maintain healthy indexes and resilient data pipelines.
Conduct capacity planning and performance analysis to optimize database infrastructure utilization and identify scaling constraints before they impact production.
Evaluate and implement vertical scaling strategies, read replicas, caching layers, and query optimization to maximize performance within infrastructure constraints.
Configure detailed database performance monitoring, including query analysis, slow query identification, connection pooling metrics, and resource utilization tracking.

CI/CD & Source Code Workflows

Develop and maintain CI pipelines in GitHub Actions, integrating automated testing, security scanning, and build stages.
Configure ArgoCD for automated, auditable deployments to Kubernetes (EKS, GKE), ensuring safe rollouts and rollbacks.
Oversee GitHub repository management, including branching strategies, pull request reviews, release tagging, and secrets handling.

Cloud & Kubernetes Operations

Provision, scale, and monitor AWS EKS and GCP GKE environments, ensuring secure and reliable application, database, and search workloads.
Automate Kubernetes resource management, including persistent storage and networking for all platforms.
Troubleshoot issues in clusters, deployments, and integrated services.

Infrastructure as Code

Maintain and refactor legacy Terraform configurations for infrastructure management.
Deploy and operate infrastructure using Terraform Cloud with remote state, workspace organization, and policy enforcement.

Collaboration & Documentation

Work closely with engineering teams to optimize deployment flows, monitor system health, and deliver scalable solutions.
Draft and own documentation for CI/CD workflows, Kubernetes management, and best practices for all managed data and search systems.
Assist in the transition from legacy to modern infrastructure management practices.

Observability & Monitoring

Design and implement comprehensive monitoring solutions using Datadog to ensure complete visibility across all infrastructure components, applications, and data layers.
Configure and maintain Datadog alerts, monitors, and notification channels with proper escalation policies for critical infrastructure events, database performance issues, and application health metrics.
Establish centralized logging strategies using Datadog Log Management for applications, databases, and Kubernetes clusters, ensuring log aggregation, retention policies, and correlation with metrics and traces.
Create and maintain Datadog dashboards for real-time monitoring of system performance, resource utilization, business metrics, and SLI/SLO tracking across all managed services.
Implement Datadog APM (Application Performance Monitoring) and distributed tracing to monitor application performance and troubleshoot complex microservices.

Required Skills & Qualifications

Total work experience: 10+ years.
5+ years of DevOps experience in cloud (AWS, GCP) and Kubernetes orchestration (EKS, GKE).
Direct expertise with ClickHouse, MongoDB, PostgreSQL, and Elasticsearch administration and automation.
Proficiency with GitHub and GitHub Actions, ArgoCD, and both legacy Terraform CLI and Terraform Cloud.
Advanced scripting skills (Python, Bash) for process automation.
Understanding of cloud security, IAM, persistent storage, and network best practices.
Experience with backup, recovery, and compliance for data and search systems.
Bachelor's degree in Computer Science, IT, or a related field, or equivalent experience.
Excellent collaboration and communication skills.

Preferred Skills

Experience with policy-as-code, automated compliance, and troubleshooting complex deployments.
Knowledge of hybrid architecture and cloud cost optimization.
Familiarity with Elasticsearch scaling and tuning, and integration with other data platforms.

Additional Information

Company Name
HG Insights
Industry
N/A
Department
N/A
Role Category
SRE (Site Reliability Engineer)
Job Role
Mid-Senior level
Education
No Restriction
Job Types
Remote
Gender
No Restriction
Notice Period
Less Than 30 Days
Year of Experience
1 - Any Yrs
Job Posted On
1 day ago
Application Ends
3 weeks left to apply
