GCP Data Engineer (Senior)
Infogain
Bengaluru
Full-Time
Posted 2 days ago • Apply by June 11, 2026
Job Description
Infogain is a human-centered digital platform and software engineering company based out of Silicon Valley. We engineer business outcomes for Fortune 500 companies and digital natives in the technology, healthcare, insurance, travel, telecom, and retail & CPG industries using technologies such as cloud, microservices, automation, IoT, and artificial intelligence. We accelerate experience-led transformation in the delivery of digital platforms. Infogain is also a Microsoft (NASDAQ: MSFT) Gold Partner and Azure Expert Managed Services Provider (MSP).
Infogain, an Apax Funds portfolio company, has offices in California, Washington, Texas, the UK, the UAE, and Singapore, with delivery centers in Seattle, Houston, Austin, Kraków, Noida, Gurgaon, Mumbai, Pune, and Bengaluru.
Required Skills & Experience
- 6–10 years of experience in data engineering or analytics, including 3+ years of hands-on GCP experience.
- Strong experience with PySpark, Dataproc, GCS, BigQuery, and JDBC ingestion.
- Proven experience migrating SAS workloads to PySpark or SQL-based systems.
- Hands-on knowledge of the Medallion architecture (Bronze/Silver/Gold) on GCP.
- Understanding of Dataplex, IAM, policy tags, and secure data handling.
- Experience with CI/CD (Cloud Build/GitHub Actions) and workflow orchestration (Cloud Composer/Airflow).
- Strong problem-solving ability, debugging skills, and ability to guide teams through technical challenges.
- Experience with Vertex AI, ML Ops, and ML pipeline deployment.
- Knowledge of Delta/Iceberg/Hudi table formats on GCS.
- Exposure to real-time ingestion (Pub/Sub, Dataflow).
- Google Cloud Professional Data Engineer or Cloud Architect certification.
- Strong leadership and mentoring capabilities.
- Excellent communication skills to support developers, architects, and business teams.
- Ability to manage multiple priorities, resolve conflicts, and maintain steady progress under pressure.
Roles & Responsibilities
Hands-on Technical Leadership
- Work closely with development teams daily to guide solution design, troubleshoot issues, and resolve technical blockers.
- Enforce engineering best practices, coding standards, and architectural guidelines across all data pipelines and workloads.
- Perform design and code reviews, ensuring quality, scalability, and reliability of the platform.
Data Engineering on GCP
- Lead development of ingestion pipelines via direct JDBC connectivity from Oracle and Teradata into the Raw/Bronze layer on GCS.
- Develop and optimize PySpark workloads on Dataproc for data cleansing, transformation, and harmonization into the Curated/Silver layer.
- Contribute to design of the Gold layer in BigQuery, including table structures, partitioning, clustering, and performance optimization.
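As a rough illustration of the Bronze-layer ingestion pattern described above, a small helper might assemble the Oracle JDBC options and the GCS landing path that a Dataproc PySpark job would consume. All bucket, host, and table names here are hypothetical, and only the plumbing is shown; the options dict is what would be passed to `spark.read.format("jdbc").options(...)`.

```python
# Sketch of ingestion-side plumbing for the Raw/Bronze layer.
# Names (bucket, host, service, table, user) are illustrative only.

def bronze_path(bucket: str, source: str, table: str, load_date: str) -> str:
    """Build a date-partitioned GCS landing path for the Bronze layer."""
    return f"gs://{bucket}/bronze/{source}/{table.lower()}/load_date={load_date}"

def oracle_jdbc_options(host: str, port: int, service: str,
                        table: str, user: str) -> dict:
    """Assemble JDBC options for a direct Oracle read into Spark."""
    return {
        "url": f"jdbc:oracle:thin:@//{host}:{port}/{service}",
        "dbtable": table,
        "user": user,
        "driver": "oracle.jdbc.OracleDriver",
        # A larger fetch size reduces driver round trips on wide tables.
        "fetchsize": "10000",
    }

path = bronze_path("acme-datalake", "oracle", "CUSTOMERS", "2026-01-15")
opts = oracle_jdbc_options("ora-prod.internal", 1521, "ORCL",
                           "SALES.CUSTOMERS", "etl_svc")
```

In practice the password would come from Secret Manager rather than appearing in the options dict at all.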
Migration from SAS to GCP
- Translate existing SAS logic into PySpark, ensuring functional parity, improved performance, and operational efficiency.
- Provide guidance on PySpark coding patterns, UDFs, optimization strategies, shuffle/skew handling, and best practices for Dataproc jobs.
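One of the PySpark patterns named above, shuffle-skew handling by key salting, can be sketched without Spark at all: the salting logic itself is plain Python, and only the DataFrame wiring around it is omitted. Key names and the salt count are illustrative choices.

```python
import hashlib

N_SALTS = 8  # number of buckets a hot key is spread across (tuning choice)

def salt_key(key: str, row_id: int) -> str:
    """Deterministically spread rows sharing a hot key across N_SALTS buckets.

    On the skewed (fact) side each row gets one salted key; joining on the
    salted key then splits one oversized shuffle partition into N_SALTS
    smaller ones.
    """
    bucket = int(hashlib.md5(str(row_id).encode()).hexdigest(), 16) % N_SALTS
    return f"{key}#{bucket}"

def explode_key(key: str) -> list[str]:
    """Dimension-side counterpart: emit every salted variant of a key,
    so each fact row still finds its match after salting."""
    return [f"{key}#{b}" for b in range(N_SALTS)]

# Every salted fact key must appear among the exploded dimension keys.
assert salt_key("HOT_CUSTOMER", 12345) in explode_key("HOT_CUSTOMER")
```

In a real Dataproc job the same idea would be expressed with `F.concat` and `F.explode` over DataFrame columns; the trade-off is an N_SALTS-fold blow-up of the small side in exchange for balanced shuffle partitions.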
BigQuery Engineering & Optimization
- Build and optimize SQL models, materialized views, and analytical datasets in BigQuery.
- Apply query optimization techniques, cost controls, and data modeling best practices (star/snowflake).
- Implement RLS/CLS for secure reporting and work with BI teams to integrate BigQuery into reporting tools.
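The partitioning, clustering, and RLS points above can be illustrated by generating the corresponding BigQuery DDL. Dataset, table, column, and group names are made up, and the statements are only assembled as strings here, not executed against BigQuery.

```python
# Sketch: generate Gold-layer DDL for BigQuery. All identifiers are
# hypothetical; the strings follow BigQuery's documented DDL syntax.

def gold_table_ddl(dataset: str, table: str,
                   partition_col: str, cluster_cols: list[str]) -> str:
    """Build CREATE TABLE DDL for a partitioned, clustered Gold-layer table."""
    return (
        f"CREATE TABLE `{dataset}.{table}`\n"
        f"PARTITION BY DATE({partition_col})\n"
        f"CLUSTER BY {', '.join(cluster_cols)}\n"
        # Forcing a partition filter is a common cost-control guardrail.
        f"OPTIONS (require_partition_filter = TRUE)\n"
        f"AS SELECT * FROM `{dataset}.stg_{table}`"
    )

def row_access_policy_ddl(dataset: str, table: str, policy: str,
                          grantee: str, predicate: str) -> str:
    """Build a BigQuery row-level security policy for secure reporting."""
    return (
        f"CREATE ROW ACCESS POLICY {policy}\n"
        f"ON `{dataset}.{table}`\n"
        f"GRANT TO ('group:{grantee}')\n"
        f"FILTER USING ({predicate})"
    )

ddl = gold_table_ddl("gold", "sales_fact", "order_date",
                     ["region", "product_id"])
rls = row_access_policy_ddl("gold", "sales_fact", "emea_only",
                            "emea-analysts@example.com", "region = 'EMEA'")
```

Clustering on the most selective filter columns and requiring a partition filter are the standard levers for keeping per-query scan costs predictable.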
Vertex AI & ML Support
- Assist data scientists in building ML pipelines using Vertex AI (training, prediction, feature engineering).
- Guide integration of feature pipelines from the Silver layer into the Vertex AI Feature Store.
- Ensure reproducibility, lineage, and model monitoring (drift, bias).
Data Governance & Security (Dataplex + IAM)
- Implement and enforce governance standards using Dataplex, including cataloging, policy tags, and data domains.
- Ensure datasets follow proper IAM roles, tagging, and compliance (PII/PCI/PHI masking where needed).
- Support lineage metadata, data-quality (DQ) implementation, and documentation.
Operations, Monitoring & Cost Optimization
- Optimize Dataproc clusters (autoscaling, preemptible VMs), GCS storage lifecycle policies, and BigQuery costs.
- Establish monitoring dashboards, logs, alerts, and operational KPIs.
- Troubleshoot and resolve production issues, ensuring high availability and reliability.
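On the cost side, the GCS storage lifecycle policies mentioned above are plain JSON documents. A sketch follows; the age thresholds, prefixes, and storage classes are illustrative choices for a hypothetical Bronze bucket, not recommendations.

```python
import json

# Hypothetical lifecycle policy: demote aging Bronze objects to cheaper
# storage classes, then delete them after two years.
lifecycle_policy = {
    "rule": [
        {"action": {"type": "SetStorageClass", "storageClass": "NEARLINE"},
         "condition": {"age": 30, "matchesPrefix": ["bronze/"]}},
        {"action": {"type": "SetStorageClass", "storageClass": "COLDLINE"},
         "condition": {"age": 180, "matchesPrefix": ["bronze/"]}},
        {"action": {"type": "Delete"},
         "condition": {"age": 730, "matchesPrefix": ["bronze/"]}},
    ]
}

# A policy like this would typically be applied with:
#   gcloud storage buckets update gs://<bucket> --lifecycle-file=policy.json
policy_json = json.dumps(lifecycle_policy, indent=2)
```

Pairing rules like these with BigQuery partition expiration and Dataproc autoscaling covers the three main cost surfaces this role is asked to watch.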
- Experience: 6–8 Years
- Primary Skill: Data Engineering
- Sub Skill(s): Data Engineering
- Additional Skill(s): Big Data, GCP-Apps, PySpark, BigQuery