Head of Engineering
Zocket
India, Tamil Nadu, Chennai
Full-Time
INR 65–80 LPA
Posted 6 days ago
Apply by June 29, 2026
Job Description
The core responsibilities for the job include the following:
AI and Agent Systems:
- Multi-Agent Orchestration: A fleet of specialized AI agents (creative generation, compliance checking, competitive analysis, and campaign intelligence) that coordinate, share context, and produce coherent outputs. We're working with agentic frameworks and building custom orchestration layers for enterprise-scale reliability.
- Brand Knowledge Graph (Neo4j): A proprietary knowledge graph capturing brand identity, guidelines, competitive positioning, market context, and campaign performance history. This graph is the shared memory and reasoning substrate for all agents.
- Context and Memory Layers: Short-term working memory for task execution, long-term memory for brand knowledge persistence, and retrieval mechanisms (RAG + graph traversal) that give agents the right context at the right time.
- Multimodal Brand Compliance: Validating generated content (text, image, video) against brand rules encoded in the knowledge graph, with feedback loops for continuous improvement.
- Video Generation and Consistency: Working with models like Google Veo for video generation, maintaining brand-consistent visual identity, scene coherence, and style continuity.
- AI Evals and Agent Monitoring: Evaluation pipelines for measuring agent output quality, brand compliance accuracy, hallucination rates, and task completion reliability. Continuous monitoring of agent behavior in production with alerting, drift detection, and regression tracking across prompt and model changes.
- API and Services Architecture: RESTful and event-driven microservices handling authentication, authorization, rate limiting, request validation, error handling, and graceful degradation. Clean API contracts between services, versioning strategies, and backward compatibility for enterprise clients.
- Real-Time Systems: WebSocket services for live Brand Intel Dashboard updates, agent status streaming, and collaborative features. Connection management, fan-out patterns, and graceful degradation at scale.
- Event-Driven Pipeline: Kafka for agent event streaming, async task processing, cross-service communication, and audit logging. Topic design, partitioning strategies, consumer group patterns, and delivery guarantees.
- Caching and State Management: Redis for agent context caching, session state, rate limiting, pub/sub for real-time features, and distributed locking for concurrent agent operations.
- Data Layer: PostgreSQL as the relational backbone for transactional data, user management, and campaign state. Neo4j for the knowledge graph. OLAP databases (ClickHouse) for the analytics workloads behind the Brand Intel Dashboard: aggregations over millions of ad performance records, competitive benchmarks, and trend analysis.
- Data Collection Infrastructure: Web scraping (Apify, Crawlee), official API integrations, data normalization, and ingestion pipelines feeding both Neo4j and the OLAP layer.
- Containerization and Orchestration: Docker for service packaging, Kubernetes for orchestration, scaling, and service mesh. Infrastructure-as-Code (Terraform / Pulumi) for reproducible deployments.
- CI/CD Pipelines: Automated build, test, lint, security scan, and deployment workflows. GitOps-based deployment strategies with staged rollouts, canary deployments, and automated rollback for production safety.
- Observability Stack: Centralized logging, distributed tracing (OpenTelemetry), metrics collection (Prometheus/Grafana), and alerting. Full-stack visibility from API latency to Kafka consumer lag to LLM call performance to agent pipeline traces.
- Cloud Infrastructure: AWS-based infrastructure with cost-aware architecture. VPC design, IAM, secrets management, and security posture appropriate for enterprise clients handling sensitive brand data.
The core requirements for the job include the following:
- Frontend: Next.js, React, Tailwind CSS.
- Backend: Python, Node.js, WebSockets, REST APIs.
- Databases: PostgreSQL, Neo4j (knowledge graph), ClickHouse (OLAP/analytics), Redis (caching/state).
- Streaming and Messaging: Kafka.
- AI/LLM: Anthropic Claude, Google Gemini/Veo, RAG pipelines, custom agent orchestration.
- Infrastructure: AWS, Docker, Kubernetes, Terraform.
- CI/CD and Observability: GitHub Actions, ArgoCD, Prometheus, Grafana, and OpenTelemetry.
- Project and Communication: Linear, Slack.