Gen AI Lead Engineer

Chennai, Tamil Nadu, India

3 weeks ago

Applicants: 0

Salary Not Disclosed

2 days left to apply

Job Description

Role Overview

We are seeking a highly skilled GenAI Lead to drive the development, optimization, and deployment of advanced LLMs, VLMs, and multimodal AI systems. You will lead the GenAI team, translate business requirements into technical solutions, fine-tune foundation models, design retrieval architectures, and ensure all models are production-ready with optimized inference pipelines.

Key Roles

- Lead the design, development, and enhancement of LLMs, VLMs, RAG systems, and multimodal generation pipelines for production use cases.
- Understand business requirements and convert them into scalable, high-performance AI model architectures and workflows.
- Fine-tune and customize Transformer-based models using proprietary datasets, advanced training strategies, and evaluation frameworks.
- Optimize tokenization, embedding generation, vector search, and retrieval flows for high-throughput applications.
- Develop high-performance inference pipelines using ONNX, TensorRT, quantization, batching, streaming, and GPU/accelerator optimizations.
- Ensure all models are production-grade: robust, scalable, monitored, and integrated into backend systems.
- Lead and mentor the GenAI engineering team, conduct code/model reviews, and drive overall technical direction.
- Research and evaluate cutting-edge architectures in multimodal models, generative AI, and retrieval-augmented techniques.

Responsibilities

- Architect end-to-end GenAI systems, including training, fine-tuning, inference serving, and continuous model improvement.
- Work with backend teams to integrate models into scalable APIs using Triton, TensorRT, ONNX Runtime, vLLM, or custom inference engines.
- Build model evaluation pipelines covering BLEU, ROUGE, alignment tests, hallucination checks, safety filters, and latency/throughput benchmarks.
- Own the roadmap for LLM/VLM improvements and drive experimentation with new architectures (Mixture-of-Experts, diffusion-based multimodal, etc.).
- Collaborate cross-functionally with product, backend, ML, and DevOps teams to deliver end-to-end GenAI features.
- Maintain documentation, ensure reproducibility, and follow best practices in model governance, versioning, and monitoring.
- Mentor the team in training deep learning models, optimizing memory/GPU usage, and deploying large-scale inference systems.

Required Qualifications

- 4-8+ years of experience in applied machine learning, deep learning, GenAI, or multimodal systems.
- Proven expertise with Transformers, LLMs, VLMs, diffusion models, and retrieval-augmented systems.
- Hands-on experience with Python, PyTorch, TensorFlow, Hugging Face, LangChain, and modern training pipelines.
- Strong knowledge of vector databases (FAISS, Pinecone, Milvus, Chroma).
- Expert-level experience with ONNX, TensorRT, quantization, model optimization, and inference engines (vLLM, FasterTransformer, Triton).
- Solid understanding of distributed training, GPU utilization, mixed precision, and large-scale model serving.
- Ability to lead teams, plan AI architecture, review work, and deliver production-quality AI systems.

Note: We also accept international applicants.