Gen AI Engineer I

Actively Reviewing the Applications

Eros Innovation

India, Tamil Nadu, Chennai • Full-Time • On-site

Posted 4 days ago • Apply by June 1, 2026

Job Description

Company Description

Eros Innovation is a global technology company operating at the intersection of Artificial Intelligence, media, and next-generation digital platforms. We focus on building advanced Generative AI solutions, multimodal systems, and scalable AI infrastructure that power real-world enterprise applications.

At the core of our ecosystem is Eros Gen AI, our proprietary Generative AI platform designed to deliver cutting-edge capabilities across large language models (LLMs), vision-language systems, speech AI, and retrieval-augmented intelligence. Eros Gen AI drives both research innovation and production-grade deployments, enabling intelligent automation and AI-driven transformation at scale.

If you’re passionate about building impactful AI systems and working on frontier technologies, Eros Innovation is where innovation meets execution.

Role Description

We are seeking a skilled Gen AI Engineer to develop, optimize, and deploy advanced LLMs, VLMs, and multimodal AI systems. You will work on fine-tuning foundation models, designing retrieval architectures, and building production-ready inference pipelines for scalable AI solutions.

Develop and enhance LLMs, VLMs, RAG systems, and multimodal generation pipelines for production use cases.
Understand business requirements and convert them into scalable, high-performance AI model architectures and workflows.
Fine-tune and customize Transformer-based models using proprietary datasets, advanced training strategies, and evaluation frameworks.
Optimize tokenization, embedding generation, vector search, and retrieval flows for high-throughput applications.
Develop high-performance inference pipelines using ONNX, TensorRT, quantization, batching, streaming, and GPU/accelerator optimizations.
Ensure all models are production-grade: robust, scalable, monitored, and integrated into backend systems.
Research and evaluate cutting-edge architectures in multimodal models, generative AI, and retrieval-augmented techniques.
Design end-to-end GenAI systems including training, fine-tuning, inference serving, and continuous model improvements.
Work with backend teams to integrate models into scalable APIs using Triton, TensorRT, ONNX Runtime, vLLM, or custom inference engines.
Build model evaluation pipelines covering BLEU, ROUGE, alignment tests, hallucination checks, safety filters, and latency/throughput benchmarks.
Experiment with new architectures (Mixture-of-Experts, diffusion-based multimodal models, etc.) and contribute to LLM/VLM improvements.
Collaborate with product, backend, ML, and DevOps teams to deliver end-to-end GenAI features.
Maintain documentation, ensure reproducibility, and follow best practices in model governance, versioning, and monitoring.

Qualifications

2–4 years of experience in applied machine learning, deep learning, GenAI, or multimodal systems.
Proven expertise with Transformers, LLMs, VLMs, diffusion models, and retrieval-augmented systems.
Hands-on experience with Python, PyTorch, TensorFlow, Hugging Face, LangChain, and modern training pipelines.
Strong knowledge of vector databases (FAISS, Pinecone, Milvus, Chroma).
Solid experience with ONNX, TensorRT, quantization, model optimization, and inference engines (vLLM, FasterTransformer, Triton).
Understanding of distributed training, GPU utilization, mixed precision, and large-scale model serving.
Strong problem-solving skills and ability to deliver production-quality AI systems.