Gen AI/LLM Engineer_5+Yrs
Zorba AI
Actively Reviewing Applications
Hyderabad, Telangana, India
Full-Time
On-site
INR 11–25 LPA
Posted 4 hours ago
Apply by June 14, 2026
Job Description
The employer is a leading consulting firm in the Enterprise Generative AI and Large Language Model (LLM) services sector. It delivers production-grade LLM solutions, retrieval-augmented systems, and custom generative AI products for enterprise clients across domains. The team focuses on building secure, scalable, low-latency inference services and on automating model lifecycle workflows for on-prem and cloud deployments.
Position: LLM Engineer — On-site (India). We are hiring an experienced LLM engineer to design, fine-tune, and deploy LLM-based solutions that power search, summarization, agents, and domain-specific assistants.
Role & Responsibilities
- Design, fine-tune, and validate LLMs for production use cases, including instruction tuning, supervised fine-tuning, and parameter-efficient tuning (LoRA/adapters).
- Implement retrieval-augmented generation (RAG) pipelines: embeddings, vector search, chunking, and context assembly for high-recall responses.
- Optimize inference for latency and cost through quantization, model pruning, batching, and deployment with optimized runtimes and libraries (CUDA, Triton, bitsandbytes where applicable).
- Build backend services and APIs to serve LLM inference and orchestration using containerized deployments (Docker/Kubernetes) and CI/CD pipelines.
- Collaborate with product, data engineering, and ML teams to integrate LLMs into production flows, monitor model performance, and set up automated retraining/rollbacks.
- Create reproducible training pipelines, implement evaluation suites, and produce documentation and runbooks for model governance and observability.
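For candidates less familiar with the RAG stages named in the responsibilities (chunking, embeddings, vector search, and context assembly), the following is a minimal, illustrative sketch and not the team's actual implementation. It assumes a toy bag-of-words "embedding" and a brute-force cosine search in place of a real embedding model and a vector index such as FAISS. All function names, parameters, and sample documents here are hypothetical.

```python
import math
import re
from collections import Counter


def chunk_text(text, max_words=40, overlap=10):
    """Split text into overlapping word windows (one simple chunking strategy)."""
    words = text.split()
    step = max_words - overlap  # assumes overlap < max_words
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_words]))
        if start + max_words >= len(words):
            break  # the last window already reached the end of the text
    return chunks


def embed(text):
    """Toy embedding: lowercase term counts. A real system would call an embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(query, chunks, top_k=2):
    """Brute-force vector search: rank every chunk by similarity to the query."""
    q = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:top_k]


def build_prompt(query, context_chunks):
    """Context assembly: place the retrieved chunks ahead of the question."""
    context = "\n---\n".join(context_chunks)
    return f"Answer using only the context below.\n\nContext:\n{context}\n\nQuestion: {query}"


# Hypothetical sample documents, for illustration only.
docs = [
    "LoRA adds small trainable low-rank matrices to a frozen model.",
    "Quantization stores weights in fewer bits to cut memory use.",
    "Kubernetes schedules containers across a cluster of machines.",
]
chunks = [c for d in docs for c in chunk_text(d)]
query = "How does quantization reduce memory?"
top = retrieve(query, chunks, top_k=1)
prompt = build_prompt(query, top)
```

In a production pipeline the `embed` and `retrieve` steps would typically be backed by a trained embedding model and an approximate nearest-neighbour index, while the chunking and prompt-assembly steps keep roughly this shape.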
Skills & Qualifications
- 4+ years of hands-on experience working with LLMs or advanced NLP models in production contexts.
- Proficiency in Python for ML engineering and model development.
- Experience with PyTorch and Hugging Face Transformers for training and fine-tuning.
- Practical experience implementing RAG and vector search using tools like FAISS or similar vector databases.
- Familiarity with LangChain (or equivalent orchestration) and integration with LLM APIs (OpenAI, Anthropic, etc.).
- Experience containerizing and deploying ML services using Docker; familiarity with Kubernetes is a plus.
- Experience with inference optimizations: quantization (bitsandbytes), Triton, or GPU-accelerated serving.
- Exposure to distributed training frameworks (DeepSpeed) and cloud MLOps platforms (SageMaker, Azure ML, GCP AI Platform).
- Knowledge of monitoring, logging, and model-evaluation frameworks for production LLMs (MLflow, Prometheus, Grafana).
Benefits & Culture
- Collaborative, engineering-driven culture with a strong focus on ownership and rapid iteration.
- Opportunity to build end-to-end LLM products for enterprise clients and influence architecture decisions.
- On-site role with hands-on access to GPU infrastructure and cross-functional product teams.
Required Skills
Engineering
Documentation
Monitoring
Python
Training
Docker
Kubernetes
CI/CD Pipelines
Prometheus
Grafana
PyTorch
MLOps
Azure
Hugging Face Transformers
LangChain
MLflow
Data Engineering
CI/CD
NLP
Anthropic
RAG
Instruction tuning
Governance
CUDA
Orchestration
Consulting
Model development
Quantization
Vector
Generative
Transformers
Logging
GPU
SageMaker
Fine-tuning
OpenAI
Low-latency
NLP models
Cloud deployments
Chunking
Vector Search
Large Language Model
LLM APIs
Retrieval-Augmented Generation
Inference
Model Lifecycle
Optimizations
Batching
Retrieval
LLMs
Triton
Observability
Generative AI
Azure ML
LLM
FAISS
LoRA
Augmented Generation
Retrieval-augmented
Vector Databases
Assembly
Embeddings