Agentic AI Engineer - RL
XenonStack Moments
India, Punjab
Full-Time
On-site
Posted 6 hours ago
Apply by June 14, 2026
Job Description
About XenonStack
XenonStack is the fastest-growing Data and AI Foundry for Agentic Systems, enabling people and organizations to gain real-time and intelligent business insights.
We Deliver Innovation Through
- Akira AI – Building Agentic Systems for AI Agents
- XenonStack Vision AI – Vision AI Platform
- NexaStack AI – Inference AI Infrastructure for Agentic Systems
THE OPPORTUNITY
We are seeking an Agentic AI Engineer (Specialized in Reinforcement Learning) with 2–5 years of experience in applying RL to enterprise-grade systems. This role involves designing and deploying adaptive AI agents that continuously learn, optimize decisions, and evolve in dynamic environments.
You'll work at the intersection of RL research, agentic orchestration, and real-world enterprise workflows, building agents that go beyond automation to reason, adapt, and improve over time.
Job Roles And Responsibilities
Reinforcement Learning Development
- Design, implement, and train RL algorithms (PPO, A3C, DQN, SAC) for enterprise decision-making tasks.
- Develop custom simulation environments to model business processes and operational workflows.
- Experiment with reward function design to balance efficiency, accuracy, and long-term value creation.
- Build production-ready RL-driven agents capable of dynamic decision-making and task orchestration.
- Integrate RL models with LLMs, knowledge bases, and external tools for agentic workflows.
- Implement multi-agent systems to simulate collaboration, negotiation, and coordination.
- Deploy RL agents on cloud and hybrid infrastructures (AWS, GCP, Azure).
- Optimize training and inference pipelines using distributed computing frameworks (Ray RLlib, Horovod).
- Apply model optimization techniques (quantization, ONNX, TensorRT) for scalable deployment.
- Develop pipelines for evaluating agent performance (robustness, reliability, interpretability).
- Implement fail-safes, guardrails, and observability for safe enterprise deployment.
- Document processes, experiments, and lessons learned for continuous improvement.
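For candidates gauging the level of work, the first three responsibilities above (training an RL algorithm, building a custom simulation environment, and designing a reward function) can be sketched at toy scale. This is an illustrative sketch only, not XenonStack's method: the `TicketQueueEnv` workflow, its reward weights, and the `train` helper are all hypothetical, and it uses plain tabular Q-learning from the standard library rather than PPO, A3C, DQN, or SAC.

```python
import random


class TicketQueueEnv:
    """Hypothetical toy business process: clear a queue of pending tickets."""

    def __init__(self, n_tickets=5):
        self.n_tickets = n_tickets
        self.pending = n_tickets

    def reset(self):
        self.pending = self.n_tickets
        return self.pending

    def step(self, action):
        # action 0 = careful review (1 ticket, no errors)
        # action 1 = rushed review (up to 2 tickets, fixed accuracy penalty)
        resolved = min(self.pending, 1 if action == 0 else 2)
        self.pending -= resolved
        accuracy_penalty = 0.0 if action == 0 else 1.5  # assumed weight
        time_cost = 0.2                                 # assumed weight
        # Reward trades throughput against accuracy and elapsed time.
        reward = 1.0 * resolved - accuracy_penalty - time_cost
        done = self.pending == 0
        return self.pending, reward, done


def train(episodes=500, alpha=0.5, gamma=0.95, epsilon=0.2, seed=0):
    """Tabular Q-learning with an epsilon-greedy behaviour policy."""
    rng = random.Random(seed)
    env = TicketQueueEnv()
    q = {(s, a): 0.0 for s in range(env.n_tickets + 1) for a in (0, 1)}
    for _ in range(episodes):
        state, done = env.reset(), False
        while not done:
            if rng.random() < epsilon:
                action = rng.choice((0, 1))  # explore
            else:
                action = max((0, 1), key=lambda a: q[(state, a)])  # exploit
            next_state, reward, done = env.step(action)
            best_next = 0.0 if done else max(q[(next_state, a)] for a in (0, 1))
            q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])
            state = next_state
    return q


q_table = train()
# Greedy action for each non-terminal queue length.
policy = {s: max((0, 1), key=lambda a: q_table[(s, a)]) for s in range(1, 6)}
print(policy)
```

With these assumed weights the accuracy penalty outweighs the throughput gain from rushing, so the learned greedy policy favours careful review. Changing the weights shifts that balance, which is the kind of reward-design experimentation the role describes.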
Technical Skills
- 2–5 years of hands-on experience with Reinforcement Learning frameworks (Ray RLlib, Stable Baselines, PyTorch RL, TensorFlow Agents).
- Strong programming skills in Python; proficiency with PyTorch / TensorFlow.
- Experience designing and training RL algorithms (PPO, DQN, A3C, Actor-Critic methods).
- Familiarity with simulation environments (Gymnasium, Isaac Gym, Unity ML-Agents, custom simulators).
- Experience in reward modeling and optimization for real-world decision-making tasks.
- Knowledge of multi-agent systems and collaborative RL is a strong plus.
- Familiarity with LLMs + RLHF (Reinforcement Learning from Human Feedback) is desirable.
- Exposure to cloud platforms (AWS/GCP/Azure), containers (Docker, Kubernetes), and CI/CD for ML.
- Strong analytical and problem-solving mindset.
- Ability to balance research depth with practical engineering for production-ready systems.
- Collaborative approach, working across AI, data, and platform teams.
- Commitment to Responsible AI (bias mitigation, fairness, transparency).
At XenonStack, we believe in shaping the future of intelligent systems. We foster a culture of cultivation built on bold, human-centric leadership principles, where deep work, simplicity, and adoption define everything we do.
Our Cultural Values
- Agency – Be self-directed and proactive.
- Taste – Sweat the details and build with precision.
- Ownership – Take responsibility for outcomes.
- Mastery – Commit to continuous learning and growth.
- Impatience – Move fast and embrace progress.
- Customer Obsession – Always put the customer first.
- Obsessed with Adoption – Making AI agents accessible and enterprise-ready.
- Obsessed with Simplicity – Turning complex RL + agentic challenges into intuitive, reliable systems.
WHY SHOULD YOU JOIN US?
- Agentic AI Product Company
- A Fast-Growing Category Leader
- Career Mobility & Growth
- Global Exposure
- Create Real Impact
- Culture of Excellence
- Responsible AI First
Required Skills
Engineering
Negotiation
Simulation
Monitoring
Python
Product Marketing
Cloud Platforms
Training
Coordination
AWS
Research
Docker
Kubernetes
TensorFlow
PyTorch
Reinforcement Learning
Azure
Unity
CI/CD
Continuous Improvement
Continuous Learning
System Design
Human intelligence
Algorithms
Orchestration
Development Design
Bias mitigation
Modeling
Quantization
Guardrails
Value creation
Optimization techniques
Simplicity
Model optimization
Business Insights
Robustness
Experiment
Multi-Agent Systems
Inference
Distributed computing
LLMs
Observability
Precision
Computing
BFSI
SAFe
RLHF
Agentic AI
TensorRT
Simulators
Containers
AI Agents
Adaptive