Senior LLM Evaluation & Reinforcement Fine-Tuning Engineer

Pune, Maharashtra, India

1 month ago

Applicants: 0

Salary Not Disclosed

N/A

Job Description

Company Description

Genrise is a leading ecommerce content agent that specializes in identifying content gaps, creating high-performing product copy, and tailoring it for every marketplace. We deliver on-brand content for platforms like Amazon, Walmart, and Target, 10x faster. Our innovative approach ensures top-ranking content, making us a preferred choice for ecommerce businesses.

Role Description

We're looking for a hands-on technical expert who has actually written evals for large language models and has direct experience with reinforcement fine-tuning (e.g., RLHF, RLAIF, or RFT variants). Much of your time will go to building and owning our LLM evaluation stack, applying best practices in experimental design, measurement, and trustworthy deployment. If you love turning fuzzy product goals into measurable evaluations, care deeply about scientific rigor, and enjoy building cool tech, this is for you.

What you'll do

- Design, implement, and maintain robust evaluation suites for LLMs (task- and domain-specific; regression and exploratory).
- Lead or contribute to reinforcement fine-tuning projects (reward modeling, preference data pipelines, safety/quality constraints, offline/online tuning loops).
- Define success metrics, sampling strategies, and statistical tests; ensure reproducibility and leakage prevention.
- Build data generation and curation pipelines for evals (human + synthetic), including rubric design and inter-annotator agreement.
- Partner with research, product, and infra to ship models with quantifiable improvements and clear trade-off documentation.
- Teach and mentor: run workshops, code walkthroughs, and evaluation office hours; raise the scientific bar across the org.
- Write clear experiment reports and decision memos; contribute to internal best-practice guides.

What we're looking for (must-haves)

- Recent, hands-on experience delivering 1–2+ real projects where you authored LLM evals end-to-end (design → implementation → analysis).
- Demonstrated experience with reinforcement fine-tuning for LLMs (RLHF/RLAIF/RFT): reward modeling, preference data, or policy optimization.
- Strong scientific foundation: experimental design, statistics, hypothesis testing, error analysis.
- Machine learning depth: transformers, tokenization, fine-tuning, sampling/decoding, data quality, overfitting/leakage controls.
- Proficiency with Python, PyTorch/JAX, and common LLM tooling (HF, vLLM, Triton, Ray/SLURM, Weights & Biases, etc.).
- Excellent written and verbal communication; proven ability to teach and mentor engineers/researchers.

Nice to have

- Safety evals, hallucination/robustness/red-teaming experience.
- Evaluation of tool use/agents, code generation, retrieval-augmented tasks.
- Knowledge of ranking/recommender systems or bandits.
- Infra for eval orchestration (sharding, caching, dataset versioning).
- Contributions to open-source eval frameworks or benchmark leaderboards.

Additional Information

Company Name: Genrise.ai
Industry: N/A
Department: N/A
Role Category: Data Analyst
Job Role: Mid-Senior level
Education: No Restriction
Job Types: Remote
Gender: No Restriction
Notice Period: Less Than 30 Days
Year of Experience: 1 - Any Yrs
Job Posted On: 1 month ago
Application Ends: N/A
