AI QA Engineer
TIGI HR
Job Description
Job Summary
We are seeking an AI QA Engineer to ensure the quality, accuracy, and performance of our enterprise-grade Natural Language to SQL (NL2SQL) pipeline. You will validate a complex, multi-stage AI architecture (including semantic routing, LLM-based disambiguation, and query generation) and ensure that it securely and accurately translates user intent into valid queries within the Banking, Financial Services, and Insurance (BFSI) domain.
Experience: 7+ Years
Location: Gurugram
Work Mode: Hybrid - 3 Days WFO
Employment Type: Full-Time
Key Responsibilities
- LLM & Pipeline Evaluation: Design and execute automated evaluations for a 4-stage NL2SQL pipeline using LangSmith. Monitor metrics such as structural F1, execution accuracy, latency, and token cost.
- Dataset Management: Create, curate, and maintain benchmark/golden datasets for continuous regression testing of LLM prompts and model outputs.
- Search & Retrieval Testing: Validate precision and recall trade-offs in semantic search and schema discovery, ensuring optimal candidate selection for downstream query generation.
- Failure Analysis & Debugging: Perform root cause analysis across pipeline stages (routing, disambiguation, query generation, execution), identifying issues such as schema mismatches, type/coercion errors, runtime incompatibilities, and query structure failures.
- E2E & API Automation: Develop automated test scripts using Python (Pytest) for backend API testing and Playwright for the React frontend, validating end-to-end user workflows.
- Observability & Debugging: Utilize Grafana and structured JSONL logs to identify pipeline bottlenecks, LLM hallucinations, or prompt degradation.
- Compliance & Security: Ensure the AI pipeline meets strict BFSI data security standards by validating execution safety mechanisms (e.g., runtime capability probing, injection prevention). Design validation rules and guardrails for AI pipelines to prevent invalid query generation and runtime failures.
Required Skills
- AI/LLM Testing: Experience testing LLM applications, RAG (Retrieval-Augmented Generation) pipelines, or NLP models. Familiarity with AI evaluation frameworks (e.g., LangSmith, DeepEval, or similar).
- Languages: Strong proficiency in Python 3.12+ (crucial for integrating with the existing AI backend and Pytest suite). Secondary experience with JavaScript/TypeScript.
- Test Automation: Expertise in REST API testing; UI automation experience with Playwright is optional.
- Data & Search: Understanding of Vector Databases (e.g., Milvus, Pinecone) and semantic search concepts (embeddings, hybrid search).
- Data & SQL Validation: Solid understanding of SQL and data validation techniques to verify correctness of complex query outputs.
- Tools & Infrastructure: Git, Docker, CI/CD pipelines, and observability tools (Prometheus/Grafana).
Education
- BE / BTech / MCA / BSc in Computer Science, Data Science, or a related field.
Nice to Have
- Familiarity with Graph Databases (Neo4j) and LangGraph orchestration.
- Experience evaluating foundation models (e.g., from OpenAI, Anthropic, Google).
- Prior exposure to query languages such as SQL or PURE, or to any functional programming language.
- Experience testing workflows across multiple services or pipelines, with an understanding of failure handling, retries, and system reliability concepts.
- Experience in Banking, Financial Services, or Insurance domains.
- Understanding of data security, compliance, and enterprise database schemas.