
Data Scientist

Actively Reviewing Applications

OmniMD

India (Ahmedabad, Gujarat; Hyderabad) • Full-Time • INR 4–4 LPA

Posted 4 days ago • Apply by June 16, 2026

Job Description

Position: Data Scientist – LLM & Applied AI

Experience: 2–8 Years

Employment Type: Full-Time

Domain: Healthcare AI / Digital Health / SaaS Platforms

Reporting To: CTO

Role Summary

We are seeking a hands-on Data Scientist with 2–8 years of experience who is deeply proficient in both open-source and commercial Large Language Models (LLMs) and has strong expertise in prompt engineering, applied machine learning, and local LLM deployments.

This role is not purely academic. The ideal candidate will work on real-world AI systems, including AI Frontdesk, AI Clinician, AI RCM, multimodal agents, and healthcare-specific automation, with a focus on production-grade AI, domain-aligned reasoning, and privacy-aware architectures.

Key Responsibilities

1. LLM Research, Evaluation & Selection

Evaluate, benchmark, and compare open-source LLMs (LLaMA-2/3, Mistral, Mixtral, Falcon, Qwen, Phi, etc.) and commercial LLMs (from OpenAI, Anthropic, Google, and Azure).
Select appropriate models based on latency, accuracy, cost, explainability, and data-privacy requirements.
Maintain an internal LLM capability matrix mapped to specific business use cases.

2. Prompt Engineering & Reasoning Design

Design, test, and optimize prompt strategies: zero-shot, few-shot, and chain-of-thought (where applicable); tool-calling and function-calling prompts; and multi-agent and planner-executor patterns.
Build domain-aware prompts for healthcare workflows (clinical notes, scheduling, RCM, patient communication).
Implement prompt versioning, prompt A/B testing, and regression checks.

3. Applied ML & Model Development

Build and fine-tune ML/DL models (classification, NER, summarization, clustering, recommendation).
Apply traditional ML + LLM hybrids where LLMs alone are not optimal.
Perform feature engineering, model evaluation, and error analysis.
Work with structured (SQL/FHIR) and unstructured (text, audio) data.

4. Local LLM & On-Prem Deployment

Deploy and optimize local LLMs using frameworks such as Ollama, vLLM, llama.cpp, and HuggingFace Transformers.
Implement quantization (4-bit/8-bit) and performance tuning.
Support air-gapped / HIPAA-compliant inference environments.
Integrate local models with microservices and APIs.

5. RAG & Knowledge Systems

Design and implement Retrieval-Augmented Generation (RAG) pipelines.
Work with vector databases (FAISS, Chroma, Weaviate, Pinecone).
Optimize chunking, embedding strategies, and relevance scoring.
Ensure traceability and citation of retrieved sources.

6. AI System Integration & Productionization

Collaborate with backend and frontend teams to integrate AI models into Spring Boot / FastAPI services and React-based applications.
Implement monitoring for accuracy drift, latency, hallucinations, and cost.
Document AI behaviors clearly for BA, QA, and compliance teams.

7. Responsible AI & Compliance Awareness

Apply PHI-safe design principles (prompt redaction, data minimization).
Understand healthcare AI constraints (HIPAA, auditability, explainability).
Support human-in-the-loop and fallback mechanisms.

Required Skills & Qualifications

Core Technical Skills

Strong proficiency in Python (NumPy, Pandas, Scikit-learn).
Solid understanding of ML fundamentals (supervised/unsupervised learning).
Hands-on experience with LLMs (open-source + commercial).
Strong command of prompt engineering techniques.
Experience deploying models locally or in controlled environments.

LLM & AI Tooling

HuggingFace ecosystem
OpenAI / Anthropic APIs
Vector databases
LangChain / LlamaIndex (or equivalent orchestration frameworks)

Data & Systems

SQL and data modeling
REST APIs
Git, Docker (basic)
Linux environments

Preferred / Good-to-Have Skills

Experience in healthcare data (EHR, clinical text, FHIR concepts).
Exposure to multimodal AI (speech-to-text, text-to-speech).
Knowledge of model evaluation frameworks for LLMs.
Familiarity with agentic AI architectures.
Experience working in startup or fast-moving product teams.

Research & Mindset Expectations (Important)

Strong inclination toward applied research, not just model usage.
Ability to read and translate research papers into working prototypes.
Curious, experimental, and iterative mindset.
Clear understanding that accuracy, safety, and explainability matter more than flashy demos.

What We Offer