AI Scientist - Conversational & Voice Intelligence

Chennai, Tamil Nadu, India

3 weeks ago

Applicants: 0

Salary Not Disclosed

4 days left to apply

Job Description

About Zudu AI

Zudu AI is building the next generation of human-like voice automation that replaces traditional call centers. Our AI agents handle complex customer interactions, understand context, and respond naturally across multiple languages and accents. We work with enterprises and CPaaS providers to deliver scalable, intelligent voice solutions that continuously learn and improve.

Role Overview

We are seeking an AI Scientist to design, build, and optimize advanced speech and conversational AI models spanning ASR, TTS, LLM fine-tuning, RAG pipelines, and emotion-driven voice synthesis. You'll work with large-scale data, production systems, and next-gen architectures to make our agents sound truly human.

Key Responsibilities

Research & Development
Develop and fine-tune speech recognition (ASR), text-to-speech (TTS), and natural language understanding (NLU) models.
Build multi-turn conversational AI systems capable of contextual reasoning, grounding, and emotional intelligence.
Explore and implement retrieval-augmented generation (RAG) and memory-based dialogue systems for long-term contextuality.
Research prosody, emotion, and style transfer in speech synthesis for natural human-like delivery.
Evaluate and integrate open-source models (e.g., Whisper, Bark, FastSpeech, VITS, GPT-family models).

System & Data Integration
Work closely with platform engineers to deploy models in low-latency, production-grade environments.
Optimize inference performance on cloud and edge systems using quantization, distillation, and caching strategies.
Collaborate with the voice pipeline team to align model outputs with telephony and CPaaS audio protocols.

Experimentation & Evaluation
Design and conduct experiments to benchmark accuracy, naturalness, latency, and engagement.
Lead data annotation and synthetic voice data generation projects to improve training quality.
Publish findings internally (and externally when possible) to maintain Zudu AI's leadership in enterprise voice automation.

Preferred Qualifications

Master's or Ph.D. in Computer Science, AI, Speech Processing, or Computational Linguistics.
Strong background in deep learning, transformer architectures, and reinforcement learning.
Hands-on experience with PyTorch, TensorFlow, or JAX.
Expertise in one or more of the following areas: Automatic Speech Recognition (ASR), Text-to-Speech (TTS), Natural Language Processing (NLP), Multimodal or Emotion-aware AI.
Understanding of LLM fine-tuning, prompt engineering, and retrieval systems (RAG, FAISS, Milvus).
Experience deploying models with Kubernetes, TorchServe, or ONNX Runtime.
Familiarity with speech datasets, phoneme-level modeling, and accent adaptation.

Nice-to-Have

Prior experience in telephony or conversational AI products.
Contributions to open-source speech or NLP projects.
Understanding of low-latency voice streaming pipelines (e.g., gRPC, WebRTC).
Exposure to emotion detection, sentiment analysis, or paralinguistic research.

Soft Skills

Curiosity and scientific rigor.
Ability to translate research into deployable, scalable solutions.
Collaborative mindset: able to work with engineers, product leads, and linguists.
Excellent written and verbal communication.

Required Skills

TTS, Whisper, Bark, Computer Science, Natural Language Processing

Additional Information

Company Name: Zudu AI
Industry: N/A
Department: N/A
Role Category: Robotics Software Engineer
Job Role: Mid-Senior level
Education: No Restriction
Job Types: On-site
Gender: No Restriction
Notice Period: Less Than 30 Days
Year of Experience: 1 - Any Yrs
Job Posted On: 3 weeks ago
Application Ends: 4 days left to apply