
Senior Speech AI Engineer – On-Device ASR & Real-Time Pronunciation Intelligence

Capital Numbers

Gurugram, Haryana, India | Full-Time | On-site
Posted 11 hours ago | Apply by June 16, 2026

Job Description

We are looking for a Senior Speech AI Engineer to build production-grade, on-device Automatic Speech Recognition (ASR) and real-time speech intelligence systems.


In this role, you’ll work across the full speech AI lifecycle — from audio data pipelines and model development to low-latency streaming inference and edge deployment. You’ll help deliver accurate transcription, phoneme-level alignment, and real-time pronunciation feedback optimized for mobile and edge devices.


Key Skills & Experience

✔ 5–8+ years of experience in Speech AI / Audio ML

✔ Strong Python & PyTorch expertise

✔ Experience with ASR models such as Whisper, Conformer, RNN-T, wav2vec 2.0, and HuBERT

✔ Knowledge of speech processing & phoneme alignment

✔ Experience optimizing models for edge / mobile deployment (TensorFlow Lite, ONNX, PyTorch Mobile, Core ML)

✔ Familiarity with libraries such as NVIDIA NeMo, ESPnet, SpeechBrain, and torchaudio


Nice to Have

• Experience with multilingual or low-resource ASR

• Experience with pronunciation assessment or speech-learning tools

• Experience with datasets such as Common Voice or LibriSpeech


If you’re passionate about building fast, accurate, and privacy-first speech AI systems, we’d love to connect.
