Senior Speech AI Engineer – On-Device ASR & Real-Time Pronunciation Intelligence
Capital Numbers
Job Description
We are looking for a Senior Speech AI Engineer to build production-grade, on-device Automatic Speech Recognition (ASR) and real-time speech intelligence systems.
In this role, you’ll work across the full speech AI lifecycle — from audio data pipelines and model development to low-latency streaming inference and edge deployment. You’ll help deliver accurate transcription, phoneme-level alignment, and real-time pronunciation feedback optimized for mobile and edge devices.
Key Skills & Experience
✔ 5–8+ years of experience in Speech AI / Audio ML
✔ Strong Python & PyTorch expertise
✔ Experience with ASR models such as Whisper, Conformer, RNN-T, wav2vec 2.0, HuBERT
✔ Knowledge of speech processing & phoneme alignment
✔ Experience optimizing models for edge / mobile deployment (TensorFlow Lite, ONNX, PyTorch Mobile, CoreML)
✔ Familiarity with libraries such as NVIDIA NeMo, ESPnet, SpeechBrain, torchaudio
Nice to Have
• Experience with multilingual or low-resource ASR
• Prior work on pronunciation assessment or speech learning tools
• Experience with datasets such as Common Voice or LibriSpeech
If you’re passionate about building fast, accurate, and privacy-first speech AI systems, we’d love to connect.