AI ENGINEER (Speech, Vision & Multimodal Systems)

Hyderabad, Telangana, India

1 day ago

Applicants: 0

Skills: Python, C, Azure, Google Cloud

Salary Not Disclosed

3 weeks left to apply

Job Description

Company Description

Coin Earth is a leading provider of banking and payment solutions to financial institutions worldwide. Founded in 2017 in India, Coin Earth has experienced significant growth with the help of its associates in India. The company values customer feedback and aims to exceed expectations for continued success.

About ChatBucket

ChatBucket is building the future of global communication through cutting-edge AI technology. Our deep tech platform enables real-time translation for voice and video calls, breaking down language barriers and connecting people across the world seamlessly. We're a pre-seed startup backed by ambitious goals and a vision to revolutionize how people communicate.

Job Title: AI Engineer (Speech, Vision & Multimodal Systems)
Location: Hyderabad
Experience: 2-3 Years
Employment Type: Full-Time

About the Role

We are building next-generation multimodal AI systems across speech, text, and vision domains. As an AI Engineer, you will develop, optimize, and deploy real-time AI pipelines for communication, accessibility, and intelligent automation. You will work on LLMs, speech models, vision models, and hybrid pipelines, ensuring they run efficiently at scale and deliver a seamless user experience.

Key Responsibilities

- Build and maintain AI pipelines involving speech recognition, language processing, and speech synthesis.
- Develop real-time inference systems for audio-video communication and multilingual interactions.
- Implement OCR-driven reading and accessibility features, including text extraction, normalization, and spoken output.
- Work on image analysis and summarization systems involving object detection, scene understanding, and caption generation.
- Fine-tune and adapt large language models (LLMs) for conversational tasks, summarization, classification, and structured output generation.
- Optimize AI models using distillation, quantization, pruning, batching, and memory-efficient inference strategies.
- Design multi-user, multi-threaded serving architectures with concurrency handling, async processing, and latency optimization.
- Develop APIs and microservices to integrate AI capabilities into production systems with high reliability.
- Build monitoring, logging, and evaluation frameworks to ensure stable performance across real-time applications.
- Collaborate with cross-functional teams to modularize pipelines, improve scalability, and enhance end-user experience.

Required Skills & Experience

- 2-3 years of experience in ML/AI engineering or deep learning development.
- Strong proficiency in Python and model development workflows.
- Experience working in at least one domain: speech AI, NLP/LLMs, or computer vision.
- Understanding of neural architectures (CNNs, RNNs, Transformers) and training pipelines.
- Hands-on knowledge of LLM fine-tuning, prompt engineering, or custom dataset training.
- Exposure to model deployment, microservices, and containerization.
- Good understanding of parallel processing, async programming, and multi-threaded systems.
- Ability to work with GPU-based workloads and optimize performance.

Nice to Have

- Experience with real-time or streaming AI systems (audio, video, or multimodal).
- Background in LLM optimization (LoRA, QLoRA, PEFT, quantization).
- Experience with voice interfaces, intelligent assistants, or automation bots.
- Familiarity with cloud GPU deployments and production-grade scaling.
- Exposure to active learning, data curation, or evaluation frameworks.

What We Offer

- Opportunity to work on cutting-edge multimodal AI products with global impact.
- Ownership of core AI components from research to deployment.
- Collaborative environment with access to advanced GPU infrastructure.
- Growth in LLM engineering, speech AI, and multimodal system design.

Tech Stack You'll Work With

Languages: Python, C/C++
Frameworks: PyTorch, TensorFlow, Keras, scikit-learn
Optimization: ONNX Runtime, TensorRT
APIs: FastAPI, Flask, gRPC
Deployment: Docker, AWS, Azure, Google Cloud