ML Team Lead (Speech Synthesis)
Rime
Job Title: Team Lead - ML Scientist
Location: Hybrid + Flexible
Experience Level: Mid-Senior
About Us:
We are a cutting-edge speech ML startup at the forefront of innovation in audio and machine learning. Our mission is to make it easier for enterprise developers of high-impact voice applications to ship compelling experiences.
We’re looking for a highly motivated ML Engineer with a speech synthesis focus to join our dynamic team. This role offers the opportunity to solve complex engineering challenges, contribute to core product development, and grow into a leadership position as our team expands.
Responsibilities:
Build and maintain our products and platform on top of our cutting-edge voice models, which power hundreds of millions of conversations every month.
Train SOTA models to synthesize speech in any language for real-time conversations that power hundreds of millions of brand experiences every month.
Work closely with the product engineering team to expose and integrate key features of our models into the product.
Define the technical roadmap for developing multimodal assistive agents for our products, working closely with technical leads and product managers, and drive execution from concept to deployment. You will provide guidance based on your domain knowledge and experience, ensuring alignment with the overall deployment strategy.
Requirements:
You've spent a few years mastering one or two specific areas, such as vocoders, LLMs, speech encoders (w2v2, X-hubert, etc.), diffusion, or flow matching
Experience with distributed training across multiple GPU nodes
Experience processing extremely large amounts of text, audio, or video data for downstream use in experiments and training
Proficient with PyTorch
English language proficiency
Preferred Skills:
Understanding of the interface between text normalization and speech synthesis inference
Understanding of and/or experience with inference optimization techniques
Knowledge of current approaches to zero- and few-shot voice cloning
Experience with architecting and maintaining complex MLOps systems
Hands-on experience with LLM-based approaches to speech synthesis
Basic familiarity with full-duplex modeling of turn-taking (a.k.a. speech-to-speech modeling)
What We’re Looking For:
A self-starter with high initiative and the ability to work autonomously.
Proven track record of successfully building and deploying AI-powered solutions in an industry setting.
Excellent problem-solving skills and a solutions-oriented mindset.
Strong communication skills to collaborate effectively across teams.
An eye for the bigger picture and the ambition to take on leadership responsibilities in the near future.
What We Offer:
A collaborative and innovative work environment.
Full health benefits (medical, dental, and vision).
Opportunities for growth and leadership as the team expands.
Competitive salary and equity options.
Flexible work hours and remote-first culture.
Appreciation for work-life balance.
Beautiful office in the heart of SF, close to public transit.
If you’re excited to work on the cutting edge of Voice AI and machine learning technology, we’d love to hear from you!