Cognitive Collective

Perception Engineer

Tavus

San Francisco, CA, USA · Remote
Posted on May 5, 2025

About Us

At Tavus, we're building the human layer of AI. Our mission is to make human-AI interaction as natural as face-to-face interaction, enabling the human touch where it has been previously unscalable. We achieve this through pioneering research in multi-modal AI models for human perception and understanding, combined with state-of-the-art human avatar rendering and communication models. Our models power everything from text-to-video AI avatars to real-time conversational video experiences across industries like healthcare, recruiting, sales, education, and more. By enabling AI to see, hear, and communicate with human-like authenticity, we're creating the foundation for the next generation of AI employees, assistants, and companions.

We're a Series A company backed by top investors, including Sequoia, Y Combinator, and Scale VC. Join us in driving the future of human-AI interaction.

The Role

We’re looking for a Perception Engineer to help advance the core visual understanding systems behind Tavus’ AI-generated video experiences. In this role, you’ll work on foundational models and systems that enable our avatars to "see" and interpret the world - from facial dynamics and motion tracking to scene understanding and multi-modal perception.

You’ll join a small, fast-moving applied ML team where experimentation is encouraged and ownership is expected. We’re not just iterating - we’re inventing. If you’re excited about solving real-world computer vision problems and shipping production-ready models that power next-gen human-AI interaction, we want to talk.

Your Mission 🚀

  • Develop and deploy perception models for tasks like emotion recognition, pose estimation, motion transfer, and scene parsing

  • Own the data pipelines and tooling necessary to train and evaluate these models at scale

  • Collaborate with researchers and engineers to integrate perception models into our real-time conversational product

  • Design and run experiments to optimize accuracy, speed, and robustness across diverse video conditions

  • Stay at the forefront of vision research and bring new ideas from paper to product

Requirements

  • 2-3+ years of experience building computer vision or ML systems in production

  • Strong Python and PyTorch skills, with experience in real-time or low-latency video applications

  • Deep understanding of at least two of the following: facial recognition, emotion recognition, generative vision models, or 3D reconstruction

  • You’re a self-starter with a bias toward action and a passion for solving hard, ambiguous problems

Bonus if you have:

  • Experience building inference systems optimized for performance and scale

  • Background in multi-modal learning or fusing vision + audio inputs

  • Published or implemented research papers in computer vision or generative media

  • Played Portal 1 and 2 - or willing to as part of onboarding 😄

Benefits

When you join Tavus, you’re joining a family. Our work is driven by our team, and our success is shared by all. This position comes with a flexible work schedule, unlimited PTO, competitive healthcare and gear stipends, and, of course, plenty of fun! At the end of the day, we want Tavus to be a place for you to learn, directly drive impact, and be with a team you love.

To learn more about our team culture and benefits, check out our hiring page!

Tavus is growing fast, and we’d like you to grow with us! Are you excited to get your hands dirty? Drop your resume and we’ll be in touch!

We are not looking for cultural fits; we are looking for culture creators. In fact, diversity is what drives our success – it’s at the core of how we hire, communicate, and work. We are inclusive of all and combine our diverse backgrounds, skill sets, and thinking to build the best experiences for our clients.