Head of Evaluation
Harvey
San Francisco, CA, USA
Why Harvey
Harvey will be a category-defining company at the application layer built on top of foundation models like GPT-4.
- Exceptional product-market fit: multiple multi-million-dollar deals with the largest professional services providers (e.g. PwC) and the largest law firms on Earth (e.g. Allen & Overy).
- Massive demand: 15,000 law firms on our waitlist.
- World-class team: ex-DeepMind, Google Brain, FAIR, Tesla Autopilot. Former founding engineers at $1B+ startups like Superhuman and Glean.
- Work directly with OpenAI to build the future of generative AI and redefine professional services.
- Top-of-market cash and equity compensation.
Challenges
We are building systems that can automate the most complex knowledge work in the world, e.g. billion-dollar litigation and corporate transactions.
- Dealing with extremely sensitive data: client data from the largest companies in the world.
- Working past the edge of published AI research: tackling problems far beyond the complexity of existing AI benchmarks.
- Unsolved product, architectural, and business problems: natural-language interfaces, prohibitively expensive model evaluation, massive marginal costs, and versioning, training, and segregating models per task, legal system, practice area, client, and client’s clients.
Role
We are looking for a technical lead who can own the development of our evaluation platform. In this role, you will:
- Build a team of 10-20 researchers and engineers with experience evaluating LLMs and large-scale AI systems.
- Lead research and development of novel model-based evaluation methods and language model programs for evaluating complex tasks in legal and professional services.
- Design and implement a red-teaming pipeline for our custom models and collaborate with other research teams to fine-tune models from human feedback.
- Train reward models that accurately reflect the preferences of top-tier domain experts.
- Experiment with synthetic data generation and LLM-based data augmentation to complement human-generated eval benchmarks.
Impact
- Lead research and development of Harvey’s evaluation platform.
- Contribute to a product that transforms the nature of professional services.
- Help define what it means for LLMs to effectively perform complex knowledge work tasks.
- Work directly with our founders, research, and product teams, as well as foundation model providers like OpenAI.
- Tackle unsolved research and engineering problems, including some of the hardest in the world for running LLMs in production.
Qualifications
- 5+ years of experience leading highly technical teams composed of both researchers and engineers.
- Experience evaluating large-scale AI systems in high-stakes settings.
- Technical: can serve as a tech lead and contribute substantially to our codebase as necessary.
- Ability to communicate complex technical outcomes to diverse stakeholders.
- Strong conviction in setting technical direction.