scaleai

Tech Lead/Manager, Machine Learning Research Scientist - LLM Evals

At a Glance

Location: United States
Experience: 5+ years
Department: Research
Posted: 2026-02-12T18:47:24-05:00

Key Requirements

Required Skills

Machine Learning
NLP

Domain Knowledge

Education
Energy
Engineering
Government

Responsibilities

As the leading data and evaluation partner for frontier AI companies, Scale is dedicated to advancing the evaluation and benchmarking of large language models (LLMs). We are building industry-leading LLM evals, setting new standards for model performance assessment. Our mission is to develop rigorous, scalable, and fair evaluation methodologies to drive the next generation of AI capabilities.

Our Research teams work with the industry’s leading AI labs to provide high-quality data and accelerate progress in GenAI research. As the Tech Lead Manager of the LLM Evals Research team, you will lead a talented team of research scientists and research engineers focused on developing and implementing novel evaluation methodologies, metrics, and benchmarks to assess the capabilities and limitations of our cutting-edge LLMs. This role is critical for designing and executing a roadmap that defines best practices in data-driven AI development and will accelerate the next generation of generative AI models in partnership with top foundational model labs.

You will:

Lead a team of highly effective research scientists and research engineers on LLM evals.

Conduct research on the effectiveness and limitations of existing LLM evaluation techniques.

Design and develop novel evaluation benchmarks for large language models, covering areas such as instruction following, factuality, robustness, and fairness.

About the Company

At Scale, our mission is to develop reliable AI systems for the world's most important decisions. Our products provide the high-quality data and full-stack technologies that power the world's leading models, and help enterprises and governments build, deploy, and oversee AI applications that deliver real impact. We work closely with industry leaders like Meta, Cisco, DLA Piper, Mayo Clinic, Time Inc., the Government of Qatar, and U.S. government agencies including the Army and Air Force. We are expanding our team to accelerate the development of AI applications.

We believe that everyone should be able to bring their whole selves to work, which is why we are proud to be an inclusive and equal opportunity workplace. We are committed to equal employment opportunity regardless of race, color, ancestry, religion, sex, national origin, sexual orientation, age, citizenship, marital status, disability status, gender identity or Veteran status.

We are committed to working with and providing reasonable accommodations to applicants with physical and mental disabilities. If you need assistance and/or a reasonable accommodation in the application or recruiting process due to a disability, please contact us at accommodations@scale.com. Please see the United States Department of Labor's Know Your Rights poster for additional information.

We comply with the United States Department of Labor's