Together AI

Engineering Manager, Model Serving

At a Glance

Location: San Francisco, California, United States
Experience: 5+ years
Compensation: $250,000 - $300,000 + equity + benefits
Posted: 2026-03-05T15:33:25-05:00

Key Requirements

Required Skills

Kubernetes

Domain Knowledge

SaaS

Benefits & Perks

Health Insurance

Requirements

5+ years operating production ML inference or training systems at scale

2+ years in senior IC or tech lead roles, with demonstrated mentorship and technical leadership experience. Having built or scaled teams is a plus.

Deep expertise with Kubernetes, multi-cluster orchestration, and ML serving frameworks

Experience with multi-tenant SaaS platforms

Proven track record of SLA ownership with specific metrics (99.9% uptime, p99 latency targets)

Customer escalation and incident communication experience

Compensation & Benefits

We offer competitive compensation, startup equity, health insurance and other competitive benefits. The US base salary range for this full-time position is: $250,000 - $300,000 + equity + benefits. Our salary ranges are determined by location, level and role. Individual compensation will be determined by experience, skills, and job-related knowledge.

Responsibilities

Own availability and performance SLAs for production inference and fine-tuning services across serverless and dedicated deployments

Own & improve testing, deployment, configuration management, and monitoring practices for multi-cluster ML infrastructure – partnering closely with Infra SREs

Build self-serve tooling and automation to reduce operational toil and enable internal users (MLOps, customer experience) and self-serve offerings

Define and enforce configuration best practices for inference engines (SGLang, vLLM, etc.) to prevent runtime issues

Lead incident response, conduct postmortems, and drive reliability improvements

Mentor team members and potentially grow into hiring/team building as the organization scales

About the Company

Together AI is a research-driven artificial intelligence company. We believe open and transparent AI systems will drive innovation and create the best outcomes for society, and together we are on a mission to significantly lower the cost of modern AI systems by co-designing software, hardware, algorithms, and models. We have contributed to leading open-source research, models, and datasets to advance the frontier of AI, and our team has been behind technological advancements such as FlashAttention, Hyena, FlexGen, and RedPajama. We invite you to join a passionate group of researchers on our journey to build the next generation of AI infrastructure.