togetherai
Engineering Manager, Model Serving
At a Glance
- Location
- San Francisco, California, United States
- Experience
- 5+ years
- Compensation
- $250,000 - $300,000 + equity + benefits
- Posted
- 2026-03-05T15:33:25-05:00
Key Requirements
Required Skills
Domain Knowledge
- SaaS
Benefits & Perks
Startup equity, health insurance, and other competitive benefits.
Requirements
5+ years operating production ML inference or training systems at scale
2+ years in senior IC or tech lead roles, with demonstrated mentorship and technical leadership experience. Having built or scaled teams is a plus.
Deep expertise with Kubernetes, multi-cluster orchestration, and ML serving frameworks
Experience with multi-tenant SaaS platforms
Proven track record of SLA ownership with specific metrics (99.9% uptime, p99 latency targets)
Customer escalation and incident communication experience
Compensation & Benefits
We offer competitive compensation, startup equity, health insurance, and other benefits. The US base salary range for this full-time position is $250,000 - $300,000 + equity + benefits. Our salary ranges are determined by location, level, and role. Individual compensation will be determined by experience, skills, and job-related knowledge.
Responsibilities
Own availability and performance SLAs for production inference and fine-tuning services across serverless and dedicated deployments
Own and improve testing, deployment, configuration management, and monitoring practices for multi-cluster ML infrastructure, partnering closely with Infra SREs
Build self-serve tooling and automation to reduce operational toil and enable internal users (MLOps, customer experience) and self-serve offerings
Define and enforce configuration best practices for inference engines (SGLang, vLLM, etc.) to prevent runtime issues
Lead incident response, conduct postmortems, and drive reliability improvements
Mentor team members and potentially grow into hiring/team building as the organization scales
About the Company
Together AI is a research-driven artificial intelligence company. We believe open and transparent AI systems will drive innovation and create the best outcomes for society, and together we are on a mission to significantly lower the cost of modern AI systems by co-designing software, hardware, algorithms, and models. We have contributed to leading open-source research, models, and datasets to advance the frontier of AI, and our team has been behind technological advancements such as FlashAttention, Hyena, FlexGen, and RedPajama. We invite you to join a passionate group of researchers on our journey to build the next generation of AI infrastructure.