Cerebras Systems

Performance & Reliability Engineer

At a Glance

Location
Canada
Experience
3+ years
Posted
2026-02-17T12:05:36-05:00

Key Requirements

Required Skills

Python

Domain Knowledge

  • Engineering

Requirements

BS, MS, or PhD in Computer Science, Electrical Engineering, or a related field.

3+ years of relevant experience in performance engineering, reliability, computer architecture, and/or software design.

Proficiency in Python or other scripting languages.

Experience with C/C++ and assembly programming.

Demonstrated expertise with system-level performance and reliability optimization.

Strong verbal and written communication skills.

Responsibilities

Join Cerebras as a Performance & Reliability Engineer within our innovative Co-Design and Next Generation Team. Our groundbreaking CS-3 system has set new benchmarks in high-performance ML training and inference solutions. It leverages a dinner-plate-sized chip with 44 GB of on-chip memory to surpass traditional hardware capabilities. This role focuses on characterizing and optimizing the performance and reliability of state-of-the-art AI models running on Cerebras' breakthrough hardware.

Characterize and enhance the performance and reliability of advanced ML hardware/software systems, with emphasis on reducing power and thermal fluctuations.

Analyze ML workloads, software kernels, and hardware architecture for power and performance impacts, and synthesize high-level insights across these layers.

Develop creative software solutions to improve reliability and performance, collaborating cross-functionally to deploy these solutions in production.

Influence the design of Cerebras' next-generation AI architecture and software stack through rigorous workload analysis and computational efficiency optimization.