Samsung Semiconductor

Senior Staff Engineer, Memory Fault Management Architect

At a Glance

Location
San Jose, California, United States
Experience
15+ years
Posted
2026-03-20T14:24:39-04:00

Key Requirements

Required Skills

  • Excel
  • Linux

Domain Knowledge

  • Engineering
  • Medical

Requirements

Knowledge of the platform memory subsystem and of platform RAS (Reliability, Availability, Serviceability) features such as ECC, page offlining, hPPR, and hardware sparing.

Experience with ECC design, verification, and reverse engineering.

Understanding of the address mapping between CPU and memory.

Experience modifying memory controller registers.

Understanding of DRAM and HBM failure modes.

An avid learner, you approach challenges with curiosity and resilience, seeking data to help build understanding.

Responsibilities

Drawing on knowledge of SoC controller and memory operation, including RAS features, find and recommend better solutions to reduce the field DRAM failure rate.

Communicate improved ECC schemes to customers based on Samsung DRAM failure modes (DQ and burst).

Interface with customers to establish the value of enabling an in-field fault management architecture.

Contribute to the standardization of DRAM/HBM failure logging in the OCP.

Propose and develop platform RAS (Reliability, Availability, Serviceability) algorithms for memory fault management, such as page offlining and hPPR, and conduct proofs of concept (POCs) with known-failing DIMMs in real servers and applications.