Machine Learning Systems Intern
BrainChip, United States

This job posting is no longer available.


About

Hybrid SSM‑Transformer models have a unique advantage for on‑chip memory efficiency: SSM layers compress sequence history into a fixed‑size recurrent state, while attention layers store key‑value caches that grow with context length. This leads to an important design question: for a given model configuration and maximum context length, can on‑chip SRAM be sized so that inference runs entirely on chip, eliminating the need for slower off‑chip HBM or DRAM?

What the intern will work on: the intern will model and analyze memory behavior during inference of hybrid SSM‑Transformer models, with a focus on avoiding off‑chip memory accesses. Key responsibilities include:
  • Modeling data movement between SRAM and HBM/DRAM during inference
  • Sweeping parameters such as SRAM capacity
  • Mapping the feasibility boundary where inference can be performed fully on chip
  • Breaking down per‑layer memory working sets
  • Identifying when and why memory spills occur
  • Exploring tiling and scheduling strategies to extend the no‑spill region
  • Validating analytical results through simulation
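To illustrate the kind of analysis described above, here is a minimal first-order sketch of the memory model: a KV cache that grows linearly with context length versus a fixed-size SSM state, and the resulting feasibility boundary for a given SRAM budget. All model dimensions, byte widths, and the SRAM capacities below are illustrative assumptions, not BrainChip hardware or model specifications.

```python
def kv_cache_bytes(n_attn_layers, n_kv_heads, head_dim, seq_len, bytes_per_elem=2):
    """KV cache grows linearly with context length (keys + values per layer)."""
    return 2 * n_attn_layers * n_kv_heads * head_dim * seq_len * bytes_per_elem

def ssm_state_bytes(n_ssm_layers, d_inner, d_state, bytes_per_elem=2):
    """SSM recurrent state is fixed-size, independent of context length."""
    return n_ssm_layers * d_inner * d_state * bytes_per_elem

def max_onchip_context(sram_bytes, weights_bytes, n_attn_layers, n_kv_heads,
                       head_dim, n_ssm_layers, d_inner, d_state):
    """Largest context length whose inference working set still fits in SRAM.

    First-order model only: ignores activations, tiling overheads, and
    scheduling effects, which a real analysis would have to include.
    """
    fixed = weights_bytes + ssm_state_bytes(n_ssm_layers, d_inner, d_state)
    free = sram_bytes - fixed
    if free <= 0:
        return 0  # even the context-independent working set spills off chip
    per_token = kv_cache_bytes(n_attn_layers, n_kv_heads, head_dim, seq_len=1)
    return free // per_token

# Example sweep of SRAM capacity (weights ignored for simplicity):
# maps where the feasibility boundary falls for a toy hybrid config.
for mib in (8, 16, 32):
    ctx = max_onchip_context(mib * 2**20, 0, n_attn_layers=4, n_kv_heads=8,
                             head_dim=64, n_ssm_layers=4, d_inner=2048, d_state=16)
    print(f"{mib} MiB SRAM -> max fully on-chip context: {ctx} tokens")
```

Sweeping the other axes listed above (context length, head counts, SSM state size) works the same way, and the point where `max_onchip_context` reaches zero marks where spills to HBM/DRAM become unavoidable under this simplified model.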

Languages

  • English
Note for users

This job posting was published by one of our partners. You can view the original posting here.