Senior HPC & DevOps Engineer
Advanced Micro Devices
- London, England, United Kingdom
About
AMD together we advance_
THE ROLE: We are seeking a highly skilled Senior HPC & DevOps Engineer with experience in managing both high-performance computing clusters and modern DevOps infrastructure. The ideal candidate combines expertise in Slurm-managed HPC clusters, GPU compute environments, CI/CD pipelines, and Kubernetes-based orchestration. This person thrives in collaborative, fast-paced environments, drives technical execution with minimal oversight, and has a passion for building reliable, scalable, and high-performance systems.
THE PERSON: The ideal candidate is a skilled engineer with a strong background in DevOps, site reliability, or infrastructure engineering. They are proficient in Kubernetes, CI/CD tools, scripting (Python/Bash), and infrastructure automation frameworks such as Ansible. Experience working with GPU compute environments and integrating automated test workflows is highly valued. This person thrives in collaborative, fast-paced environments and can drive technical execution with minimal oversight. They bring a problem-solving mindset, strong communication skills, and a passion for building reliable, scalable systems.
KEY RESPONSIBILITIES:
Deploy, configure, and maintain HPC clusters using Slurm.
Manage GPU compute nodes, high-speed interconnects, and parallel storage systems.
Design and maintain CI/CD pipelines using Buildkite, GitHub Actions, Jenkins.
Automate infrastructure provisioning and configuration with Ansible, Terraform, Python, Bash.
Deploy containerized applications using Docker, Kubernetes, Helm.
Monitor cluster health and performance; build dashboards with Grafana, Prometheus, Checkmk.
Collaborate across teams to optimize workflows, troubleshoot issues, and document best practices.
PREFERRED EXPERIENCE:
Strong experience with Slurm or equivalent HPC schedulers.
CI/CD, DevOps tools, and automation expertise.
GPU compute and lifecycle management (CUDA/ROCm).
Linux administration, shell scripting, and distributed systems troubleshooting.
Containerization and orchestration (Docker, Kubernetes, Helm).
Agile, collaborative mindset with strong communication skills.
ACADEMIC CREDENTIALS:
Bachelor's or master's degree in computer/software engineering, computer science, or a related technical discipline
Solid years of industry experience
Benefits offered are described in "AMD benefits at a glance".
AMD does not accept unsolicited resumes from headhunters, recruitment agencies, or fee-based recruitment services. AMD and its subsidiaries are equal opportunity, inclusive employers and will consider all applicants without regard to age, ancestry, color, marital status, medical condition, mental or physical disability, national origin, race, religion, political and/or third-party affiliation, sex, pregnancy, sexual orientation, gender identity, military or veteran status, or any other characteristic protected by law. We encourage applications from all qualified candidates and will accommodate applicants’ needs under the respective laws throughout all stages of the recruitment and selection process.
Language skills
- English
Notice to users
This job posting comes from a TieTalent partner platform. Click "Apply now" to submit your application directly on their site.