This job offer is no longer available

Site Reliability Engineer II

Akamai Technologies
  • US
    Cambridge, Massachusetts, United States

About

Job Description
Are you passionate about cutting‑edge AI infrastructure? Do you want to build your SRE career on one of the most exciting platforms in cloud computing?
Join the Akamai Inference Cloud Team
The Akamai Inference Cloud team is part of Akamai's Cloud Technology Group. We design, implement, deploy and operate AI platforms that enable customers to run inference models and developers to create AI applications.
Partner with the best
In this role, responsibilities will include automation, monitoring, incident response, and working collaboratively with skilled team members. Candidates should possess expertise in Linux systems, automation, and SRE practices. Daily activities involve coding, improving dashboards, enhancing alerts, and minimizing repetitive tasks. Opportunities exist to focus on GPU infrastructure, Kubernetes, and ensuring reliability for AI workloads within Akamai's serverless inference platform.
As a Site Reliability Engineer II, you will be responsible for:
  • Building and maintaining dashboards, alerts, and monitoring for inference workloads using Akamai's existing observability platform
  • Writing automation and tooling in Python or Go to reduce operational toil and improve system reliability
  • Building and improving runbooks for inference‑specific operational procedures, integrating into Akamai's existing incident management processes
  • Contributing to SLO tracking and reporting, identifying trends and areas for improvement
  • Supporting CI/CD pipeline maintenance, deployment safety checks, and rollback procedures
  • Collaborating with product engineering teams to troubleshoot complex problems across the stack
  • Participating in on‑call rotations, responding to production incidents, and conducting blameless post‑mortems
To be successful in this role you will:
  • Have 2+ years of experience in Site Reliability Engineering and a Bachelor's Degree or its equivalent experience
  • Demonstrate coding ability in at least one programming language (Python or Go) with experience writing automation
  • Have experience with Linux systems administration and the ability to troubleshoot complex infrastructure issues
  • Show familiarity with Kubernetes and containerization concepts
  • Have experience with monitoring and observability tools such as Prometheus, Grafana, or similar
  • Have exposure to CI/CD pipelines and infrastructure‑as‑code tools (Terraform, SaltStack, or equivalent)
  • Show a willingness to learn and grow, with genuine curiosity about AI infrastructure and distributed systems
FlexBase: Work in a way that works for you
FlexBase, Akamai's Global Flexible Working Program, is based on the principles that are helping us create the best workplace in the world. When our colleagues said that flexible working was important to them, we listened. We also know flexible working is important to many of the incredible people considering joining Akamai. FlexBase gives 95% of employees the choice to work from their home, their office, or both (in the country advertised). This permanent workplace flexibility program is consistent and fair globally, to help us find incredible talent, virtually anywhere. We are happy to discuss working options for this role and encourage you to speak with your recruiter in more detail when you apply.
Benefits
  • Your health
  • Your finances
  • Your family
  • Your time at work
  • Your time pursuing other endeavors
About Us
Akamai powers and protects life online. Leading companies worldwide choose Akamai to build, deliver, and secure their digital experiences, helping billions of people live, work, and play every day. With the world's most distributed compute platform, from cloud to edge, we make it easy for customers to develop and run applications, while we keep experiences closer to users and threats farther away.
Join us
Are you seeking an opportunity to make a real difference in a company with a global reach and exciting services and clients? Come join us and grow with a team of people who will energize and inspire you!
Compensation
Akamai is committed to fair and equitable compensation practices. For US-based candidates only, the base salary for this position ranges from $95,000 to $171,000 per year; a candidate’s salary is determined by various factors including, but not limited to, relevant work experience, skills, certifications, and location. Compensation for candidates outside the US will vary. The compensation package may also include incentive compensation opportunities in the form of annual bonus or incentives, equity awards, and an Employee Stock Purchase Plan (ESPP). Akamai provides industry‑leading benefits including healthcare, a 401(k) savings plan, company holidays, vacation (in the form of PTO), sick time, family-friendly benefits including parental leave, and an employee assistance program including a focus on mental and financial wellness. Eligibility requirements apply.

Languages

  • English