DEVOPS ENGINEER L4

jobtraffic

Ireland



About

Senior Site Reliability / DevOps Engineer – Observability & Automation

The Senior Site Reliability / DevOps Engineer will be responsible for building resilient, scalable, and observable platforms through strong automation, infrastructure engineering, and SRE best practices. This role blends SRE, DevOps, and platform engineering with hands‑on programming and production ownership in complex, distributed environments.

Key Responsibilities

Design, build, and operate high‑reliability production platforms following SRE and DevOps principles.
Develop automation and tooling using Python and Go to reduce operational toil and improve system reliability.
Implement and maintain Ansible‑based automation for configuration management and infrastructure operations.
Design and operate CI/CD pipelines using Jenkins, GitHub Actions, GitLab CI/CD, and Azure DevOps.
Implement Infrastructure as Code using Terraform and manage Kubernetes configuration using Helm and Kustomize.
Support and operate containerized and cloud‑native workloads on Docker and Kubernetes.
Build, operate, and optimize observability platforms (metrics, logs, traces) using Prometheus, Grafana, ELK, Splunk, or similar tools.
Ensure deep visibility into system health, performance, and availability across distributed environments.
Troubleshoot and resolve critical production issues, performing root‑cause analysis and driving permanent fixes.
Partner with infrastructure, platform, and application teams to improve system reliability, scalability, and operability.

Required Skills & Experience

8+ years of experience in SRE, DevOps, Platform Engineering, or Production Engineering roles.
Strong programming expertise in Python (automation, scripting, internal tooling) and hands‑on experience with Ansible for automation and configuration management.
Strong understanding of Linux internals, networking, and distributed systems.
Proven experience with CI/CD pipelines and Git‑based workflows.
Hands‑on experience with Infrastructure as Code (Terraform) and configuration tooling (Helm, Kustomize).
Solid experience running containerized environments using Docker and Kubernetes.
Strong background in observability engineering (metrics, logs, traces).
Experience working with at least one cloud platform: AWS, Azure, or GCP.
Excellent troubleshooting skills and experience managing high‑severity production incidents.

Good to Have

Experience applying SRE concepts such as SLIs, SLOs, and error budgets.
Experience building internal developer platforms or reliability tooling.

Mandatory Skills: Site Reliability Engineering (SRE).



Languages

English
