jobtraffic

This job offer is no longer available

Principal Platform Engineer

jobtraffic
  • IE
    Ireland

About

Role Overview:

We are seeking a highly experienced Principal Platform Engineer to design, build, and operate secure, scalable, and highly reliable cloud platforms. This role sits at the intersection of platform engineering, site reliability engineering (SRE), and infrastructure security, supporting mission‑critical distributed systems including financial services and blockchain‑based platforms. You will lead the development of resilient multi‑cloud infrastructure, drive reliability and observability standards, and enable engineering teams through self‑service platforms, automation, and GitOps‑based delivery models.

Key Responsibilities:

Platform Engineering & Architecture
  • Design and operate large‑scale, multi‑region infrastructure across AWS, GCP, and Azure
  • Build and evolve Kubernetes platforms (EKS, AKS, GKE) for high‑availability production workloads
  • Define platform standards, golden paths, and reusable infrastructure patterns
  • Architect secure environments, including confidential computing and enclave‑based systems
  • Perform deep troubleshooting across Linux kernel, networking stack, storage, and system performance layers
  • Optimize systems for low‑latency and high‑throughput workloads (CPU pinning, NUMA awareness, IRQ tuning, disk I/O optimization)
  • Diagnose and resolve complex production issues using system‑level tools (e.g., perf, eBPF, strace, tcpdump)
  • Tune OS‑level parameters for containerized and distributed environments
Reliability Engineering (SRE)
  • Define and implement SLOs/SLIs and drive reliability improvements across services
  • Lead incident response, post‑incident reviews, and systemic resilience improvements
  • Improve MTTR through observability, automation, and operational excellence practices
  • Conduct failure‑mode analysis, chaos testing, and capacity planning
Infrastructure as Code & Delivery
  • Build fully automated infrastructure using Terraform, Terragrunt, and related tooling
  • Implement GitOps workflows using tools like Argo CD
  • Develop secure CI/CD pipelines with policy enforcement, provenance, and gated releases
  • Enable zero‑touch deployments and self‑service developer platforms
Observability & Monitoring
  • Define and implement observability strategies across metrics, logs, and traces
  • Work with tools such as Datadog, Prometheus, and OpenTelemetry
  • Improve alert quality, reduce noise, and build actionable runbooks
  • Drive adoption of distributed tracing and end‑to‑end visibility
Required Skills & Experience
  • 10+ years in Platform Engineering, SRE, DevOps, or Linux Systems Engineering roles
  • Deep expertise in Kubernetes (EKS, AKS, GKE) and cloud‑native architectures
  • Strong Linux systems knowledge, including kernel behavior, networking, and performance tuning
  • Proven experience in multi‑cloud environments (AWS, GCP, Azure)
  • Proven track record operating production systems with high availability (99.9%+)
  • Hands‑on experience with Infrastructure as Code (Terraform, Terragrunt)
  • Strong understanding of observability, monitoring, and incident response
  • Experience implementing GitOps and modern CI/CD pipelines
  • Programming/scripting experience (Go, Python, or Bash)


Languages

  • English