This job posting is no longer available

DevOps / Site Reliability Engineer (AWS/Azure, Dynatrace)

Purple Drive
  • United States

About

Role Summary

We are seeking a highly skilled DevOps / Site Reliability Engineer (SRE) to design, build, and operate scalable, reliable, and secure cloud platforms on AWS and Azure. The ideal candidate will have strong experience in Dynatrace monitoring and observability configurations, CI/CD automation, infrastructure reliability, and production support. This role focuses on improving system availability, performance, and operational excellence across complex distributed systems.

Key Responsibilities

DevOps & Cloud Engineering
  • Design, implement, and maintain cloud infrastructure on AWS and Azure using Infrastructure as Code (Terraform, ARM, Bicep, CloudFormation).
  • Build and manage CI/CD pipelines using tools such as Jenkins, Azure DevOps, GitHub Actions, or GitLab CI.
  • Automate provisioning, configuration management, and deployment processes.
  • Support containerized and microservices-based architectures using Docker and Kubernetes (EKS / AKS).

Site Reliability Engineering (SRE)
  • Define and enforce SLIs, SLOs, and SLAs to ensure service reliability.
  • Lead incident response, root cause analysis (RCA), and post-incident reviews.
  • Implement proactive reliability practices, capacity planning, and performance optimization.
  • Reduce operational toil through automation and self-healing mechanisms.

Monitoring & Observability (Dynatrace)
  • Configure and manage Dynatrace for full-stack observability across applications, infrastructure, and cloud services.
  • Implement:
    • OneAgent deployments (VMs, containers, Kubernetes)
    • Custom dashboards, alerts, and anomaly detection
    • Service flow, distributed tracing, and RUM
  • Integrate Dynatrace with CI/CD pipelines, ITSM tools (ServiceNow), and alerting systems.
  • Tune alerts to minimize noise and improve actionable insights.

Security & Compliance
  • Implement cloud security best practices, including IAM, secrets management, and encryption.
  • Integrate security and compliance checks into CI/CD pipelines.
  • Collaborate with security teams on vulnerability remediation and audits.

Collaboration & Operations
  • Work closely with development, QA, security, and platform teams.
  • Provide on-call support for production systems as part of an SRE rotation.
  • Document operational runbooks, standards, and best practices.

Required Skills & Qualifications

Technical Skills
  • Strong hands-on experience with AWS and/or Azure (compute, networking, storage, monitoring).
  • Proven experience in Dynatrace configuration and administration.
  • Expertise in Linux/Unix environments.
  • Hands-on scripting experience (Bash, Python, PowerShell).
  • Experience with Kubernetes, Docker, and microservices architectures.
  • Strong knowledge of CI/CD tools and Git-based version control.
  • Experience with logging and monitoring tools (Dynatrace, Prometheus, Grafana, ELK).

DevOps & SRE Practices
  • Solid understanding of SRE principles, reliability engineering, and high availability design.
  • Experience with incident management, RCA, and performance tuning.
  • Familiarity with infrastructure automation and configuration management tools.

Preferred Qualifications
  • Cloud certifications (AWS, Azure, or Kubernetes).
  • Experience integrating Dynatrace with cloud-native services and Kubernetes.
  • Knowledge of service mesh, API gateways, or event-driven architectures.
  • Exposure to FinOps, cost optimization, or multi-cloud strategies.

Key Competencies
  • Strong troubleshooting and problem-solving skills.
  • Ability to work in high-availability, production-critical environments.
  • Excellent communication and stakeholder collaboration skills.
  • Continuous improvement mindset with a focus on automation and reliability.

  • United States

Language skills

  • English

Notice to users

This posting was published by one of our partners. The original posting is available on the partner's site.