Site Reliability Engineer (local to Cincinnati, OH)
AKAASA Technologies Inc. • Cincinnati, Ohio, United States

This job posting is no longer available


About

Requirements

· years of experience in Cloud SRE, DevOps, Infrastructure, or related engineering roles

· years working with databases, web applications, microservices, event-driven systems, messaging platforms, REST APIs, integrations, and containerized environments

· Strong knowledge of Java, Spring Boot, microservices architecture, Kafka, Cassandra, and SQL Server

· Proficiency in Python and Shell scripting for automation and operational tooling

· year managing observability platforms such as Dynatrace, ELK, PagerDuty, Datadog, Azure Monitor, or Grafana

· Hands-on experience with GitHub Actions for CI/CD automation

· Strong foundation in Linux architecture, security, performance tuning, troubleshooting, and production operations

· Experience working in Agile delivery teams

· Ability to collaborate effectively with multi-location/global teams

· Demonstrated ability to contribute at both a tactical and strategic level

· Familiarity with eCommerce, fulfillment, or retail technology environments

· Strong written, verbal, and presentation skills

Nice to Have

· years of experience designing or supporting high-volume eCommerce applications

· years configuring and managing cloud environments in Azure, AWS, or GCP

· Hands-on experience with Kafka, Cosmos DB, Cassandra, Ansible, Terraform, Docker, Kubernetes (1+ year)

· Experience with Nginx, HAProxy, or Squid

· Experience building CI/CD pipelines with Jenkins, Spinnaker, Azure DevOps, or TeamCity

· Experience implementing and managing RoyalTS or similar cross-platform remote management tools

Responsibilities

· Partner with application engineering, observability, and support teams — along with business operations and third-party partners — to identify, prioritize, and resolve issues impacting customer pickup and delivery operations

· Lead root cause analysis of critical business and production incidents, ensuring corrective actions and preventive measures are implemented

· Manage and facilitate Major Incident calls for the Pickup Fulfillment domain, providing timely and accurate updates to key stakeholders during service restoration

· Collaborate with engineering teams to continuously enhance build environments for improved reliability, speed, and scalability

· Drive automation initiatives to improve system efficiency, deployment accuracy, and operational quality

· Ensure system traceability, observability, and retrievability to support issue diagnosis and performance monitoring

· Build and maintain comprehensive logging, monitoring, and alerting systems to proactively identify bottlenecks and enable performance optimization across cloud, on-prem, and in-store environments

· Develop detailed documentation, playbooks, and design guides to support operational readiness and incident response consistency

· Participate in off-hours on-call rotation and scheduled maintenance windows to ensure uninterrupted system availability


Language skills

English
