DevOps Engineer – Site Reliability Engineer

remoterocketship

United States

About

Job Description:

Own cloud infrastructure on AWS: EC2, EKS, RDS, S3, IAM
Manage Kubernetes clusters and container orchestration end-to-end
Build and maintain CI/CD pipelines using GitHub Actions or similar
Implement monitoring, alerting, and observability stacks (Prometheus, Grafana, or DataDog)
Improve reliability, performance, and security of production systems
Automate infrastructure with Terraform or similar IaC tools
Debug and resolve issues across complex, distributed systems
Participate in design reviews and help raise the infrastructure bar

Requirements:

3–5 years in DevOps, SRE, or infrastructure engineering
Strong AWS experience: EKS, EC2, RDS, S3, IAM
Kubernetes: deployment, scaling, troubleshooting in production
CI/CD pipelines: GitHub Actions, ArgoCD, or similar
Infrastructure as Code: Terraform, Pulumi, or CDK
Python or Go scripting
Experience working in production environments with real users
Comfort with ambiguity and ability to operate autonomously

Benefits:

Competitive compensation and meaningful equity
Direct impact on frontier AI model training and evaluation infrastructure
Flexible, remote-friendly environment with low bureaucracy
A small, high-caliber team with deep AI research expertise
Health, wellness, and learning & development benefits

Language skills

English

