
Senior DevOps Engineer - AWS & AI Platform

U.S. Auto Parts
  • United States

About

AI-Native Engineer Opportunity
CarParts.com is the go-to eCommerce platform for auto care and maintenance. We provide drivers with quality parts at competitive prices and enable them to schedule appointments with trusted mechanics directly through our website. Using world-class design principles and the latest technologies, we deliver a fast, intuitive digital experience backed by our company-owned national distribution network. With over 1,000 employees worldwide, we are scaling rapidly, fueled by our most recent strategic partnership and $35 million investment. This positions us for the next phase of growth as we continue to empower drivers along their journey.

We've built Axle, CarParts.com's domain AI platform and winner of the MACH Alliance Impact Award for Best Multi-Agent Ecosystem, and we're expanding it. This role is central to that expansion.

At CarParts.com, our culture goes beyond our core values of Safety First, Customer Focused, and Commitment to Excellence. We are a performance-driven, data-focused, and fast-paced team where results matter and winning is expected.

  • Hungry & Hardworking: We set ambitious goals, measure progress with clear metrics, and hold ourselves accountable to deliver results.
  • Promote from Within: We reward top performers with opportunities for growth and advancement.
  • Collaborative & In-Person: We believe the best ideas and fastest execution happen face-to-face.
  • High Standards: We move quickly, pay attention to details, and dig deep, whether it's analyzing contracts, aggregating complex scenarios, or building clear, data-driven presentations.
  • No Passengers: We value grit, ownership, and the relentless pursuit of results.

One Exceptional Engineer. AI as the Team.
This is not a standard DevOps posting. We are looking for one unusually capable, AI-native engineer to own our entire platform engineering and SRE function, using autonomous agents, LLM-powered pipelines, and MCP-based tooling as force multipliers to do the work of a team, on-site, in close partnership with our engineering leadership.

You will inherit a mature, fully containerized AWS estate (9 EKS clusters, 27 accounts, 228 Kubernetes nodes), an Akamai CDN layer managing live traffic splits, GitHub Actions + Jenkins CI/CD pipelines for a Webpack 5 micro-frontend monorepo, and an operational AI agent platform, OpsWhisperer, already in production monitoring 25 AWS accounts with a 91% autonomous resolution rate.

Your job is to extend all of it, automate what remains manual, and be the person who makes every deployment, incident, and infrastructure change happen with speed, precision, and intelligence.

Scope of Ownership
What you'll own:

AWS Multi-Account Infrastructure
  • EKS clusters across dedicated AWS accounts
  • EC2 worker nodes via Auto Scaling Groups
  • SQS pipelines
  • AWS Bedrock (Claude) for AI agent workloads

Kubernetes & Containerization
  • EKS clusters
  • Node group management
  • Kops clusters alongside EKS
  • Multiple environment tiers with full blast-radius isolation

CI/CD & Release Management
  • Multiple repos
  • GitHub Actions workflows + Jenkins pipeline management
  • Turbo build system across multiple micro-frontend packages
  • Canary release gating and rollback automation

CDN & Traffic Management
  • Akamai Property Manager config
  • Phased Release Cloudlet for Canary and Production split
  • Security, throttling, and monitoring
  • Jenkins-driven cache invalidation

Observability & Incident Response
  • Elastic/Kibana
  • CloudWatch across all AWS accounts
  • Business performance monitoring
  • SQS backlog + pipeline health alerting
  • On-call ownership with proactive, AI-assisted triage

Non-Negotiable: The AI-Native Expectation
This is a role where AI fluency is not a bonus; it is how you do the job. We expect you to build, operate, and improve autonomous agents that handle monitoring, alerting, triage, and routine operational work. You are not just a consumer of AI tools; you are the person who builds them, deploys them into production, and iterates on them based on real operational data.

You will extend OpsWhisperer (AI Platform and Observability agent), contribute to the Axle platform, build MCP servers that give agents new capabilities, and apply LLM-powered reasoning to infrastructure problems that previously required multiple humans. If you've never built an agent that runs in production unsupervised, this is not the right role.

What You'll Inherit & Extend
The tech stack:

  • Cloud & Orchestration: AWS EKS · Kubernetes · Kops · AWS Organizations · Auto Scaling Groups · AWS SQS · AWS Bedrock · CloudWatch
  • CDN & Networking: Akamai Property Manager · Phased Release Cloudlet · Fast Purge · Content Protector
  • CI/CD & Frontend: GitHub Actions · Jenkins · Turbo (monorepo) · Webpack 5 Module Federation · Canary / Blue-Green Deployments
  • AI & Agentic: MCP (Model Context Protocol) · Claude API / AWS Bedrock · Azure Bot Service · Microsoft Entra ID · Operational AI Agents
  • Observability & Data: Elastic / Kibana · BlueTriangle · Databricks · Cloudinary · New Relic
  • Languages: Node.js / TypeScript · Python · Bash / Shell · SQL · PowerShell

Requirements
What we're looking for:

  • 10+ years of hands-on DevOps, SRE, or platform engineering experience in production AWS cloud environments
  • Deep AWS expertise: EKS, EC2, SQS, CloudWatch, IAM, Organizations, and multi-account architectures
  • Strong Kubernetes skills: cluster operations, node group management, workload isolation, taints/tolerations, auto-scaling
  • Experience with Akamai or equivalent enterprise CDN: configuration, purge operations, traffic routing rules
  • CI/CD ownership: GitHub Actions and/or Jenkins pipeline design, monorepo build systems, release gating
  • Production experience building or operating AI agents: LLM integration, autonomous workflow design, prompt engineering
  • Proficiency in Node.js and/or Python for automation, tooling, and MCP server development
  • Observability stack ownership: Elastic/Kibana, log analysis, alerting design, SLO/SLI instrumentation
  • Comfortable owning on-call responsibility for a production e-commerce platform with significant revenue exposure
  • Strong written and verbal communication: will interface with engineering leadership and present findings to executives
  • Based in or willing to relocate to the Los Angeles / Long Beach area for on-site work

Equal Opportunity Employer

CarParts.com is an equal-opportunity employer. We enthusiastically accept our responsibility to make employment decisions without regard to race, religious creed, color, age, sex, sexual orientation, national origin, religion, marital status, medical condition, physical or mental disability, military service, pregnancy, childbirth and related medical conditions, or any other classification protected by federal, state, and local laws and ordinances. Our management is dedicated to ensuring that we fulfill this policy with respect to hiring, placement, promotion, transfer, demotion, layoff, termination, recruitment advertising, pay, and other forms of compensation, training, and general treatment during employment.

The above-noted job description is not intended to describe, in detail, the multitude of tasks that may be assigned but rather to give the incumbent a general sense of the responsibilities and expectations of his/her position. As the nature of business demands change so, too, may the essential functions of this position.

Language skills

  • English