This job offer is no longer available

AIOps Lead, Software Engineering

Zelis
  • US
    United States

About

AIOps Strategy Leader
This role will lead the next phase of Zelis' operational transformation as we accelerate AWS migration and expand AI-native capabilities into the operations space. You will define and drive the AIOps strategy for the organization, combining cloud operations, observability, automation, SRE practices, and AI/agentic solutions to improve reliability, incident response, operational efficiency, and platform resilience.

As a senior technical leader, you will work across Engineering, Cloud, Infrastructure, Security, and Operations teams to design intelligent operational capabilities that move beyond traditional monitoring into proactive, automated, and agent-assisted operations. This role requires strong AWS depth, hands-on experience with observability platforms such as New Relic and OpenSearch, strong AI and agent experience, a strong SRE mindset, and proven ability to apply ChatOps and agentic or multi-agent systems to real-world operational workflows.

What You'll Bring to Zelis

  • Experience & leadership: Typically BS + 12 years or MS + 10 years (or equivalent), with a strong track record leading cloud operations, platform operations, SRE, observability, or AIOps initiatives across complex enterprise environments.
  • AWS depth: Strong hands-on experience designing and operating workloads on AWS, with expertise across compute, networking, storage, security, automation, and cloud operations patterns.
  • Observability expertise: Deep experience with modern observability and monitoring platforms such as New Relic, OpenSearch, and related tools for metrics, logs, traces, dashboards, alerting, and operational analytics.
  • AIOps experience: Proven experience applying AI to operations use cases such as event correlation, anomaly detection, alert reduction, root cause analysis, remediation support, and operational workflow automation.
  • Agentic AI fluency: Strong experience designing or implementing AI agents, agentic workflows, or multi-agent systems that improve operational processes and operator effectiveness.
  • SRE mindset: Strong grounding in site reliability engineering principles, including service reliability, SLOs/SLIs, error budgets, automation, incident management, resilience, and continuous improvement.
  • ChatOps experience: Demonstrated success building or scaling ChatOps practices that improve collaboration, incident response, and operational execution through integrated messaging and workflow automation.
  • Automation & engineering strength: Strong knowledge of scripting, infrastructure automation, operational tooling, APIs, event-driven systems, and platform integration patterns.
  • Problem-solving & execution: Ability to translate operational pain points into scalable technical solutions that improve reliability, speed, and operational maturity.
  • Communication & influence: Able to influence technical teams and senior leaders, build alignment across functions, and communicate complex operational strategies clearly.
  • Governance & trust: Experience implementing operational AI responsibly with appropriate controls for accuracy, security, compliance, explainability, and human oversight.

Languages

  • English