Lead DevOps Engineer, Saviance, United States

This job posting is no longer available

Lead DevOps Engineer

Saviance
  • US
    United States

About

Role: Lead DevOps Engineer
Company/Operating Company: AI Center of Excellence in Engineering

About the AI Center of Excellence in Engineering
Are you ready for the next step in your career as a Senior Software Developer? You'll work with an elite group of technologists and create real business impact. With room for initiative, the latest technologies, and an AI-first work environment, you will actively contribute to the development of products across the full portfolio.

About the Role
We're looking for an experienced DevOps Engineer to help build, automate, and maintain both our SaaS cloud infrastructure and on-premises client installations. You'll work closely with development teams to implement robust CI/CD pipelines, manage Kubernetes deployments, and ensure security across our microservices architecture in multiple environments, with a focus on search, AI, and vector database technologies.

Key Responsibilities
  • Design and evolve AI-augmented CI/CD pipelines that serve as reusable blueprints across rewrite projects, supporting multi-tenant SaaS deployments, agentic automation, and environment creation through code.
  • Collaborate with the AI methodology team to refine automation patterns and integrate AI-driven pipeline generation, test orchestration, and telemetry collection into the rewrite process.
  • Develop automated installation and update frameworks for hybrid and customer-managed environments, emphasizing repeatability and low-touch deployment.
  • Manage Azure-based SaaS infrastructure, ensuring reliability, elasticity, and security across Kubernetes and containerized services.
  • Deploy, scale, and optimize Elasticsearch and vector database clusters supporting GenAI workloads.
  • Implement, monitor, and tune LLM and AI service deployments on Azure (OpenAI Service, Cognitive Search, model hosting).
  • Design and maintain federated authentication and identity integration across microservices (Okta, OAuth2, and SSO patterns).
  • Oversee PostgreSQL/MS SQL and data infrastructure, ensuring resilience, automated backup, and performance tuning for high-throughput workloads.
  • Establish observability standards (metrics, traces, and logs) for AI and non-AI services; use insights to improve future rewrites.
  • Embed security automation into every deployment model, enforcing Zero Trust and continuous vulnerability assessment.
  • Partner with development and AI teams to industrialize deployment methodology, transforming learnings from each rewrite into platform-level automation improvements.

Required Experience
  • AI-Augmented CI/CD: Proven experience building and maintaining automated pipelines (GitHub Actions or Azure DevOps) that integrate AI-assisted code generation, testing, and deployment workflows.
  • Version Control & Collaboration: Deep experience with Git-based systems (GitHub, Bitbucket), including managing multi-repo architectures and enforcing branching and governance standards.
  • Kubernetes & Cloud Infrastructure: Advanced proficiency with Kubernetes and Helm; experienced in operating containerized microservices across multiple environments in Azure.
  • Infrastructure as Code (IaC): Strong knowledge of Terraform or Bicep for creating repeatable, parameterized deployment templates used across multiple rewrite projects.
  • Authentication & IAM: Hands-on experience implementing federated identity (Okta, Azure AD, OAuth2/OIDC) across microservices and SaaS environments.
  • PostgreSQL & Data Layer Operations: Skilled in tuning, scaling, and backing up PostgreSQL; familiarity with managing schema migrations in automated CI/CD contexts.
  • Vector & Search Systems: Operational experience with Elasticsearch and vector databases (e.g., Milvus, Pinecone, or Azure AI Search) to support AI-driven use cases.
  • Azure AI & LLM Deployments: Experience provisioning and managing Azure OpenAI, Cognitive Search, and other AI workloads, including model deployment and scaling.
  • Observability & Telemetry: Strong command of Prometheus, Grafana, and distributed tracing; ability to design observability frameworks that feed back into AI-driven optimization loops.
  • Security by Design: Practical application of DevSecOps, vulnerability scanning, and Zero Trust principles; automation of compliance and secret management (Vault or Azure Key Vault).

Nice to Have
  • Experience with AI workflow orchestration and agent monitoring within build or deployment pipelines.
  • Background in deployment automation or customer-managed installers for hybrid environments.
  • Multi-cloud fluency (AWS, GCP, Azure).
  • Containerization expertise with Docker and image lifecycle management.
  • Familiarity with ingress controllers, API gateways, and service mesh solutions such as Istio or Linkerd.
  • Strong scripting and automation skills (Bash, Python, PowerShell).
  • Experience creating and managing Helm charts, Kustomize overlays, and GitOps-style repositories.
  • Skilled in defining operational and quality metrics that inform continuous improvement cycles.
  • Experience integrating code quality and security scanning tools (SonarQube, Trivy, Snyk) into CI/CD pipelines.

What We're Looking For
  • 10+ years of DevOps/SRE experience in both cloud and on-premises environments
  • Strong background in microservices architecture
  • Experience with Elasticsearch and modern AI infrastructure components
  • Familiarity with vector databases (e.g., Pinecone, Milvus, Weaviate)
  • Hands-on experience deploying LLMs on Azure AI or similar platforms
  • Experience automating complex installation processes
  • Strong problem-solving and communication skills
  • Relevant certifications (e.g., CKA, AWS/Azure certifications) are a plus

Language skills

  • English

Notice to users

This posting was published by one of our partners.