Reliability and Observability Lead

Vanguard
  • US
    Charlotte, North Carolina, United States

About

At Vanguard, we are committed to delivering an exceptional client experience for all investors. The systems powering this experience operate within a complex and rapidly evolving resiliency landscape. As an Application Engineer within the ChAI (Chat & AI) team, you will contribute directly to building, enhancing, and supporting the conversational AI platform within the Chief Data & Analytics Office (Office). This platform powers voice AI agents, chatbots, and Natural Language Processing capabilities that optimize client interactions across our Personal Investor and Workplace Solutions businesses.

In this role, you will design, build, and support application-level capabilities that improve reliability, performance, and observability for AI and Generative AI workloads. You will also play a key role in migrating the platform from on‑premises hosting to the SaaS offering, ensuring a secure, seamless, scalable, and well‑observed transition. The role blends hands-on engineering, automated testing, resiliency design, and close collaboration with platform partners and SaaS vendors.

Responsibilities:

  • Develop, enhance, and maintain application components that improve system reliability, observability, and performance.
  • Implement application-level instrumentation and telemetry to close observability gaps and strengthen monitoring coverage.
  • Collaborate with platform, AI/ML, and infrastructure teams to evaluate system health, performance, and failure patterns.
  • Build automation and tooling that improves deployment repeatability, enhances resiliency, and reduces operational toil.
  • Develop and maintain automated testing suites and regression test beds to validate functionality, resiliency, and performance.
  • Participate in incident management, troubleshooting, and root-cause analyses, and contribute to recovery and prevention strategies.
  • Contribute to architectural discussions and design reviews, influencing decisions related to scalability, fault tolerance, and non-functional requirements.

SaaS Migration Responsibilities:

  • Support and contribute to the migration of the conversational AI platform from on‑prem to the SaaS offering.
  • Partner with vendor and Vanguard engineering teams to analyze platform gaps, data flows, integrations, security requirements, and service dependencies related to the migration.
  • Assist in the design and execution of migration test plans, including functional, resiliency, and performance validation in the SaaS environment.
  • Develop and enhance automation, telemetry, and regression tests specific to the SaaS platform.
  • Support cutover planning, environment readiness, UAT coordination, and post-migration stability monitoring.
  • Contribute to documentation, runbooks, and operational readiness deliverables for the SaaS environment.

Qualifications:

  • Minimum of eight years of related experience, with at least two years of development experience.
  • Undergraduate degree or equivalent combination of training and experience. Graduate degree preferred.
  • Strong proficiency in Java or ; experience with APIs, multithreaded applications, and GraphQL.
  • Experience building automated testing frameworks (unit, integration, resiliency, performance) and maintaining regression test beds.
  • Experience with observability frameworks/tools such as OpenTelemetry, CloudWatch, Grafana, and Splunk.
  • Familiarity working with SaaS platforms, including designing integrations and implementing observability for SaaS-based products.
  • Experience with containerized and microservices architectures (e.g., Docker) and distributed systems.
  • Working knowledge of AWS networking, application services, IAM concepts, and cloud-native patterns.
  • Comfort with *nix environments, scripting, and command-line tooling.
  • Strong ability to diagnose system issues in high-throughput, mission-critical applications.
  • Excellent communication and documentation skills.
  • Experience with the platform or similar conversational AI platforms; migration or SaaS enablement experience preferred.

Special Factors
Sponsorship
Vanguard is not offering visa sponsorship for this position.

About Vanguard
At Vanguard, we don't just have a mission—we're on a mission.

To work for the long-term financial wellbeing of our clients. To lead through products and services that transform our clients' lives. To learn and develop our skills as individuals and as a team. From Malvern to Melbourne, our mission drives us forward and inspires us to be our best.

How We Work
Vanguard has implemented a hybrid working model for the majority of our crew members, designed to capture the benefits of enhanced flexibility while enabling in-person learning, collaboration, and connection. We believe our mission-driven and highly collaborative culture is a critical enabler to support long-term client outcomes and enrich the employee experience.

Language Skills

  • English