
This job offer is no longer available

Machine Learning Platform Engineer

Royal Bank of Canada
  • US
    United States

About

We're looking for an experienced Machine Learning Platform Engineer who will bring focus and subject-matter expertise to designing and implementing machine learning infrastructure and automation tools (MLOps and DevOps). This is a unique opportunity to grow in the world of machine learning infrastructure and work with a team of passionate individuals committed to the mission of bringing ML to the enterprise.

At RBC Borealis, you'll be joining a team that works directly with leading researchers in machine learning, has access to rich and massive datasets, and offers the computational resources to support ongoing development in areas such as reinforcement learning, unsupervised learning and computer vision. You can find out more about our research areas at rbcborealis.com.

Your responsibilities include:

  • Deploying and operating the GenAI platform across OpenShift/Kubernetes
  • Managing large language model deployments (Cohere Command, Llama, Mistral) on GPU infrastructure (NVIDIA A100/H100), and configuring RAG pipelines with serving frameworks like vLLM, NVIDIA NIM, and TensorRT-LLM
  • Monitoring GPU utilization, model performance metrics, and resource allocation across the platform
  • Implementing observability stacks (Prometheus, Grafana, Pushgateway, and structured logging pipelines) to surface platform health, performance, and security signals
  • Designing and implementing best practices and standards for data and machine learning pipelines across the organization
  • Supporting platform users and cross-functional teams through infrastructure design guidance, thorough documentation, and collaboration across multiple RBC locations
  • Building highly scalable, resilient on-premises systems for hosting machine learning systems using state-of-the-art technologies

You're our ideal candidate if you have:

  • Strong experience designing and operating distributed/ML systems, plus deep Kubernetes/OpenShift knowledge (Helm, operators, custom resources, RBAC, troubleshooting)
  • Proven history building DevOps/CI/CD pipelines (GitHub Actions), multi-stage Docker images, registry mirroring, and infrastructure automation in restricted enterprise environments
  • In-depth knowledge of the various stages of the machine learning application deployment process
  • Proficiency with programming languages such as Python, Bash, or Rust
  • Solid grasp of software engineering best practices (unit and integration testing, coding standards, code reviews, source control) and experience implementing production monitoring and alerting
  • Hands-on experience building and deploying in hybrid and on-premises enterprise environments
  • Familiarity with large language model (LLM) inference and serving frameworks such as vLLM or similar

What's in it for you?

  • Become part of a team that thinks progressively and works collaboratively. We care about seeing each other reach full potential
  • A comprehensive Total Rewards Program including bonuses and flexible benefits, competitive compensation, commissions, and stock options where applicable
  • Leaders who support your development through coaching and managing opportunities
  • Ability to make a difference and lasting impact on a local-to-global scale

About RBC Borealis

RBC Borealis is the driving force behind Royal Bank of Canada's AI and data innovation. As part of Canada's largest financial institution, we bring together a team of architects, engineers, scientists, and product experts on a mission to revolutionize finance through world-class research, solutions, and a resilient data platform. With locations across Toronto, Waterloo, Montreal, Calgary, and Vancouver, we're at the forefront of AI research and platform development. With a focus on cutting-edge research in areas like time series forecasting, causal machine learning, and responsible AI, we are seamlessly integrating AI research and data engineering to solve critical challenges in the financial industry. We are building intelligent, scalable, data-driven solutions that will help communities thrive and drive innovation for our customers across the bank.

Inclusion and Equal Opportunity Employment

RBC is an equal opportunity employer committed to diversity and inclusion. We are pleased to consider all qualified applicants for employment without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, age, disability, protected veteran status, Aboriginal/Native American status or any other legally protected factors. Disability-related accommodations during the application process are available upon request.

Languages

  • English