AI Infrastructure Engineering (Cloud, DevOps)

Virtue AI

United States

About

AI Infra Engineer
Location: San Francisco, CA (Onsite | Remote)

Virtue AI sets the standard for advanced AI security platforms. Built on decades of foundational and award-winning research in AI security, its AI-native architecture unifies automated red-teaming, real-time multimodal guardrails, and systematic governance for enterprise apps and agents. Deploy in minutes, across any environment, to keep your AI protected and compliant.

We are a well-funded, early-stage startup founded by industry veterans, and we're looking for passionate builders to join our core team.

As an AI Infra Engineer, you will own the reliability, scaling, automation, and operational discipline of Virtue AI's production AI systems, focusing on deployment and model serving performance.

You will:
Design and maintain deployment workflows for Virtue AI on major cloud providers (e.g., AWS and GCP).
Own IaC (Terraform / Pulumi) for repeatable, auditable customer deployments.
Package our services into secure, customer-ready deployment units (Docker, Helm, Marketplace images).
Design, build, and maintain product CI/CD pipelines using GitHub Actions.
Serve and optimize the LLM inference pipeline: build the necessary inference APIs and routers, and handle auto-scaling.
Design production-grade system observability (metrics, logs, alerts, dashboards) using tools like Datadog, Grafana, and Prometheus.
Implement secure networking (VPCs, IAM, service accounts, private endpoints, firewalling).
Collaborate with product developers to align infrastructure and inference behavior with product requirements.

Required Qualifications
Bachelor's degree or higher in CS, CE, EE, or a related field.
Strong experience deploying production systems on major cloud platforms, e.g., AWS and/or GCP.
Deep hands-on experience with Docker, containerized workloads, and Kubernetes (EKS, GKE, or equivalent).
Strong experience serving LLMs and embedding models in production.
Strong hands-on experience with CI/CD (GitHub Actions required) and repository management (monorepos, release branches, tagging, rollbacks).

Preferred Qualifications
Experience with SGLang, vLLM, or similar inference frameworks.
Strong understanding of GPU behavior (memory limits, batching, fragmentation, utilization) and experience with GPU-level optimization.
Experience with model-level inference optimization (quantization, KV-cache optimization, speculative decoding, or batching strategies) and inference kernels.
Startup experience: you move fast, take ownership, and fix things properly.

Why Join Virtue AI
Competitive salary and equity.
High ownership: you define how production runs.
Real impact: your work directly affects customers and revenue.
Hard problems: distributed systems, GPUs, scale, security.
Strong technical peers: engineers who ship and debug, not just design.

Languages

English

Notice for Users

This job was posted by one of our partners.