Backend Engineer LLM Infrastructure

CloudAct Inc.
  • Sunnyvale, California, United States

About

You will work on the always-in-path FastAPI proxy that sits between every customer request and every provider. That means guardrails, prompt management, A/B testing, cost attribution, rate limits, and streaming. You will own the parts of the request lifecycle that have to stay correct under real production load.

Responsibilities

  • Extend the Nemo Backend FastAPI proxy with new guardrails and features
  • Own streaming, retries, and provider failover correctness
  • Build and maintain cost attribution from x-nemo-request-cost
  • Profile and tune hot paths: every millisecond is in the user-facing latency budget
  • Harden multi-tenancy isolation at the request layer

Requirements

  • 5+ years of backend Python in production
  • Deep experience with asyncio and high-concurrency services
  • Comfortable with Postgres, connection pooling, and query optimization
  • Production experience with streaming APIs or proxies

Nice to have

  • Prior work on LLM APIs, model gateways, or SSE streaming
  • Experience with LLM routing engines or model gateways
  • Performance profiling and flame graph literacy

Language skills

  • English
Notice to users

This listing comes from a TieTalent partner platform. Click "Apply now" to submit your application directly on their site.