Backend Engineer LLM Infrastructure

CloudAct Inc.
  • US
    Sunnyvale, California, United States

About

You will work on the always-in-path FastAPI proxy that sits between every customer request and every provider. That means guardrails, prompt management, A/B testing, cost attribution, rate limits, and streaming. You will own the parts of the request lifecycle that have to stay correct under real production load.

Responsibilities

  • Extend the Nemo Backend FastAPI proxy with new guardrails and features
  • Own streaming, retries, and provider failover correctness
  • Build and maintain cost attribution from x-nemo-request-cost
  • Profile and tune hot paths: every millisecond is in the user-facing latency budget
  • Harden multi-tenancy isolation at the request layer

Requirements

  • 5+ years of backend Python in production
  • Deep experience with asyncio and high-concurrency services
  • Comfortable with Postgres, connection pooling, and query optimization
  • Production experience with streaming APIs or proxies

Nice to have

  • Prior work on LLM APIs, model gateways, or SSE streaming
  • Experience with LLM routing engines or model gateways
  • Performance profiling and flame graph literacy

Languages

  • English