Software Engineering

AI Chopping Block, Inc.

San Francisco, California, United States

About

Serve as a hands-on engineering leader, dedicating about 70% of time to individual contributor-level tasks such as coding, reviewing, debugging, and shipping. Architect and maintain scalable, reliable backend infrastructure to support rapid growth and real-time AI-powered communications. Validate and pressure-test architectural decisions across the stack to ensure scalability without over-engineering. Remove blockers to unlock engineering velocity, improve developer tooling, streamline CI/CD, and establish lightweight processes to maintain fast shipping. Enforce high code quality standards while minimizing technical debt. Lead and mentor a growing engineering team, fostering a culture of technical excellence, ownership, and iteration. Drive the technical roadmap in partnership with the founding team, balancing short-term product needs with long-term infrastructure plans. Design and implement core infrastructure supporting AI-native applications including multi-agent workflows, model routing, and real-time feedback loops. Champion AI-first engineering practices across the organization by leveraging AI coding tools, AI-native architectures, and emerging LLMOps patterns. Collaborate cross-functionally with product, design, and data to prioritize and deliver high-impact features.
Build the Intelligence layer at Sierra by working on systems that analyze millions of agent interactions and generate insights to improve agent performance and customer outcomes. Build systems for analyzing, clustering, and exploring large-scale conversational data. Design systems that allow measurement of agent quality and enable experimentation. Develop learning systems involving retrieval, ranking, personalization, and feedback loops to enhance agent effectiveness over time.
Design and build agent architecture that is steerable, verifiable, conversational, and empathetic, and future-proof it as large language models evolve. Develop retrieval methods that ground answers in customer knowledge bases and handle unclear cases, or cases that need clarification, conversationally. Measure agent quality through evaluations and empower customers to improve it. Ensure chat agents also work effectively over the phone, holding lifelike conversations at low latency. Create simulation and benchmarking platforms to test AI agents against real-world scenarios. Develop intuitive no-code content management tools to guide and test AI agents. Through Sierra's agent development lifecycle, adapt traditional software development methodologies to AI agents' non-deterministic behavior, natural language interactions, and reliance on large language models. Accelerate generative agent development using tools like Cursor and Claude Code to build self-improving systems based on interactions, feedback, and self-play.
Build the core systems that power agents, including the Agent SDK and components such as the orchestration engine, runtime, and primitives that define how agents reason, take actions, and interact with users and systems. Design the agentic loop so that agents are steerable, verifiable, conversational, and adaptive. Improve retrieval and grounding systems so that agents retrieve and use knowledge effectively and give accurate, trustworthy responses. Build evaluation frameworks that allow agent quality to be measured and improved over time.
Design and build execution environments for AI agents, including sandboxing, isolation, and reproducibility. Develop systems for agent orchestration across multi-step, tool-using workflows, and build infrastructure for running, testing, and debugging code generated by models. Create state and memory systems that allow agents to persist context across long-running tasks. Optimize tokens, latency, reliability, and cost across Codex’s production fleet. Support model rollouts and capacity planning, and manage tradeoffs between quality, speed, and economics to maintain a fleet of frontier agents at scale. Build shared platform capabilities that unblock product teams, partner teams, and open source Codex.
Build new LLM and instrumentation libraries for emerging LLM providers and agent frameworks. Maintain and enhance existing instrumentation across the Python, TypeScript, and other ecosystems. Drive improvements to the semantic conventions and OpenTelemetry standards that define AI observability. Collaborate with the global developer community through GitHub, Slack, and conferences, as well as with Arize PMs and solution architects. Take complex problems from ideation to completion with full ownership and accountability.
Design and implement production-grade Python services with clean architecture and strong engineering discipline. Architect scalable, distributed systems using Domain-Driven Design (DDD) principles. Integrate with external SaaS systems such as CRMs, dialers, meeting tools, and OAuth providers. Optimize performance, latency, cost, and reliability of AI-driven systems. Build and orchestrate LLM-powered agents including planning, reasoning, tool usage, and memory. Develop internal frameworks to manage agent coordination, tool execution, memory layers, and event-driven workflows. Work closely with AI engineers, product teams, and founders to transform complex sales workflows into autonomous processes. Use AI coding assistants effectively and experiment with AI-augmented development workflows.

Languages

English

Notice for Users

This job comes from a TieTalent partner platform. Click "Apply Now" to submit your application directly on their site.
