This job offer is no longer available
About
We're building an AI-powered sleep support chat to bring Hatch's sleep expertise to parents at scale. This isn't a thin wrapper around an LLM—it's an application that needs to understand real-world sleep scenarios, align with Hatch's guidance, handle multi-turn conversations, and plug into our device ecosystem.
Your job isn't to build models from scratch. You'll build the application layer around them: FastAPI services, Pydantic models, evaluation tooling, integrations with device APIs, and guardrails to keep responses safe and useful. This is solid engineering work on a product that matters to thousands of families.
What You'll Do
Week 1-2:
- Ramp up on the existing FastAPI service and PydanticAI agent architecture
- Review the sleep content library, device APIs, and common conversation patterns
- Make initial improvements to prompts, validation, or routing logic to stabilize quality
Within 3 Months:
- Drive implementation for Phase 2 of the AI Sleep Chat targeting Q2 2026
- Build out evaluation and testing systems to validate responses against Hatch's sleep guidance
- Work with sleep consultants to translate their knowledge into deterministic logic and model prompts
- Add instrumentation for monitoring, logging, and conversation analytics
- Improve reliability and guardrails around model-driven responses
Within 6 Months:
- Lead the Natural Language Interface project: translate voice/text commands into device API calls
- Design and implement a multi-agent architecture for the command processing pipeline (intent → parameters → CMS lookup → device control)
- Build a routine creation wizard using LLM-powered recommendations
- Establish practical engineering patterns for using LLM APIs (validation, fallbacks, tests, observability)
Technical Depth:
- 4–6+ years professional Python experience
- Strong with FastAPI, Pydantic, async programming, and building production services
- Experience integrating with LLM APIs (OpenAI, Anthropic, Bedrock, etc.)
- Comfortable with prompt design, API orchestration, and response validation
- Solid understanding of monitoring, observability, and debugging distributed systems
AI/ML Systems:
- Experience using hosted model APIs to solve domain problems
- Familiarity with intent classification, entity extraction, and multi-turn conversation flows
- Ability to build evaluation harnesses and safety checks for model output
- Knowing when deterministic logic beats model creativity
Production Engineering:
- Experience with serverless or cloud-native architectures (AWS preferred)
- Able to debug issues that involve both model output and device/control APIs
- History of shipping features under tight, evolving requirements
Working With Domain Experts:
- Can translate guidance from sleep consultants into rules, prompts, and validation logic
- Able to communicate system behavior clearly to product, CX, and business teams
- Pragmatic about constraints and tradeoffs
This Role May Not Be a Fit If You:
- Need hand-holding with modern Python web frameworks
- Expecting to train custom ML models or do research work
- Haven't shipped features that rely on LLM APIs in production
- Prefer theoretical AI work over building real user-facing systems
Parents lean on Hatch when they're frustrated and exhausted at 2am. The AI can't be sloppy or vague. It needs to understand real-world sleep issues—"my 8-month-old keeps waking at 3am"—and respond safely and consistently. Your work keeps the system grounded, reliable, and aligned with proven sleep guidance. If you want to build something that actually helps families during the hardest moments, this is a good place to do it.
Languages
- English