Senior AI Backend Engineer – LLM + RAG System

FreelanceJobs

Canada

About

We are a U.S.-based beauty brand developing a custom mobile application.
As part of our product roadmap, we are building a production-grade, in-app AI assistant designed to deliver structured, education-based guidance within our mobile ecosystem.
We are looking for a senior-level backend engineer with proven experience building LLM-based systems in production environments.
This is neither a prompt-writing task nor a simple chatbot-wrapper project.
We require architectural thinking, system design capability, and real experience deploying AI services at scale.
Mobile UI development is handled by a separate team. This role is strictly backend + AI infrastructure.
Project Objective
Design and implement a secure, scalable AI backend ("Guide API") that:
Integrates with an LLM provider (e.g., OpenAI or equivalent)
Implements a robust Retrieval-Augmented Generation (RAG) system
Operates within strict safety and compliance boundaries
Produces structured, deterministic responses suitable for mobile consumption
Supports future scaling and feature expansion
This assistant will operate in a sensitive product category and must avoid diagnostic or medical positioning.
Required Experience (Mandatory)
Please do not apply unless you meet the following:
4+ years of backend development experience
Proven experience building LLM-based systems beyond simple chat wrappers
Hands-on implementation of RAG architecture (retrieval pipelines + embeddings + vector search)
Experience working with vector databases (Pinecone, Weaviate, Supabase vector, or similar)
Experience implementing input/output moderation layers
Experience designing production APIs consumed by mobile or web clients
Understanding of prompt versioning, testing, and rollback strategies
Experience deploying secure production services (cloud-based)
We will request technical details about past AI systems you have built.
Scope of Work
1. Architecture & Infrastructure
Design Guide API as a standalone backend service
Server-side LLM integration
Secure key management and environment separation
Clean modular architecture
2. RAG System Design
Knowledge ingestion pipeline
Document chunking strategy
Embedding generation and indexing
Retrieval strategy tuning
Context window optimization
Latency-performance tradeoff management
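As an illustration of the chunking and retrieval items above, here is a minimal sketch in Python. It is not the client's design: the function names, the window sizes, and the word-count "embedding" are all assumptions made for the example. A real system would call an embedding model and a vector database instead of `toy_embed` and the in-memory ranking.

```python
import math
from collections import Counter


def chunk_words(text: str, size: int = 40, overlap: int = 10) -> list[str]:
    """Split text into overlapping word windows (one simple chunking strategy)."""
    if overlap >= size:
        raise ValueError("overlap must be smaller than size")
    words = text.split()
    step = size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break  # the last window already reached the end of the text
    return chunks


def toy_embed(text: str) -> Counter:
    """Placeholder embedding (assumption): lowercase word counts."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(query: str, chunks: list[str], top_k: int = 2) -> list[str]:
    """Rank chunks by similarity to the query and keep the best top_k."""
    q = toy_embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, toy_embed(c)), reverse=True)
    return ranked[:top_k]
```

Smaller chunks and a lower `top_k` reduce prompt size and latency at the cost of recall, which is the tradeoff the list above refers to.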
3. AI Governance & Safety
Input moderation prior to model call
Output moderation post-generation
Red-flag detection logic
Strict enforcement of non-diagnostic boundaries
Structured fallback mechanisms
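The governance steps above could be wired together roughly as follows. This is a sketch under stated assumptions: the regular expressions, state names, and fallback messages are hypothetical placeholders, and a production service would more likely rely on a moderation model or provider endpoint than on keyword patterns.

```python
import re
from typing import Callable

# Hypothetical patterns for illustration only.
RED_FLAG_INPUT = re.compile(r"\b(bleeding|infection|severe pain)\b", re.IGNORECASE)
DIAGNOSTIC_OUTPUT = re.compile(r"\b(you have|diagnos\w*|prescri\w*)\b", re.IGNORECASE)

FALLBACK_RED_FLAG = "I can't help with that here. Please consult a qualified professional."
FALLBACK_BLOCKED = "I can share general education only. Could you rephrase your question?"


def guarded_reply(user_text: str, generate: Callable[[str], str]) -> tuple[str, str]:
    """Return (safety_state, message) after input and output checks.

    `generate` stands in for the server-side LLM call.
    """
    # 1. Input moderation: red-flag inputs never reach the model.
    if RED_FLAG_INPUT.search(user_text):
        return "red_flag", FALLBACK_RED_FLAG

    draft = generate(user_text)

    # 2. Output moderation: block drafts that read as diagnostic.
    if DIAGNOSTIC_OUTPUT.search(draft):
        return "blocked_output", FALLBACK_BLOCKED

    return "ok", draft
```

Returning a state alongside the message lets the API surface a safety flag to the mobile client without the client having to parse the text.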
4. Structured Response Contract
API must return structured data:
Assistant message
Quick reply suggestions
Resource cards (FAQ, routine, product references)
Safety state flag
The output must be machine-consumable, not only raw free text.
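One possible shape for this contract, sketched with Python dataclasses. The field names, the allowed card kinds, and the safety state values are assumptions for illustration; the posting specifies only the four categories of data, not a schema.

```python
import json
from dataclasses import asdict, dataclass, field

ALLOWED_CARD_KINDS = {"faq", "routine", "product"}  # assumed from the list above


@dataclass
class ResourceCard:
    kind: str    # one of ALLOWED_CARD_KINDS
    title: str
    ref_id: str  # hypothetical identifier the mobile app resolves

    def __post_init__(self) -> None:
        if self.kind not in ALLOWED_CARD_KINDS:
            raise ValueError(f"unknown card kind: {self.kind}")


@dataclass
class GuideResponse:
    message: str
    quick_replies: list[str] = field(default_factory=list)
    resource_cards: list[ResourceCard] = field(default_factory=list)
    safety_state: str = "ok"

    def to_json(self) -> str:
        """Serialize the whole response, including nested cards."""
        return json.dumps(asdict(self))
```

A response might then be built and serialized like this:

```python
resp = GuideResponse(
    message="Here is a simple evening routine.",
    quick_replies=["Show morning routine"],
    resource_cards=[ResourceCard("routine", "Evening basics", "routine-001")],
)
payload = resp.to_json()
```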
5. Performance & Cost Management
Token usage optimization
Rate limiting
Cost monitoring
Graceful degradation strategies
Logging & observability
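For the rate-limiting item, a token bucket is one common approach. The sketch below is an in-memory, single-process version with hypothetical names; a deployed service would typically keep the bucket state in a shared store so that limits hold across instances.

```python
import time
from typing import Callable


class TokenBucket:
    """Allow bursts up to `capacity`, refilled at `refill_per_sec` tokens per second."""

    def __init__(
        self,
        capacity: int,
        refill_per_sec: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.capacity = capacity
        self.refill_per_sec = refill_per_sec
        self.clock = clock  # injectable so the limiter can be tested deterministically
        self.tokens = float(capacity)
        self.updated = clock()

    def allow(self, cost: float = 1.0) -> bool:
        """Spend `cost` tokens if available; return False when the caller should back off."""
        now = self.clock()
        elapsed = now - self.updated
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_sec)
        self.updated = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False
```

Passing a larger `cost` for requests that are expected to consume more LLM tokens is one way to tie rate limiting to cost control.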
Technical Stack (Open to Proposal)
Preferred languages:
TypeScript
Python (FastAPI or similar)
Cloud:
AWS / GCP / equivalent
Vector DB:
Open to recommendation (must justify choice)
Deliverables
Production-ready Guide API
Fully implemented RAG system
Moderation and safety framework
Structured API documentation
Deployment documentation
Basic monitoring setup
Strict Requirements
No client-side LLM calls
No API keys exposed in mobile app
No hardcoded prompts without version control
Must support structured output format
Must implement logging and rollback capability
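The prompt version control and rollback requirements could be met in several ways. The sketch below shows one in-memory approach with hypothetical class and method names; in practice the templates would more likely live in a repository or database, and every activation would be logged.

```python
class PromptRegistry:
    """Keep immutable prompt versions and an activation history for rollback."""

    def __init__(self) -> None:
        self._versions: dict[str, str] = {}
        self._history: list[str] = []  # activation order; the last entry is active

    def register(self, version: str, template: str) -> None:
        """Store a new version; existing versions can never be overwritten."""
        if version in self._versions:
            raise ValueError(f"version already registered: {version}")
        self._versions[version] = template

    def activate(self, version: str) -> None:
        """Make a registered version the active one."""
        if version not in self._versions:
            raise KeyError(version)
        self._history.append(version)

    def rollback(self) -> str:
        """Return to the previously active version and report its name."""
        if len(self._history) < 2:
            raise RuntimeError("no earlier version to roll back to")
        self._history.pop()
        return self._history[-1]

    def active(self) -> tuple[str, str]:
        """Return (version, template) for the active prompt."""
        if not self._history:
            raise RuntimeError("no active prompt version")
        version = self._history[-1]
        return version, self._versions[version]
```

Logging the active version with each request makes it possible to trace any response back to the exact prompt that produced it.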
To Apply
Please provide:
A detailed description of a production LLM system you built (architecture overview required)
Your RAG implementation approach (how you structure retrieval and chunking)
Vector database you prefer and why
Estimated timeline for MVP delivery
Estimated budget range
Applications without detailed technical explanation will not be considered.
Contract duration: 1 to 3 months, at 30 hours per week.
Mandatory skills: Python, Machine Learning, Amazon Web Services, API, Database Architecture, NGINX

Languages

English

Notice for Users

This job comes from a TieTalent partner platform. Click "Apply Now" to submit your application directly on their site.
