Human Data Operations Lead
Besimple AI
- San Mateo, California, United States
About
Full-time, Remote (US-first)
Team: Founding Ops & Customer Delivery
About Besimple
We provide the data layer for audio models. Our mission is to bring AI into the real world naturally. We believe AI can meaningfully empower humanity through the most natural interface: voice. We're a small, nimble team of passionate builders who believe humans must remain in the loop.
The Role (Founding)
This is our founding operations role. You won't just "run a process": you'll design the processes, the playbooks, and the bar for what world-class, AI-first data operations looks like. You'll take ambiguous customer needs and turn them into crisp rubrics and workflows, recruit and train a global bench of annotators, and stand up the quality systems, dashboards, and SLAs that become Besimple's operating backbone. As we grow, you'll scale the org you built: hiring, coaching, and evolving best practices.
You'll use AI coding tools (Copilot/Cursor/Codex) and lightweight Python/SQL to automate processes, analyze variance/drift, and accelerate delivery. You'll partner with customers to define and refine annotation requirements, and with Product/Eng to shape UX, guardrails, and the platform roadmap.
What You’ll Do
Own customer programs end‑to‑end: translate goals into schemas, rubrics, gold sets, and success metrics; pilot → scale with clear reporting and write‑ups.
Define & refine requirements with customers: run scoping sessions; lock criteria, edge-case taxonomies, and inter-annotator agreement (IAA) targets; iterate as models and prompts change.
Recruit, onboard, and train annotators: source SMEs, design paid trials, build training artifacts, calibrate on gold data, and manage QA/arb loops.
Ship with AI‑accelerated ops: write quick scripts and notebooks for data transforms, audits, log parsing, schema reconciliation, and quality analytics.
Build the operating system: SLAs, sampling plans, consensus/appeals, audit trails, and continuous calibration; make quality measurable and repeatable.
Close the loop: drive prompt/model/policy experiments; surface insights to Product/Eng; propose UI tweaks and guardrails that raise signal‑to‑noise.
What Will Make You Successful
Company‑builder mindset: you’ve built 0→1 programs or teams, created playbooks, and raised the bar for quality and speed.
Customer‑facing clarity: you convert open‑ended asks into precise pass/fail criteria and aren’t afraid to propose a better spec.
People leadership: you attract, calibrate, and motivate high‑judgment annotators while holding a crisp, documented bar.
Hands‑on with data & AI tools: comfortable with AI coding assistants plus basic Python/SQL to answer questions fast and automate the dull bits.
Execution bias: you prefer small pilots and rapid iteration over lengthy specs, and you over‑communicate risks and status.
Qualifications
2–4+ years in data/product/research operations for ML/AI, relevance, or safety—or equivalent “high‑judgment at scale” experience.
Track record recruiting, onboarding, and training annotators/raters with gold‑set calibration and QA loops.
Demonstrated program ownership: requirements, change management, stakeholder updates, and post‑mortems.
Excellent writing: rubrics, edge‑case guides, SOPs, and crisp weekly reports.
Nice to Have
Trust & Safety, RLHF/RLAIF, search/relevance, or regulated domains (medical, legal, finance).
Experience designing evaluator UIs, prompt templates, or judgment tasks for LLMs/multimodal models.
Familiarity with IAA stats, sampling methods, or experiment design.
Compensation & Ownership
Founding-level role with meaningful equity and the scope to define what it means to build an AI-first data annotation company, from playbooks and metrics to culture and hiring.
Languages
- English
Note for applicants
This job posting comes from a TieTalent partner platform. Click "Apply Now" to submit your application directly on their website.