Machine Learning Engineer, Assessments
Speak LLC
- San Francisco, California, United States
About
This role owns the implementation, deployment, and ongoing quality of our assessment algorithms and ML systems. While there is an immediate need to improve and expand production assessments, this work also builds a platform capability that can be reused across the app.
What you’ll be doing
Ship and own assessment ML systems end-to-end
Build, deploy, and maintain scoring models/pipelines (feature extraction → model training → inference → feedback generation)
Own monitoring, regression tests, and ongoing iteration to maintain accuracy targets
Define and operationalize evaluation
Implement validation/evaluation frameworks for assessments, including metrics, test sets, and offline/online analysis
Translate assessment requirements into measurable acceptance criteria and guardrails
Partner deeply with the Assessment Design Lead
Co-develop the strategy, together with the Content team, to grow assessments into a core platform at Speak
Work in a tight weekly loop to deliver incremental improvement
Drive near-term delivery across products
Stand up or improve summative assessments (spoken language ability) and bring them reliably to production
Prototype and validate formative assessment approaches to measure improvement over weeks/months
Support data and labeling strategy
Help define data needs for training/evaluation (including psychometric measurement needs)
Build or improve pipelines that support label collection and analysis (especially for efficacy studies)
What we’re looking for
Domain expertise in spoken language proficiency assessment (linguistics, applied linguistics, pedagogy, or equivalent experience)
Strong experience designing and running evaluation + validation for assessment/scoring systems, and tailoring approaches to a specific product use case
4+ years building automatic proficiency assessment systems (or equivalent depth in closely related scoring/evaluation domains)
PhD is helpful but not required
Proven ability to ship ML models to production (not only research), including reliability, monitoring, and iteration
Strong generalist ML/analysis skills (statistics, Python, PyTorch/model training)
Ability to operate cross-functionally and communicate clearly with non-technical partners (Content/LD, PM, leadership)
Nice to have
Experience with speech/audio ML
Experience with psychometrics concepts (reliability/validity, calibration)
How we work (collaboration expectations)
This role is designed to be highly collaborative with the Assessment Design Lead. Success depends on a tight loop where constructs/rubrics and model outputs co-evolve, not a sequential handoff.
Speak does not discriminate based upon race, religion, color, national origin, gender (including pregnancy, childbirth, or related medical conditions), sexual orientation, gender identity, gender expression, age, status as a protected veteran, status as an individual with a disability, or other applicable legally protected characteristics.
Language skills
- English
Note for users
This job posting comes from a TieTalent partner platform. Click "Apply Now" to submit your application directly on their website.