This job offer is no longer available
About
Operationalize Research:
Collaborate with researchers to move models from experimental checkpoints to production-ready systems. Establish patterns for large-scale training, rapid experimentation, and deployment of new architectures.
Optimize Model Performance:
Profile and improve model inference for latency and throughput using quantization, pruning, distillation, and architectural refinements to ensure viable unit economics.
Model Acceleration:
Apply optimization techniques (TensorRT, ONNX, vLLM) to accelerate multimodal models, including video diffusion, LLMs, and speech models.
Design Data Pipelines:
Design and implement efficient pipelines for video data ingestion, preprocessing, and training at petabyte scale using tools like Dagster and Ray.
Evaluate and Iterate:
Build evaluation frameworks to measure model quality, establish benchmarks, and guide continuous improvement of model capabilities.
Requirements
Production ML:
Experience deploying ML models to production. You understand common failure modes and how to address them (resource contention, OOMs, batch optimization).
Deep Learning Experience:
Strong knowledge of PyTorch and modern ML architectures. Experience training and optimizing large models (transformers, diffusion models, or similar).
Systems Proficiency:
Comfortable working with GPUs, debugging CUDA issues, and profiling model workloads to identify compute or memory bottlenecks.
Data Engineering:
Experience building scalable data pipelines for high-bandwidth media processing and training workflows.
Preferred Experience
- Experience with video or audio models in research or production settings
- Familiarity with low-level optimization (CUDA kernels, Triton, custom operators)
- Knowledge of real-time ML systems and latency-critical inference
- Prior work with model compression techniques (quantization, distillation, pruning)
Nuance Labs Key Facts
- $10M seed round backed by Accel, South Park Commons, Lightspeed, and top angels including Synthesia's former CPO.
- A world-class team of PhDs from MIT, UW, and Oxford with decades of industry experience at Apple and Meta, advancing real-time avatars from cutting-edge research to products used by millions.
- In-person collaboration, 5 days a week at Seattle HQ.
Languages
- English