Product Engineer - Training Platform
- San Francisco, California, United States
About
Baseten powers mission-critical inference for the world's most dynamic AI companies, like Cursor, Notion, OpenEvidence, Abridge, Clay, Gamma and Writer. By uniting applied AI research, flexible infrastructure, and seamless developer tooling, we enable companies operating at the frontier of AI to bring cutting-edge models into production. We're growing quickly and recently raised our $300M Series E, backed by investors including BOND, IVP, Spark Capital, Greylock, and Conviction. Join us and help build the platform engineers turn to to ship AI products.
THE ROLE
We’re looking for a customer-obsessed software engineer to come ship with us. You’ll own features like multi-node training and products like serverless reinforcement learning (RL) from conception to MVP (and from MVP to GA!). You’ll work through the stack, architecting solutions from the API and UI down to our infrastructure layer. You’ll fine-tune models yourself to develop an understanding of user workflows. You’ll work closely with research engineers leveraging state-of-the-art training techniques to build experiences that accelerate model development and solve real pain points. If you’re excited to dive deep into training, let’s talk!
THE PRODUCT
Take a look at what we’ve built so far:
Overview of the product so far
Training docs overview
Story of the Training product
Research we've done
EXAMPLE INITIATIVES
Checkpointing Pipeline: Our checkpointing pipeline starts with automated checkpointing, a feature that ensures that versions of models created during training are automatically backed up to the cloud. Users can then deploy checkpoints seamlessly into inference servers, with point-and-click integrations into inference frameworks like vLLM and Baseten’s Inference Stack. This enables customers to quickly evaluate the performance of their checkpoints with real traffic.
Multinode training: Multinode training lets customers run training jobs across multiple compute nodes, so they can train large models like GLM 4.7 and DeepSeek. We’ve built deeply at the Kubernetes layer to ensure that scheduling, startup, inter-node communication, and shutdown happen seamlessly under the hood and as the user expects.
Training DX: Customers come to train on Baseten because it helps them get to value fast. To do this, we ensure that the features we ship aren’t just fast, but are also easy to iterate with. We enhanced Baseten’s metrics from pod-level GPU summaries to per-GPU and per-node metrics. We’ve built a CLI experience that caters to terminal users, and UI experiences that enable users to seamlessly manage their training jobs.
Responsibilities
Iterate like crazy
Design ergonomic APIs and abstractions to model complex resources and lifecycles
Work throughout the stack (API layer, backend and database implementation, infra layer; frontend is a plus) to implement features.
Fine-tune and deploy models to develop intuition around training workflows.
Partner closely with model developers and world-class research engineers to understand the requirements and pain points of post-training workflows.
Drive long-term improvements to system reliability and development velocity
Fix bugs & resolve customer issues with urgency
Requirements
5+ years of experience building software applications
Deep knowledge of the web stack, databases, and distributed systems
Experience developing developer tooling or infrastructure products for external or internal users.
Good taste in product, particularly developer-oriented tools
Interest in ML/AI infrastructure and willingness to learn
Driven by high agency and ownership
Strong communication skills with the ability to bridge technical depth and business needs
NICE TO HAVE
Experience launching features and products through different release cycles (MVP, Beta, GA, etc.)
Experience with model development methods and paradigms, like Supervised Fine-Tuning, Reinforcement Learning, Synthetic Data Generation, LoRA, Full Finetunes, etc.
Familiarity or experience with the open source training stack and frameworks (NCCL, PyTorch, Megatron, NemoRL, VeRL, Axolotl, HF Trainer) and distributed training techniques (FSDP, DeepSpeed).
Experience developing AI products, tooling, or agents
Frontend fluency
Benefits
Competitive compensation, including meaningful equity.
100% coverage of medical, dental, and vision insurance for employees and dependents
Generous PTO policy including company-wide Winter Break (our offices are closed from Christmas Eve to New Year's Day!)
Paid parental leave
Company-facilitated 401(k)
Exposure to a variety of ML startups, offering unparalleled learning and networking opportunities.
Apply now to embark on a rewarding journey in shaping the future of AI! If you are a motivated individual with a passion for machine learning and a desire to be part of a collaborative and forward-thinking team, we would love to hear from you.
At Baseten, we are committed to fostering a diverse and inclusive workplace. We provide equal employment opportunities to all employees and applicants without regard to race, color, religion, gender, sexual orientation, gender identity or expression, national origin, age, genetic information, disability, or veteran status.
Compensation Range: $200K - $275K
Language skills
- English
This posting comes from a TieTalent partner platform. Click "Apply now" to submit your application directly on their site.