About
Analyze and optimize model training performance, improving distributed training workflows and maximizing resource utilization across various hardware environments.
Enhance system observability, debuggability, and operational excellence to improve the user experience.
Collaborate with cross-functional teams to integrate cutting-edge features and technologies into the platform.
Essential Skills & Qualifications:
Bachelor's degree or higher in Computer Science or a related field, or equivalent experience.
3+ years of hands-on software engineering experience.
2+ years of specialized experience in AI/ML infrastructure, including facilitating distributed training for large ML models.
Proficient in Python, with strong expertise in frameworks such as PyTorch (preferred), TensorFlow, and others.
Familiar with distributed computing, GPU computing, and cloud platforms (AWS, GCP, Azure).
Willingness to travel to Sunnyvale, CA as needed.
Comfortable navigating ambiguous and dynamic work environments.
Preferred Qualifications:
5+ years of professional software engineering experience.
Self-motivated, with a strong focus on driving results and creating meaningful impact.
Deep knowledge and experience with PyTorch 2.x+ and distributed training frameworks.
Experience in designing training frameworks that support FSDP, Pipeline Parallelism, and other advanced techniques for training large foundational models.
Skilled in profiling, analyzing, debugging, and optimizing training and data loading performance.
Excellent communication skills to facilitate discussions, achieve consensus, manage risks, and provide constructive feedback.
Compensation: The compensation range for this position is between $170,000 and $240,000, based on relevant experience and expertise. The role also comes with incentive pay based on individual and company performance.
Relocation: This position may be eligible for relocation benefits.
Benefits: We offer an extensive range of health and wellness programs, including medical, dental, vision plans, Health Savings Accounts, retirement savings plans, and more.
This role is primarily remote, but individuals residing within a 50-mile radius of Mountain View, Sunnyvale, Detroit, Warren, or Milford are expected to report to one of these locations at least three times a week, as determined by management.
About GM: At General Motors, our mission is to achieve Zero Crashes, Zero Emissions, and Zero Congestion, and we are committed to leading the transformation necessary to create a safer, more equitable world.
Language skills
- English
Note for users
This job listing comes from a TieTalent partner platform. Click "Apply Now" to submit your application directly on their website.